Opened 9 months ago

Closed 3 months ago

Last modified 2 months ago

#27587 closed defect (fixed)

Follow up to #25680: Fix tempfile handling in doctests

Reported by: embray Owned by:
Priority: major Milestone: sage-8.9
Component: python3 Keywords: py3, cygwin
Cc: jdemeyer, chapoton Merged in:
Authors: Erik Bray Reviewers: Frédéric Chapoton, Samuel Lelièvre
Report Upstream: N/A Work issues:
Branch: 32e8ff2 (Commits) Commit:
Dependencies: Stopgaps:

Change History (17)

comment:1 Changed 9 months ago by embray

  • Branch set to u/embray/ticket-27587
  • Cc jdemeyer chapoton added
  • Commit set to b124a898c7315412a071a8ccc7028604192f3af8
  • Status changed from new to needs_review

New commits:

9d2c0c4Upgrade to Python 3.7.3
438c3f4Disable collections.abc warnings
e62ddf8Fix "abort pool operation" docbuild doctest
0ad1dc1Mark a test for deprecated functionality as Python 2 only
5a09cf1Mark a few tests as known bug (#27537)
b124a89Trac #25680: Fix doctest worker output tempfile handling on Cygwin

comment:2 Changed 5 months ago by embray

  • Milestone changed from sage-8.8 to sage-8.9

Moving tickets from the Sage 8.8 milestone that have been actively worked on in the last six months to the next release milestone (optimistically).

comment:3 Changed 4 months ago by git

  • Commit changed from b124a898c7315412a071a8ccc7028604192f3af8 to 32e8ff2ba881004a67320c862ec004ba46abdc1e

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

32e8ff2Trac #25680: Fix doctest worker output tempfile handling on Cygwin

comment:4 Changed 4 months ago by chapoton

  • Reviewers set to Frédéric Chapoton

Looks good. You can set to positive if the bot comes back green.

comment:5 Changed 4 months ago by slelievre

  • Keywords py3 cygwin added
  • Reviewers changed from Frédéric Chapoton to Frédéric Chapoton, Samuel Lelièvre
  • Status changed from needs_review to positive_review

Before this, test would not run on py3-based Sage on Cygwin in Windows 7. Now they do.

Positive review.

comment:6 Changed 3 months ago by vbraun

  • Branch changed from u/embray/ticket-27587 to 32e8ff2ba881004a67320c862ec004ba46abdc1e
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:7 Changed 3 months ago by jhpalmieri

  • Commit 32e8ff2ba881004a67320c862ec004ba46abdc1e deleted

This ticket seems to be causing a sporadic failure in numerical/linear_tensor_element.pyx. With 8.9.beta8, that file passed tests 10 times in a row. With 8.9.beta8 + this ticket, that file failed tests 7 out of 10 times. This is with a Python 3 build on OS X. Any idea what's going on?

$ ./sage -t src/sage/numerical/linear_tensor_element.pyx 
too many failed tests, not using stored timings
Running doctests with ID 2019-09-03-15-46-57-f7f0f8e7.
Git branch: t/27587/32e8ff2ba881004a67320c862ec004ba46abdc1e
Using --optional=build,dochtml,meataxe,python2,sage
Doctesting 1 file.
glp_free: memory allocation error
Error detected in file env/alloc.c at line 72
------------------------------------------------------------------------
0   signals.cpython-37m-darwin.so       0x0000000103b600ba print_backtrace + 58
1   signals.cpython-37m-darwin.so       0x0000000103b643f7 sigdie + 39
2   signals.cpython-37m-darwin.so       0x0000000103b64370 sigdie_for_sig + 256
3   libsystem_platform.dylib            0x00007fff667b1b5d _sigtramp + 29
4   ???                                 0x0000000101f62000 0x0 + 4327874560
5   libsystem_c.dylib                   0x00007fff6666b6a6 abort + 127
6   libglpk.40.dylib                    0x0000000147d2fe44 errfunc + 212
7   libglpk.40.dylib                    0x0000000147d2f638 dma + 184
8   libglpk.40.dylib                    0x0000000147d36a3f _glp_dmp_delete_pool + 47
9   libglpk.40.dylib                    0x0000000147d09f51 delete_prob + 17
10  libglpk.40.dylib                    0x0000000147d0a002 glp_delete_prob + 66
11  glpk_backend.cpython-37m-darwin.so  0x0000000147c7840c __pyx_tp_dealloc_4sage_9numerical_8backends_12glpk_backend_GLPKBackend + 76
12  mip.cpython-37m-darwin.so           0x00000001443b56fe __pyx_tp_clear_4sage_9numerical_3mip_MixedIntegerLinearProgram + 78
13  libpython3.7m.dylib                 0x00000001020bb727 collect + 2327
14  libpython3.7m.dylib                 0x00000001020bbe72 _PyObject_GC_Alloc + 386
15  libpython3.7m.dylib                 0x00000001020bbee4 _PyObject_GC_New + 20
16  libpython3.7m.dylib                 0x0000000101fbac08 dictiter_new + 24
17  libpython3.7m.dylib                 0x0000000101f73248 PyObject_GetIter + 24
18  libpython3.7m.dylib                 0x00000001020e36eb defdict_reduce + 107
19  libpython3.7m.dylib                 0x0000000101f896a4 _PyMethodDef_RawFastCallDict + 580
20  libpython3.7m.dylib                 0x0000000101f886b4 _PyObject_FastCallDict + 276
21  libpython3.7m.dylib                 0x0000000101fe7e5e object___reduce_ex__ + 174
22  libpython3.7m.dylib                 0x0000000101f896a4 _PyMethodDef_RawFastCallDict + 580
23  libpython3.7m.dylib                 0x0000000101f886b4 _PyObject_FastCallDict + 276
24  libpython3.7m.dylib                 0x0000000101f8acfb object_vacall + 619
25  libpython3.7m.dylib                 0x0000000101f8af00 PyObject_CallFunctionObjArgs + 144
26  _pickle.cpython-37m-darwin.so       0x0000000102bd6853 save + 13475
27  _pickle.cpython-37m-darwin.so       0x0000000102bd9d6a batch_dict + 490
28  _pickle.cpython-37m-darwin.so       0x0000000102bd8caf save_reduce + 1215
29  _pickle.cpython-37m-darwin.so       0x0000000102bd67fa save + 13386
30  _pickle.cpython-37m-darwin.so       0x0000000102bd5156 save + 7590
31  _pickle.cpython-37m-darwin.so       0x0000000102bd3047 dump + 279
32  _pickle.cpython-37m-darwin.so       0x0000000102be23a0 _pickle_Pickler_dump + 96
33  libpython3.7m.dylib                 0x0000000101f89b73 _PyMethodDef_RawFastCallKeywords + 771
34  libpython3.7m.dylib                 0x0000000101f8fc01 _PyMethodDescr_FastCallKeywords + 81
35  libpython3.7m.dylib                 0x000000010205f445 call_function + 869
36  libpython3.7m.dylib                 0x000000010205bf23 _PyEval_EvalFrameDefault + 29123
37  libpython3.7m.dylib                 0x0000000102060165 _PyEval_EvalCodeWithName + 2981
38  libpython3.7m.dylib                 0x0000000101f88e76 _PyFunction_FastCallKeywords + 230
39  libpython3.7m.dylib                 0x000000010205f409 call_function + 809
40  libpython3.7m.dylib                 0x000000010205bf43 _PyEval_EvalFrameDefault + 29155
41  libpython3.7m.dylib                 0x0000000101f893ad function_code_fastcall + 253
42  libpython3.7m.dylib                 0x000000010205c25f _PyEval_EvalFrameDefault + 29951
43  libpython3.7m.dylib                 0x0000000101f893ad function_code_fastcall + 253
44  libpython3.7m.dylib                 0x000000010205f409 call_function + 809
45  libpython3.7m.dylib                 0x000000010205bf23 _PyEval_EvalFrameDefault + 29123
46  libpython3.7m.dylib                 0x0000000101f893ad function_code_fastcall + 253
47  libpython3.7m.dylib                 0x000000010205f409 call_function + 809
48  libpython3.7m.dylib                 0x000000010205bf23 _PyEval_EvalFrameDefault + 29123
49  libpython3.7m.dylib                 0x0000000101f893ad function_code_fastcall + 253
50  libpython3.7m.dylib                 0x0000000101f89e66 _PyObject_Call_Prepend + 150
51  libpython3.7m.dylib                 0x0000000101f88fd8 PyObject_Call + 136
52  libpython3.7m.dylib                 0x00000001020f0b07 t_bootstrap + 71
53  libpython3.7m.dylib                 0x00000001020a38e9 pythread_wrapper + 25
54  libsystem_pthread.dylib             0x00007fff667ba2eb _pthread_body + 126
55  libsystem_pthread.dylib             0x00007fff667bd249 _pthread_start + 66
56  libsystem_pthread.dylib             0x00007fff667b940d thread_start + 13
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------
sage -t src/sage/numerical/linear_tensor_element.pyx
    Killed due to abort
**********************************************************************
Tests run before process (pid=13066) failed:
sage: mip.<x> = MixedIntegerLinearProgram('ppl')   # base ring is QQ ## line 6 ##
sage: lt = x[0] * vector([3,4]) + 1;   lt ## line 7 ##
(1, 1) + (3, 4)*x_0
sage: type(lt) ## line 9 ##
<class 'sage.numerical.linear_tensor_element.LinearTensor'>
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 11 ##
0
sage: parent = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 47 ##
sage: parent({0: [1,2], 3: [-7,-8]}) ## line 48 ##
(1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 50 ##
0
sage: LT = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 70 ##
sage: LT({0: [1,2], 3: [-7,-8]}) ## line 71 ##
(1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: TestSuite(LT).run(skip=['_test_an_element', '_test_elements_eq_reflexive',
    '_test_elements_eq_symmetric', '_test_elements_eq_transitive',
    '_test_elements_neq', '_test_additive_associativity',
    '_test_elements', '_test_pickling', '_test_zero']) ## line 74 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 78 ##
0
sage: p = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 95 ##
sage: lt = p({0:[1,2], 3:[4,5]});  lt ## line 96 ##
(1.0, 2.0)*x_0 + (4.0, 5.0)*x_3
sage: lt[0] ## line 98 ##
x_0 + 4*x_3
sage: lt[1] ## line 100 ##
2*x_0 + 5*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 102 ##
0
sage: p = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 120 ##
sage: lt = p({0:[1,2], 3:[4,5]}) ## line 121 ##
sage: lt.dict() ## line 122 ##
{0: (1.0, 2.0), 3: (4.0, 5.0)}
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 124 ##
0
sage: mip.<b> = MixedIntegerLinearProgram() ## line 144 ##
sage: lt = vector([1,2]) * b[3] + vector([4,5]) * b[0] - 5;  lt ## line 145 ##
(-5.0, -5.0) + (1.0, 2.0)*x_0 + (4.0, 5.0)*x_1
sage: lt.coefficient(b[3]) ## line 147 ##
(1.0, 2.0)
sage: lt.coefficient(0)      # x_0 is b[3] ## line 149 ##
(1.0, 2.0)
sage: lt.coefficient(4) ## line 151 ##
(0.0, 0.0)
sage: lt.coefficient(-1) ## line 153 ##
(-5.0, -5.0)
sage: lt.coefficient(b[3] + b[4]) ## line 158 ##
sage: lt.coefficient(2*b[3]) ## line 162 ##
sage: mip.<q> = MixedIntegerLinearProgram(solver='ppl') ## line 166 ##
sage: lt.coefficient(q[0]) ## line 167 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 171 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 197 ##
sage: R.<s,t> = RDF[] ## line 198 ##
sage: LT = LinearFunctionsParent(RDF).tensor(R) ## line 199 ##
sage: LT.an_element()  # indirect doctest ## line 200 ##
(s) + (5.0*s)*x_2 + (7.0*s)*x_5
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 203 ##
sage: LT.an_element()  # indirect doctest ## line 204 ##
(1.0, 0.0) + (5.0, 0.0)*x_2 + (7.0, 0.0)*x_5
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 206 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 235 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^(2,2)) ## line 236 ##
sage: LT.an_element()  # indirect doctest ## line 237 ##
[1 + 5*x_2 + 7*x_5 1 + 5*x_2 + 7*x_5]
[1 + 5*x_2 + 7*x_5 1 + 5*x_2 + 7*x_5]
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 240 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 278 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 279 ##
sage: LT({0: [1,2], 3: [-7,-8]}) + LT({2: [5,6], 3: [2,-2]}) + 16 ## line 280 ##
(16.0, 16.0) + (1.0, 2.0)*x_0 + (5.0, 6.0)*x_2 + (-5.0, -10.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 282 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 298 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 299 ##
sage: -LT({0: [1,2], 3: [-7,-8]}) ## line 300 ##
(-1.0, -2.0)*x_0 + (7.0, 8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 302 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 322 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 323 ##
sage: LT({0: [1,2], 3: [-7,-8]}) - LT({1: [1,2]}) ## line 324 ##
(1.0, 2.0)*x_0 + (-1.0, -2.0)*x_1 + (-7.0, -8.0)*x_3
sage: LT({0: [1,2], 3: [-7,-8]}) - 16 ## line 326 ##
(-16.0, -16.0) + (1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 328 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 348 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 349 ##
sage: 10 * LT({0: [1,2], 3: [-7,-8]}) ## line 350 ##
(10.0, 20.0)*x_0 + (-70.0, -80.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 352 ##
0
sage: mip.<x> = MixedIntegerLinearProgram() ## line 364 ##
sage: lt0 = x[0] * vector([1,2]) ## line 365 ##
sage: lt1 = x[1] * vector([2,3]) ## line 366 ##
sage: lt0.__le__(lt1)    # indirect doctest ## line 367 ##
(1.0, 2.0)*x_0 <= (2.0, 3.0)*x_1
sage: mip.<x> = MixedIntegerLinearProgram() ## line 372 ##
sage: from sage.numerical.linear_functions import LinearFunction ## line 373 ##
sage: x[0] * vector([1,2]) <= x[1] * vector([2,3]) ## line 374 ##
(1.0, 2.0)*x_0 <= (2.0, 3.0)*x_1
sage: x[0] * vector([1,2]) >= x[1] * vector([2,3]) ## line 377 ##
(2.0, 3.0)*x_1 <= (1.0, 2.0)*x_0
sage: x[0] * vector([1,2]) == x[1] * vector([2,3]) ## line 380 ##
(1.0, 2.0)*x_0 == (2.0, 3.0)*x_1
sage: x[0] * vector([1,2]) < x[1] * vector([2,3]) ## line 383 ##
sage: x[0] * vector([1,2]) > x[1] * vector([2,3]) ## line 388 ##
sage: lt = x[0] * vector([1,2]) ## line 395 ##
sage: cm = sage.structure.element.get_coercion_model() ## line 396 ##
sage: cm.explain(10, lt, operator.le) ## line 397 ##
Coercion on left operand via
    Coercion map:
      From: Integer Ring
      To:   Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
Arithmetic performed after coercions.
Result lives in Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
sage: operator.le(10, lt) ## line 406 ##
(10.0, 10.0) <= (1.0, 2.0)*x_0
sage: lt <= 1 ## line 408 ##
(1.0, 2.0)*x_0 <= (1.0, 1.0)
sage: lt >= 1 ## line 410 ##
(1.0, 1.0) <= (1.0, 2.0)*x_0
sage: 1 <= lt ## line 412 ##
(1.0, 1.0) <= (1.0, 2.0)*x_0
sage: 1 >= lt ## line 414 ##
(1.0, 2.0)*x_0 <= (1.0, 1.0)
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 416 ##
0
sage: p = MixedIntegerLinearProgram() ## line 444 ##
sage: lt0 = p[0] * vector([1,2]) ## line 445 ##
sage: hash(lt0)   # random output ## line 446 ##
-9223372036510900876
sage: d = {} ## line 448 ##
sage: d[lt0] = 3 ## line 449 ##
sage: f = p[0] * vector([1]) ## line 455 ##
sage: g = p[0] * vector([1]) ## line 456 ##
sage: set([f, f]) ## line 457 ##
{((1.0))*x_0}
sage: set([f, g]) ## line 459 ##
{((1.0))*x_0, ((1.0))*x_0}
sage: len(set([f, f+1])) ## line 461 ##
2
sage: d = {} ## line 464 ##
sage: d[f] = 123 ## line 465 ##
sage: d[g] = 456 ## line 466 ##
sage: len(list(d)) ## line 467 ##
2
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 469 ##
0

**********************************************************************
----------------------------------------------------------------------
sage -t src/sage/numerical/linear_tensor_element.pyx  # Killed due to abort
----------------------------------------------------------------------
Last edited 3 months ago by jhpalmieri (previous) (diff)

comment:8 Changed 3 months ago by slelievre

For reference this is also reported on the sage-release mailing list:

comment:9 follow-up: Changed 3 months ago by embray

I would be shocked, frankly, if the specific changes made as part of this ticket had anything to do with that failure. It's far more likely the same problem as discussed in #28106.

The infuriating thing about this problem is that it can occur in very seemingly arbitrary ways, and because it's very random it can befuddle git bisect (which I'm assuming is how you determined that it was "this ticket" causing the problem). Because the first time I encountered this problem I also git bisected to a change (in a branch I was working on at the time) that seemingly arbitrarily caused a similar problem. Eventually the problem just went away on its own, but then resurfaced again later.

Just any test that, for whatever reason happens to push past the default virtual memory limit of 3300 MB for test runners can cause this, and it's happening more commonly now due to the increased use of libgap and the additional constraint that places on the virtual memory allocation available to tests :(

comment:10 follow-up: Changed 3 months ago by embray

As a test, as you're able to reproduce the issue semi-reliably, try running it with ./sage -t --memlimit=0 and see if it can still be reproduced at all.

comment:11 Changed 3 months ago by embray

It appears, however, that this might have broken some tests on Cygwin on Python 2 now! I'm getting something that looks like this:

sage -t --long --warn-long 120.6 src/sage/doctest/test.py
**********************************************************************
File "src/sage/doctest/test.py", line 330, in sage.doctest.test
Failed example:
    print(open(t).read())  # long time
Exception raised:
    Traceback (most recent call last):
      File "/home/embray/src/sagemath/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 681, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/embray/src/sagemath/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 1123, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.doctest.test[34]>", line 1, in <module>
        print(open(t).read())  # long time
    IOError: [Errno 2] No such file or directory: '/home/embray/.sage/temp/NAVI-Brick/2229/tmp_aqE2sf'

Haven't confirmed that it's caused by this change yet, but it at least does look directly relevant in this case.

It's only that doctest that fails though--there is no problem running the test runner itself AFAICT, so I'm not hard-pressed about it. Would be good to fix though if I can do so quickly.

comment:12 in reply to: ↑ 9 Changed 3 months ago by jhpalmieri

Replying to embray:

I would be shocked, frankly, if the specific changes made as part of this ticket had anything to do with that failure. It's far more likely the same problem as discussed in #28106.

The infuriating thing about this problem is that it can occur in very seemingly arbitrary ways, and because it's very random it can befuddle git bisect (which I'm assuming is how you determined that it was "this ticket" causing the problem).

I used git bisect to try to find the problem, and it didn't lead anywhere helpful, so I started merging tickets one by one, working my way from 8.9.beta8 to 8.9.beta9. That's how I found this ticket. Having done that, I restarted with 8.9.beta8: doctests for this file pass 10 times in a row. Then I applied just the diff for this ticket: doctests for this file fail more than half the time: 7 out of 10 times, then 9 out of 10 times. Revert the changes and tests pass again. It's very reproducible. So it really does look like this ticket is the problem.

Having said that, it's only on one machine; I can't reproduce it on a second one running the same version of OS X.

comment:13 in reply to: ↑ 10 Changed 3 months ago by jhpalmieri

Replying to embray:

As a test, as you're able to reproduce the issue semi-reliably, try running it with ./sage -t --memlimit=0 and see if it can still be reproduced at all.

This didn't change the behavior: failures with the changes here applied, successes with vanilla 8.9.beta8.

comment:14 Changed 3 months ago by jhpalmieri

I rebuilt from scratch, and I get the same behavior.

comment:15 Changed 3 months ago by jhpalmieri

If I delete the file ~/.sage/timings2.json, everything works. I'm puzzled — why should problems with that file affect this particular doctest, and what is the connection with the changes on this ticket? — but this makes it very low priority.

comment:16 Changed 3 months ago by embray

If not due to --memlimit it's still got to be something else bizarre and random that is probably not directly caused by this, but still randomly invoked by it. Could be wrong, but without a way to reproduce we may never know.

comment:17 Changed 2 months ago by jhpalmieri

I am now seeing this on a second machine, so I've opened a ticket: #28559.

Note: See TracTickets for help on using tickets.