Opened 3 years ago

Last modified 2 months ago

#28559 new defect

py3 + linear_tensor_element.pyx

Reported by: jhpalmieri
Owned by:
Priority: major
Milestone: sage-9.7
Component: python3
Keywords: random_fail
Cc: gh-collares
Merged in:
Authors:
Reviewers:
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by jhpalmieri)

Intermittent failure with linear_tensor_element.pyx on OS X with a Python 3 build of Sage:

sage -t --warn-long 58.7 src/sage/numerical/linear_tensor_element.pyx  # Killed due to abort

In more detail:

$ ./sage -t src/sage/numerical/linear_tensor_element.pyx 
Running doctests with ID 2019-10-05-11-47-01-e1b44413.
Git branch: develop
Using --optional=build,dochtml,sage
Doctesting 1 file.
glp_free: memory allocation error
Error detected in file env/alloc.c at line 72
------------------------------------------------------------------------
0   signals.cpython-37m-darwin.so       0x000000010ed22f0a print_backtrace + 58
1   signals.cpython-37m-darwin.so       0x000000010ed271b7 sigdie + 39
2   signals.cpython-37m-darwin.so       0x000000010ed27130 sigdie_for_sig + 256
3   libsystem_platform.dylib            0x00007fff621d7b5d _sigtramp + 29
4   libglpk.40.dylib                    0x0000000152fbb020 libglpk.40.dylib + 32
5   libsystem_c.dylib                   0x00007fff620916a6 abort + 127
6   libglpk.40.dylib                    0x0000000153019bd4 errfunc + 212
7   libglpk.40.dylib                    0x00000001530193c8 dma + 184
8   libglpk.40.dylib                    0x000000015302b02f _glp_dmp_delete_pool + 47
9   libglpk.40.dylib                    0x0000000152fd6621 delete_prob + 17
10  libglpk.40.dylib                    0x0000000152fd66d2 glp_delete_prob + 66
11  glpk_backend.cpython-37m-darwin.so  0x0000000152f7740c __pyx_tp_dealloc_4sage_9numerical_8backends_12glpk_backend_GLPKBackend + 76
12  mip.cpython-37m-darwin.so           0x000000014f5bc91e __pyx_tp_clear_4sage_9numerical_3mip_MixedIntegerLinearProgram + 78
13  libpython3.7m.dylib                 0x000000010d37316c collect + 2204
14  libpython3.7m.dylib                 0x000000010d3738a2 _PyObject_GC_Alloc + 386
15  libpython3.7m.dylib                 0x000000010d373914 _PyObject_GC_New + 20
16  libpython3.7m.dylib                 0x000000010d273c78 dictiter_new + 24
17  libpython3.7m.dylib                 0x000000010d22c918 PyObject_GetIter + 24
18  libpython3.7m.dylib                 0x000000010d39aaeb defdict_reduce + 107
19  libpython3.7m.dylib                 0x000000010d242cbc _PyMethodDef_RawFastCallDict + 588
20  libpython3.7m.dylib                 0x000000010d241cde _PyObject_FastCallDict + 270
21  libpython3.7m.dylib                 0x000000010d2a0cfe object___reduce_ex__ + 174
22  libpython3.7m.dylib                 0x000000010d242cbc _PyMethodDef_RawFastCallDict + 588
23  libpython3.7m.dylib                 0x000000010d241cde _PyObject_FastCallDict + 270
24  libpython3.7m.dylib                 0x000000010d2442db object_vacall + 619
25  libpython3.7m.dylib                 0x000000010d2444e0 PyObject_CallFunctionObjArgs + 144
26  _pickle.cpython-37m-darwin.so       0x000000010ddc9b82 save + 12626
27  _pickle.cpython-37m-darwin.so       0x000000010ddccf0a batch_dict + 490
28  _pickle.cpython-37m-darwin.so       0x000000010ddcbe56 save_reduce + 1142
29  _pickle.cpython-37m-darwin.so       0x000000010ddc9b29 save + 12537
30  _pickle.cpython-37m-darwin.so       0x000000010ddc8674 save + 7236
31  _pickle.cpython-37m-darwin.so       0x000000010ddc6701 dump + 257
32  _pickle.cpython-37m-darwin.so       0x000000010ddd5390 _pickle_Pickler_dump + 96
33  libpython3.7m.dylib                 0x000000010d243197 _PyMethodDef_RawFastCallKeywords + 775
34  libpython3.7m.dylib                 0x000000010d249181 _PyMethodDescr_FastCallKeywords + 81
35  libpython3.7m.dylib                 0x000000010d317538 call_function + 888
36  libpython3.7m.dylib                 0x000000010d313efe _PyEval_EvalFrameDefault + 27230
37  libpython3.7m.dylib                 0x000000010d31824d _PyEval_EvalCodeWithName + 3005
38  libpython3.7m.dylib                 0x000000010d242499 _PyFunction_FastCallKeywords + 217
39  libpython3.7m.dylib                 0x000000010d3174cc call_function + 780
40  libpython3.7m.dylib                 0x000000010d313f1e _PyEval_EvalFrameDefault + 27262
41  libpython3.7m.dylib                 0x000000010d2429bd function_code_fastcall + 237
42  libpython3.7m.dylib                 0x000000010d314140 _PyEval_EvalFrameDefault + 27808
43  libpython3.7m.dylib                 0x000000010d2429bd function_code_fastcall + 237
44  libpython3.7m.dylib                 0x000000010d3174cc call_function + 780
45  libpython3.7m.dylib                 0x000000010d313efe _PyEval_EvalFrameDefault + 27230
46  libpython3.7m.dylib                 0x000000010d2429bd function_code_fastcall + 237
47  libpython3.7m.dylib                 0x000000010d3174cc call_function + 780
48  libpython3.7m.dylib                 0x000000010d313efe _PyEval_EvalFrameDefault + 27230
49  libpython3.7m.dylib                 0x000000010d2429bd function_code_fastcall + 237
50  libpython3.7m.dylib                 0x000000010d243463 _PyObject_Call_Prepend + 131
51  libpython3.7m.dylib                 0x000000010d2425f8 PyObject_Call + 136
52  libpython3.7m.dylib                 0x000000010d3a7e87 t_bootstrap + 71
53  libpython3.7m.dylib                 0x000000010d35b699 pythread_wrapper + 25
54  libsystem_pthread.dylib             0x00007fff621e02eb _pthread_body + 126
55  libsystem_pthread.dylib             0x00007fff621e3249 _pthread_start + 66
56  libsystem_pthread.dylib             0x00007fff621df40d thread_start + 13
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------
sage -t --warn-long 58.7 src/sage/numerical/linear_tensor_element.pyx
    Killed due to abort
**********************************************************************
Tests run before process (pid=66448) failed:
sage: mip.<x> = MixedIntegerLinearProgram('ppl')   # base ring is QQ ## line 6 ##
sage: lt = x[0] * vector([3,4]) + 1;   lt ## line 7 ##
(1, 1) + (3, 4)*x_0
sage: type(lt) ## line 9 ##
<class 'sage.numerical.linear_tensor_element.LinearTensor'>
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 11 ##
0
sage: parent = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 47 ##
sage: parent({0: [1,2], 3: [-7,-8]}) ## line 48 ##
(1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 50 ##
0
sage: LT = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 70 ##
sage: LT({0: [1,2], 3: [-7,-8]}) ## line 71 ##
(1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: TestSuite(LT).run(skip=['_test_an_element', '_test_elements_eq_reflexive',
    '_test_elements_eq_symmetric', '_test_elements_eq_transitive',
    '_test_elements_neq', '_test_additive_associativity',
    '_test_elements', '_test_pickling', '_test_zero']) ## line 74 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 78 ##
0
sage: p = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 95 ##
sage: lt = p({0:[1,2], 3:[4,5]});  lt ## line 96 ##
(1.0, 2.0)*x_0 + (4.0, 5.0)*x_3
sage: lt[0] ## line 98 ##
x_0 + 4*x_3
sage: lt[1] ## line 100 ##
2*x_0 + 5*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 102 ##
0
sage: p = MixedIntegerLinearProgram().linear_functions_parent().tensor(RDF^2) ## line 120 ##
sage: lt = p({0:[1,2], 3:[4,5]}) ## line 121 ##
sage: lt.dict() ## line 122 ##
{0: (1.0, 2.0), 3: (4.0, 5.0)}
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 124 ##
0
sage: mip.<b> = MixedIntegerLinearProgram() ## line 144 ##
sage: lt = vector([1,2]) * b[3] + vector([4,5]) * b[0] - 5;  lt ## line 145 ##
(-5.0, -5.0) + (1.0, 2.0)*x_0 + (4.0, 5.0)*x_1
sage: lt.coefficient(b[3]) ## line 147 ##
(1.0, 2.0)
sage: lt.coefficient(0)      # x_0 is b[3] ## line 149 ##
(1.0, 2.0)
sage: lt.coefficient(4) ## line 151 ##
(0.0, 0.0)
sage: lt.coefficient(-1) ## line 153 ##
(-5.0, -5.0)
sage: lt.coefficient(b[3] + b[4]) ## line 158 ##
sage: lt.coefficient(2*b[3]) ## line 162 ##
sage: mip.<q> = MixedIntegerLinearProgram(solver='ppl') ## line 166 ##
sage: lt.coefficient(q[0]) ## line 167 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 171 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 197 ##
sage: R.<s,t> = RDF[] ## line 198 ##
sage: LT = LinearFunctionsParent(RDF).tensor(R) ## line 199 ##
sage: LT.an_element()  # indirect doctest ## line 200 ##
(s) + (5.0*s)*x_2 + (7.0*s)*x_5
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 203 ##
sage: LT.an_element()  # indirect doctest ## line 204 ##
(1.0, 0.0) + (5.0, 0.0)*x_2 + (7.0, 0.0)*x_5
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 206 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 235 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^(2,2)) ## line 236 ##
sage: LT.an_element()  # indirect doctest ## line 237 ##
[1 + 5*x_2 + 7*x_5 1 + 5*x_2 + 7*x_5]
[1 + 5*x_2 + 7*x_5 1 + 5*x_2 + 7*x_5]
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 240 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 278 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 279 ##
sage: LT({0: [1,2], 3: [-7,-8]}) + LT({2: [5,6], 3: [2,-2]}) + 16 ## line 280 ##
(16.0, 16.0) + (1.0, 2.0)*x_0 + (5.0, 6.0)*x_2 + (-5.0, -10.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 282 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 298 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 299 ##
sage: -LT({0: [1,2], 3: [-7,-8]}) ## line 300 ##
(-1.0, -2.0)*x_0 + (7.0, 8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 302 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 322 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 323 ##
sage: LT({0: [1,2], 3: [-7,-8]}) - LT({1: [1,2]}) ## line 324 ##
(1.0, 2.0)*x_0 + (-1.0, -2.0)*x_1 + (-7.0, -8.0)*x_3
sage: LT({0: [1,2], 3: [-7,-8]}) - 16 ## line 326 ##
(-16.0, -16.0) + (1.0, 2.0)*x_0 + (-7.0, -8.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 328 ##
0
sage: from sage.numerical.linear_functions import LinearFunctionsParent ## line 348 ##
sage: LT = LinearFunctionsParent(RDF).tensor(RDF^2) ## line 349 ##
sage: 10 * LT({0: [1,2], 3: [-7,-8]}) ## line 350 ##
(10.0, 20.0)*x_0 + (-70.0, -80.0)*x_3
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 352 ##
0
sage: mip.<x> = MixedIntegerLinearProgram() ## line 364 ##
sage: lt0 = x[0] * vector([1,2]) ## line 365 ##
sage: lt1 = x[1] * vector([2,3]) ## line 366 ##
sage: lt0.__le__(lt1)    # indirect doctest ## line 367 ##
(1.0, 2.0)*x_0 <= (2.0, 3.0)*x_1
sage: mip.<x> = MixedIntegerLinearProgram() ## line 372 ##
sage: from sage.numerical.linear_functions import LinearFunction ## line 373 ##
sage: x[0] * vector([1,2]) <= x[1] * vector([2,3]) ## line 374 ##
(1.0, 2.0)*x_0 <= (2.0, 3.0)*x_1
sage: x[0] * vector([1,2]) >= x[1] * vector([2,3]) ## line 377 ##
(2.0, 3.0)*x_1 <= (1.0, 2.0)*x_0
sage: x[0] * vector([1,2]) == x[1] * vector([2,3]) ## line 380 ##
(1.0, 2.0)*x_0 == (2.0, 3.0)*x_1
sage: x[0] * vector([1,2]) < x[1] * vector([2,3]) ## line 383 ##
sage: x[0] * vector([1,2]) > x[1] * vector([2,3]) ## line 388 ##
sage: lt = x[0] * vector([1,2]) ## line 395 ##
sage: cm = sage.structure.element.get_coercion_model() ## line 396 ##
sage: cm.explain(10, lt, operator.le) ## line 397 ##
Coercion on left operand via
    Coercion map:
      From: Integer Ring
      To:   Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
Arithmetic performed after coercions.
Result lives in Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
Tensor product of Vector space of dimension 2 over Real Double Field and Linear functions over Real Double Field
sage: operator.le(10, lt) ## line 406 ##
(10.0, 10.0) <= (1.0, 2.0)*x_0
sage: lt <= 1 ## line 408 ##
(1.0, 2.0)*x_0 <= (1.0, 1.0)
sage: lt >= 1 ## line 410 ##
(1.0, 1.0) <= (1.0, 2.0)*x_0
sage: 1 <= lt ## line 412 ##
(1.0, 1.0) <= (1.0, 2.0)*x_0
sage: 1 >= lt ## line 414 ##
(1.0, 2.0)*x_0 <= (1.0, 1.0)
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 416 ##
0
sage: p = MixedIntegerLinearProgram() ## line 444 ##
sage: lt0 = p[0] * vector([1,2]) ## line 445 ##
sage: hash(lt0)   # random output ## line 446 ##
-9223372036499170180
sage: d = {} ## line 448 ##
sage: d[lt0] = 3 ## line 449 ##
sage: f = p[0] * vector([1]) ## line 455 ##
sage: g = p[0] * vector([1]) ## line 456 ##
sage: set([f, f]) ## line 457 ##
{((1.0))*x_0}
sage: set([f, g]) ## line 459 ##
{((1.0))*x_0, ((1.0))*x_0}
sage: len(set([f, f+1])) ## line 461 ##
2
sage: d = {} ## line 464 ##
sage: d[f] = 123 ## line 465 ##
sage: d[g] = 456 ## line 466 ##
sage: len(list(d)) ## line 467 ##
2
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 469 ##
0

This has been discussed at #27587, but I think it deserves its own ticket. For some reason, removing the two doctests for the strict comparisons < and > makes the failure go away:

  • src/sage/numerical/linear_tensor_element.pyx

    diff --git a/src/sage/numerical/linear_tensor_element.pyx b/src/sage/numerical/linear_tensor_element.pyx
    index 597f96f953..fbd1f58a45 100644
    --- a/src/sage/numerical/linear_tensor_element.pyx
    +++ b/src/sage/numerical/linear_tensor_element.pyx
    @@ -380,16 +380,6 @@ cdef class LinearTensor(ModuleElement):
                 sage: x[0] * vector([1,2]) == x[1] * vector([2,3])
                 (1.0, 2.0)*x_0 == (2.0, 3.0)*x_1
     
    -            sage: x[0] * vector([1,2]) < x[1] * vector([2,3])
    -            Traceback (most recent call last):
    -            ...
    -            ValueError: strict < is not allowed, use <= instead.
    -
    -            sage: x[0] * vector([1,2]) > x[1] * vector([2,3])
    -            Traceback (most recent call last):
    -            ...
    -            ValueError: strict > is not allowed, use >= instead.
    -
             TESTS::
     
                 sage: lt = x[0] * vector([1,2])

but I don't know why.

Change History (19)

comment:1 Changed 3 years ago by jhpalmieri

  • Description modified (diff)

comment:2 Changed 3 years ago by gh-mwageringel

  • Summary changed from py3 + OS X + linear_tensor_element.pyx to py3 + linear_tensor_element.pyx

This problem still exists and is not limited to OS X. It appears in the patchbot results occasionally, for example here based on 9.0beta7: CentOS, LinuxMint.

comment:3 Changed 3 years ago by klee

No one has proposed a solution. How about adopting John's temporary workaround here, just to move Sage forward on Python 3? We can create a separate ticket to track the underlying issue.

comment:4 Changed 3 years ago by chapoton

Indeed, this is also happening with LinuxMint:

https://patchbot.sagemath.org/log/0/LinuxMint/19.2/x86_64/4.15.0-65-generic/pc72/2019-12-02%2002:01:40

There is no urgency to fix this for python3. The switch to python3 will happen very soon anyway.

comment:5 Changed 3 years ago by vbraun

  • Keywords random_fail added

comment:6 Changed 3 years ago by embray

I'm still surprised this isn't similar or related to #28106. Memory exhaustion is the most likely culprit for random failures like this.

comment:7 Changed 3 years ago by vbraun

I think there is an underlying memory corruption bug here.

  • The test is almost trivial and doesn't use a significant amount of memory
  • The traceback is from when the glpk memory structure is freed, after the computation succeeded

The glpk pool allocator has some headers on allocated memory regions to check violations, and this is being triggered here.
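
To illustrate why a header check like this fires at free time rather than at the moment of corruption, here is a minimal pure-Python sketch. It is not GLPK's actual allocator or memory layout: GuardedArena, MAGIC, HEADER and demo_overrun are hypothetical names made up for this example, and the header format is an assumption.

```python
import struct

MAGIC = 0x600DF00D             # hypothetical marker value, not GLPK's
HEADER = struct.Struct("<II")  # (marker, payload size) stored before each block


class GuardedArena:
    """Toy bump allocator that validates a header on every free()."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.top = 0

    def alloc(self, n):
        end = self.top + HEADER.size + n
        if end > len(self.buf):
            raise MemoryError("arena exhausted")
        HEADER.pack_into(self.buf, self.top, MAGIC, n)
        ptr = self.top + HEADER.size
        self.top = end
        return ptr

    def free(self, ptr):
        marker, n = HEADER.unpack_from(self.buf, ptr - HEADER.size)
        if marker != MAGIC:
            # Similar in spirit to "glp_free: memory allocation error".
            raise RuntimeError("free: header check failed")
        # Clear the marker so that a double free is caught as well.
        HEADER.pack_into(self.buf, ptr - HEADER.size, 0, n)


def demo_overrun():
    arena = GuardedArena(64)
    a = arena.alloc(8)
    b = arena.alloc(8)
    # Write 12 bytes into an 8-byte block: the last 4 bytes land on b's marker.
    arena.buf[a:a + 12] = b"\xff" * 12
    arena.free(a)        # a's own header is intact, so this succeeds
    try:
        arena.free(b)    # b's marker was overwritten
    except RuntimeError as exc:
        return str(exc)
    return None
```

In this sketch the bad write itself goes unnoticed, and the error only surfaces when the neighbouring block is freed. That would be consistent with a traceback that points at cleanup code even though the actual corruption happened earlier and elsewhere.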

comment:8 Changed 2 years ago by embray

  • Milestone changed from sage-9.0 to sage-9.1

Ticket retargeted after milestone closed

comment:9 Changed 2 years ago by mkoeppe

  • Milestone changed from sage-9.1 to sage-9.2

Moving tickets to milestone sage-9.2 based on a review of last modification date, branch status, and severity.

comment:10 Changed 23 months ago by mkoeppe

  • Milestone changed from sage-9.2 to sage-9.3

comment:11 Changed 17 months ago by mkoeppe

  • Milestone changed from sage-9.3 to sage-9.4

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

comment:12 Changed 16 months ago by gh-collares

  • Cc gh-collares added

comment:13 follow-up: ↓ 16 Changed 16 months ago by jhpalmieri

I haven't seen this problem in a long time. Has anyone else?

comment:14 Changed 16 months ago by gh-collares

comment:15 Changed 15 months ago by gh-collares

It's easy to find examples of thread-safety issues related to GLPK in other projects, such as https://github.com/jyp/glpk-hs/pull/9. I don't know how Cython's __dealloc__ works, but could it interact badly with GLPK's use of thread-local storage?
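
A minimal pure-Python sketch of the kind of interaction I have in mind. It assumes, without this being confirmed anywhere in this ticket, that the allocator state is per-thread and that the deallocator can run in whichever thread happens to trigger the cyclic garbage collector (the backtrace in the description does show collect being entered from a thread started via t_bootstrap). FakeProblem, env and demo are hypothetical stand-ins, not GLPK, Cython or Sage APIs.

```python
import gc
import threading

# Hypothetical stand-in for a per-thread allocator environment.
_tls = threading.local()

# (creating thread id, finalizing thread id) for every cross-thread free.
mismatches = []


def env():
    """Return this thread's set of live allocations, creating it on first use."""
    if not hasattr(_tls, "live"):
        _tls.live = set()
    return _tls.live


class FakeProblem:
    """Object whose 'memory' is tracked in the creating thread's environment."""

    def __init__(self):
        self.handle = object()
        self.owner = threading.get_ident()
        env().add(self.handle)
        self.cycle = self  # reference cycle: only the cyclic GC can free this

    def __del__(self):
        live = env()  # environment of whichever thread runs the finalizer
        if self.handle in live:
            live.remove(self.handle)
        else:
            # The allocation is unknown in this thread's environment.
            mismatches.append((self.owner, threading.get_ident()))


def demo():
    """Create an object in this thread and let another thread collect it."""
    mismatches.clear()
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the way
    try:
        FakeProblem()  # unreachable immediately, but kept alive by its cycle
        worker = threading.Thread(target=gc.collect)
        worker.start()
        worker.join()
    finally:
        if was_enabled:
            gc.enable()
    return list(mismatches)
```

Calling demo() returns one (owner, finalizer) pair with two different thread ids: the finalizer ran in the worker thread and looked up an environment that never saw the allocation. Whether the real GLPK backend behaves like this is exactly the open question.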

comment:16 in reply to: ↑ 13 Changed 14 months ago by gh-mwageringel

Replying to jhpalmieri:

I haven't seen this problem in a long time. Has anyone else?

A few days ago, this happened on a patchbot with Debian.

comment:17 Changed 11 months ago by mkoeppe

  • Milestone changed from sage-9.4 to sage-9.5

comment:18 Changed 6 months ago by mkoeppe

  • Milestone changed from sage-9.5 to sage-9.6

comment:19 Changed 2 months ago by mkoeppe

  • Milestone changed from sage-9.6 to sage-9.7