Opened 8 years ago

Closed 7 years ago

#12215 closed defect (fixed)

Memleak in UniqueRepresentation, @cached_method

Reported by: vbraun
Owned by:
Priority: major
Milestone: sage-5.7
Component: memleak
Keywords: UniqueRepresentation cached_method caching
Cc: SimonKing, jdemeyer, mhansen, vbraun, jpflori
Merged in: sage-5.7.beta1
Authors: Simon King
Reviewers: Nils Bruin
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by SimonKing)

The documentation says that UniqueRepresentation uses weak references, but it was switched over to the @cached_method decorator. The latter currently uses strong references, so unused unique parents stay in memory forever:

import sage.structure.unique_representation
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

for i in range(2,1000):
    ring = ZZ.quotient(ZZ(i))
    vectorspace = ring^2

import gc
gc.collect()
len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)

Related tickets:

  • #11521 (needs review, introducing weak references for caching homsets), and
  • #715 (needs review, using weak references for caching coerce maps).
  • #5970 (the polynomial ring cache uses strong references; this may now be a duplicate, as I introduce the weak cache in #715)

Further notes:

  • not everything in Python can be weakref'ed, for example None cannot.
  • some results that are expensive to compute should not just be cached by a weak reference. Perhaps there is place for a permanent cache, or maybe some minimal age before garbage collecting it.
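
The first note can be checked in plain Python. The following is a minimal sketch; the names Parent, WeakList and can_weakref are made up for illustration and are not part of Sage. It relies on the documented behaviour that built-in types such as list do not support weak references directly, while their subclasses do.

```python
import weakref


class Parent(object):
    """Stand-in for a Sage parent (hypothetical)."""


class WeakList(list):
    """A list subclass; instances of such subclasses can be weakly referenced."""


def can_weakref(obj):
    """Return True if a weak reference to ``obj`` can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref(Parent()))          # instance of an ordinary class
print(can_weakref(None))              # None cannot be weakly referenced
print(can_weakref([1, 2]))            # neither can a plain list
print(can_weakref(WeakList([1, 2])))  # but a list subclass instance can
```

So a weak cache has to either reject such return values or wrap them in a weakly referenceable container.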

Apply

Attachments (3)

trac12215_weak_cached_function_combined.patch (19.2 KB) - added by SimonKing 7 years ago.
Implement a weak version of cached_function, and use it for UniqueRepresentation. Properly use WeakValueDictionary in UniqueFactory. Combined patch
sage_crash_WgD9iG.log (123.4 KB) - added by SimonKing 7 years ago.
Crash log
trac12215_safe_callback.patch (1.3 KB) - added by SimonKing 7 years ago.
Safer callback in TripleDictEraser


Change History (159)

comment:1 Changed 8 years ago by vbraun

  • Cc SimonKing added

comment:2 Changed 8 years ago by SimonKing

  • Description modified (diff)

comment:3 Changed 8 years ago by SimonKing

See my comment at #5970: It seems that having a weak version of cached_function (which is used to decorate UniqueRepresentation.__classcall__) is the missing bit (in addition to #11521 and #715 and a two-line change in the polynomial ring constructor) for fixing the issues at #5970.

I think this should be done on top of #11115, which rewrites cached methods and already has a positive review.
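
To illustrate what a "weak version of cached_function" could look like, here is a minimal sketch in plain Python. It is not the code from the patch: the decorator name weakly_cached, the class Parent and the function make_parent are assumptions made up for this example, and only positional, hashable arguments are handled.

```python
import functools
import gc
import weakref


def weakly_cached(func):
    """Cache results of ``func`` without keeping them alive (sketch only)."""
    cache = weakref.WeakValueDictionary()

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            result = func(*args)
            # Raises TypeError if ``result`` cannot be weakly referenced
            # (e.g. None or a plain list).
            cache[args] = result
            return result

    wrapper.cache = cache
    return wrapper


class Parent(object):
    """Stand-in for a unique parent (hypothetical)."""

    def __init__(self, n):
        self.n = n


@weakly_cached
def make_parent(n):
    return Parent(n)


p = make_parent(5)
print(make_parent(5) is p)      # same object while a strong reference exists
del p
gc.collect()
print(len(make_parent.cache))   # the entry is gone once the object is collected
```

The cache never keeps a result alive on its own, so uniqueness holds only for as long as some other strong reference to the result exists.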

comment:4 Changed 8 years ago by SimonKing

  • Dependencies set to #11115

comment:5 Changed 8 years ago by SimonKing

Here is a patch. It isn't tested yet.

comment:6 Changed 8 years ago by SimonKing

  • Dependencies changed from #11115 to #11115 #11900

... and I immediately updated the patch: Join categories were not using unique representation but cached_function (by #11900). So, that had to change.

comment:7 Changed 8 years ago by SimonKing

Sorry, it was impossible to use weak_cached_function on the join function in sage.categories.category, since it may return a list (not weakly referenceable). Hence, I had to work around. With the attached patch (applied on top of #11900 and its dependencies), sage at least starts...

comment:8 Changed 8 years ago by SimonKing

It turns out that all the patches can still not fix the problem. We also have to deal with sage.structure.factory.UniqueFactory.

I suggest adding an option to UniqueFactory that decides whether a strong or a weak cache is used. I suggest doing this here, because I don't want to create yet another ticket.

UniqueFactory should mainly be applied in cases where weak references work. Therefore I suggest using the weak cache by default - I am curious how many doctests will fail...

Coercion sucks.

comment:9 Changed 8 years ago by SimonKing

It turns out that UniqueFactory already was somehow using weak references, but in an improper way. The new patch version replaces that by WeakValueDictionary.

It doesn't solve the problem, though.

comment:10 Changed 8 years ago by SimonKing

I have slightly updated my patch, so that there is no conflict with #11935.

comment:11 Changed 8 years ago by SimonKing

There is yet another location where it makes sense to use @weak_cached_function: For the cache of dynamic classes!

Namely, dynamic classes are frequently used in the category framework, they have a strong cache, and the parent/element classes keep a pointer to the category they belong to. So, that's preventing categories from being garbage collected.
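
The effect can be reproduced in plain Python. This is only a sketch under simplifying assumptions: Category, class_cache and dynamic_class are made-up stand-ins, not the Sage implementation.

```python
import gc
import weakref


class Category(object):
    """Stand-in for a category object (hypothetical)."""


class_cache = {}  # strong cache: keeps every created class alive


def dynamic_class(name, category):
    """Create a class that points back to ``category`` and cache it strongly."""
    cls = type(name, (object,), {"_category": category})
    class_cache[name] = cls
    return cls


cat = Category()
dynamic_class("ElementClass", cat)
probe = weakref.ref(cat)   # lets us observe whether ``cat`` is still alive

del cat
gc.collect()
alive_with_strong_cache = probe() is not None
print(alive_with_strong_cache)   # the cached class still references the category

class_cache.clear()
gc.collect()
alive_after_clearing = probe() is not None
print(alive_after_clearing)      # nothing holds the class, hence nothing holds the category
```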

I think that my patches from here, #715, and #11935 (which reduces the number of dynamic classes created) might actually be enough to fix the problem. When I run

sage: for p in primes(2,1000000):
....:     R = GF(p)['x','y','z']
....:     print get_memory_usage()

then one initially still sees an increased memory usage. But after a while it seems to stabilise.

comment:12 Changed 8 years ago by SimonKing

  • Description modified (diff)
  • Status changed from new to needs_review

I have updated the patch. It documents the changes, and at least the tests in sage/misc/cachefunc.pyx, in sage/categories/..., in sage/rings/... and in sage/structure/unique_representation.py pass.

Hence, needs review!

comment:13 Changed 8 years ago by SimonKing

  • Authors set to Simon King

comment:14 Changed 8 years ago by SimonKing

  • Description modified (diff)

comment:15 Changed 8 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to segfaults for elliptic curves

While the tests in sage/categories, sage/rings and sage/structure/unique_representation.py pass, I get some segfaults for the elliptic curve tests. Thus, needs work.

comment:16 Changed 8 years ago by SimonKing

I did sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py", and it did not reveal a segfault while running the tests. The test process itself crashed:

830 tests in 54 items.
830 passed and 0 failed.
Test passed.
The doctested process was killed by signal 11
         [23.8 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t --verbose "devel/sage-main/sage/schemes/elliptic_curves/ell_point.py" # Killed/crashed

Strange.

comment:17 Changed 8 years ago by SimonKing

I think I found the problem.

Some doctest of the form

sage: K.residue_field()
<expected answer>

segfaults. But when the result is assigned to a variable, like this

sage: RF = K.residue_field(); RF
<expected answer>

then everything works.

Is it perhaps the case that garbage collection of the residue field (that was enabled by my patch) happens between the creation and the computation of the string representation of the object?

But that is strange. There are variables _ and __, which are supposed to provide strong references to the last two results - hence, there should be no garbage collection.

comment:18 Changed 8 years ago by SimonKing

sage.structure.factory.UniqueFactory did use weak references before. But it did so - I think - improperly, namely without using weakref.WeakValueDictionary. The new patch version changes that.

It isn't ready for review, yet, because of the segfaults.

comment:19 Changed 8 years ago by SimonKing

Some old code is not using the cache: There was some coerce map created in sage/rings/residue_field.pyx, whose parent was not created by Hom(domain,codomain), but directly by RingHomset(domain,codomain).

Changing it fixed at least one segfault. I wish all segfaults would go away so easily...

comment:20 Changed 8 years ago by SimonKing

Fortunately, I now have a short example that triggers a memory access error when leaving Sage:

sage: E = EllipticCurve('15a1')
sage: K.<t>=NumberField(x^2+2*x+10)
sage: EK=E.base_extend(K)
sage: EK.torsion_subgroup()
Torsion Subgroup isomorphic to Z/4 + Z/4 associated to the Elliptic Curve defined by y^2 + x*y + y = x^3 + x^2 + (-10)*x + (-10) over Number Field in t with defining polynomial x^2 + 2*x + 10
sage: quit
Exiting Sage (CPU time 0m1.98s, Wall time 0m52.03s).
local/bin/sage-sage: line 303: 30045 Segmentation fault  sage-ipython "$@" -i

However, I wonder how I can trigger the error without leaving Sage, and how I can trace what is going on.

comment:21 Changed 8 years ago by SimonKing

Actually EK._torsion_bound(number_of_places=20) is enough to trigger the memory access error.

comment:22 Changed 8 years ago by vbraun

Here is the stack:

Program terminated with signal 11, Segmentation fault.
#0  cgetg (y=22, x=<optimized out>) at ../src/kernel/none/level1.h:114
114	../src/kernel/none/level1.h: No such file or directory.
	in ../src/kernel/none/level1.h
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.16-gdb.py", line 59, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
  File "/usr/lib64/../share/gcc-4.6.2/python/libstdcxx/v6/printers.py", line 19, in <module>
    import itertools
ImportError: No module named itertools
Missing separate debuginfos, use: debuginfo-install atlas-3.8.4-1.fc16.x86_64 expat-2.0.1-11.fc15.x86_64 fontconfig-2.8.0-4.fc16.x86_64 keyutils-libs-1.5.2-1.fc16.x86_64 krb5-libs-1.9.2-4.fc16.x86_64 libcom_err-1.41.14-2.fc15.x86_64 libselinux-2.1.6-5.fc16.x86_64 ncurses-libs-5.9-2.20110716.fc16.x86_64 openssl-1.0.0e-1.fc16.x86_64
(gdb) bt
#0  cgetg (y=22, x=<optimized out>) at ../src/kernel/none/level1.h:114
#1  convi (x=0x288b2a8, l=0x7fff6a2f8a38) at ../src/kernel/gmp/mp.c:1288
#2  0x00007f11fb1637ec in itostr_sign (x=<optimized out>, sx=1, len=0x7fff6a2f8b48) at ../src/language/es.c:500
#3  0x00007f11fb167b4f in str_absint (x=0x288b2a8, S=0x7fff6a2f8cb0) at ../src/language/es.c:1778
#4  bruti_intern (g=0x288b2a8, T=<optimized out>, S=0x7fff6a2f8cb0, addsign=1) at ../src/language/es.c:2557
#5  0x00007f11fb168453 in bruti_intern (g=0x288b2d8, T=0x7f11fb4b27a0, S=0x7fff6a2f8cb0, addsign=<optimized out>)
    at ../src/language/es.c:2730
#6  0x00007f11fb1679ae in GENtostr_fun (out=0x7f11fb16a7b0 <bruti>, T=0x7f11fb4b27a0, x=0x288b2d8)
    at ../src/language/es.c:1645
#7  GENtostr (x=0x288b2d8) at ../src/language/es.c:1651
#8  0x00007f11f5ae5c44 in gcmp_sage (y=0x583d1b8, x=<optimized out>) at sage/libs/pari/misc.h:60
#9  __pyx_f_4sage_4libs_4pari_3gen_3gen__cmp_c_impl (__pyx_v_left=<optimized out>, __pyx_v_right=<optimized out>)
    at sage/libs/pari/gen.c:8513
#10 0x00007f11f8663227 in __pyx_f_4sage_9structure_7element_7Element__richcmp_c_impl (__pyx_v_left=0x5780e10, 
    __pyx_v_right=<optimized out>, __pyx_v_op=2) at sage/structure/element.c:7775
#11 0x00007f11f86875ec in __pyx_f_4sage_9structure_7element_7Element__richcmp (__pyx_v_left=0x5780e10, 
    __pyx_v_right=0x5863f70, __pyx_v_op=2) at sage/structure/element.c:7498
#12 0x00007f11f5ae045b in __pyx_pf_4sage_4libs_4pari_3gen_3gen_44__richcmp__ (__pyx_v_left=<optimized out>, 
    __pyx_v_right=<optimized out>, __pyx_v_op=<optimized out>) at sage/libs/pari/gen.c:8475
#13 0x00007f1208b32e6a in try_rich_compare (v=0x5780e10, w=0x5863f70, op=2) at Objects/object.c:619
#14 0x00007f1208b3518d in try_rich_compare_bool (op=<optimized out>, w=<optimized out>, v=<optimized out>)
    at Objects/object.c:647
#15 try_rich_to_3way_compare (w=0x5863f70, v=0x5780e10) at Objects/object.c:681
#16 do_cmp (w=0x5863f70, v=0x5780e10) at Objects/object.c:834
#17 PyObject_Compare (v=0x5780e10, w=0x5863f70) at Objects/object.c:863
#18 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f8f0c)
    at Objects/abstract.c:41
#19 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#20 0x00007f1208b917fd in call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9000) at Python/ceval.c:3706
#21 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#22 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:2968
#23 0x00007f1208b1f7f6 in function_call (func=0x1fec9b0, arg=0x5864518, kw=0x0) at Objects/funcobject.c:524
#24 0x00007f1208af97a3 in PyObject_Call (func=0x1fec9b0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#25 0x00007f1208b0667f in instancemethod_call (func=0x1fec9b0, arg=0x5864518, kw=0x0)
    at Objects/classobject.c:2579
#26 0x00007f1208af97a3 in PyObject_Call (func=0x579d0f0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#27 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#28 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x5713af0, other=0x5866af0) at Objects/typeobject.c:5278
#29 0x00007f1208b35260 in do_cmp (w=0x5866af0, v=0x5713af0) at Objects/object.c:817
#30 PyObject_Compare (v=0x5713af0, w=0x5866af0) at Objects/object.c:863
#31 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f955c)
    at Objects/abstract.c:41
#32 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#33 0x00007f1208af97a3 in PyObject_Call (func=0x7f120903e2d8, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#34 0x00007f11e5b541fc in __pyx_pf_4sage_5rings_13residue_field_20ResidueField_generic_8__cmp__ (
    __pyx_self=<optimized out>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>)
    at sage/rings/residue_field.c:7317
#35 0x00007f1208af97a3 in PyObject_Call (func=0x22e7200, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#36 0x00007f1208b0667f in instancemethod_call (func=0x22e7200, arg=0x586d320, kw=0x0)
    at Objects/classobject.c:2579
#37 0x00007f1208af97a3 in PyObject_Call (func=0x4e32aa0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#38 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#39 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x57f2410, other=0x5906e90) at Objects/typeobject.c:5278
#40 0x00007f1208b35260 in do_cmp (w=0x5906e90, v=0x57f2410) at Objects/object.c:817
#41 PyObject_Compare (v=0x57f2410, w=0x5906e90) at Objects/object.c:863
#42 0x00007f1208af5ae5 in PyObject_Cmp (o1=<optimized out>, o2=<optimized out>, result=0x7fff6a2f99bc)
    at Objects/abstract.c:41
#43 0x00007f1208b879d4 in builtin_cmp (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:422
#44 0x00007f1208b917fd in call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9ab0) at Python/ceval.c:3706
#45 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#46 0x00007f1208b92593 in fast_function (nk=<optimized out>, na=2, n=<optimized out>, pp_stack=0x7fff6a2f9c10, 
    func=0x3b0a410) at Python/ceval.c:3792
#47 call_function (oparg=<optimized out>, pp_stack=0x7fff6a2f9c10) at Python/ceval.c:3727
#48 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389
#49 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:2968
#50 0x00007f1208b1f7f6 in function_call (func=0x3af28c0, arg=0x588d7a0, kw=0x0) at Objects/funcobject.c:524
#51 0x00007f1208af97a3 in PyObject_Call (func=0x3af28c0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#52 0x00007f1208b0667f in instancemethod_call (func=0x3af28c0, arg=0x588d7a0, kw=0x0)
    at Objects/classobject.c:2579
#53 0x00007f1208af97a3 in PyObject_Call (func=0x4e29be0, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#54 0x00007f1208b545c6 in half_compare (self=<optimized out>, other=<optimized out>) at Objects/typeobject.c:5253
#55 0x00007f1208b547a5 in _PyObject_SlotCompare (self=0x579f908, other=0x58c5528) at Objects/typeobject.c:5278
#56 0x00007f1208b34dad in PyObject_RichCompare (v=0x579f908, w=0x58c5528, op=2) at Objects/object.c:967
#57 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#58 0x00007f1208b49264 in tuplerichcompare (op=2, w=0x5898a70, v=0x578bf38) at Objects/tupleobject.c:546
#59 tuplerichcompare (v=0x578bf38, w=0x5898a70, op=2) at Objects/tupleobject.c:517
#60 0x00007f1208b34d71 in PyObject_RichCompare (v=0x578bf38, w=0x5898a70, op=2) at Objects/object.c:958
#61 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#62 0x00007f1208b49264 in tuplerichcompare (op=2, w=0x5898ab8, v=0x579c368) at Objects/tupleobject.c:546
#63 tuplerichcompare (v=0x579c368, w=0x5898ab8, op=2) at Objects/tupleobject.c:517
#64 0x00007f1208b34d71 in PyObject_RichCompare (v=0x579c368, w=0x5898ab8, op=2) at Objects/object.c:958
#65 0x00007f1208b3505f in PyObject_RichCompareBool (v=<optimized out>, w=<optimized out>, op=<optimized out>)
    at Objects/object.c:1001
#66 0x00007f1208b2f305 in lookdict (mp=0x14c51d0, key=<optimized out>, hash=-1399715627429533172)
    at Objects/dictobject.c:351
#67 0x00007f1208b3087c in PyDict_DelItem (op=0x14c51d0, key=0x5898ab8) at Objects/dictobject.c:742
#68 0x00007f1208b8e924 in PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:1555
#69 0x00007f1208b934d9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, 
    args=<optimized out>, argcount=1, kws=0x0, kwcount=0, defs=0x1458be8, defcount=1, closure=0x0)
    at Python/ceval.c:2968
#70 0x00007f1208b1f7f6 in function_call (func=0x1464320, arg=0x7f1208fdf510, kw=0x0) at Objects/funcobject.c:524
#71 0x00007f1208af97a3 in PyObject_Call (func=0x1464320, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2492
#72 0x00007f1208afa1e0 in PyObject_CallFunctionObjArgs (callable=0x1464320) at Objects/abstract.c:2723
#73 0x00007f1208bc4146 in handle_weakrefs (old=0x7f1208e52b40, unreachable=0x7fff6a2fa700)
    at Modules/gcmodule.c:607
#74 collect (generation=2) at Modules/gcmodule.c:859
#75 0x00007f1208bc4b04 in PyGC_Collect () at Modules/gcmodule.c:1292
#76 0x00007f1208bb6d73 in Py_Finalize () at Python/pythonrun.c:424
#77 0x00007f1208bb5c38 in Py_Exit (sts=0) at Python/pythonrun.c:1714
#78 0x00007f1208bb5d2f in handle_system_exit () at Python/pythonrun.c:1116
#79 0x00007f1208bb5fc5 in handle_system_exit () at Python/pythonrun.c:1078
#80 PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:1126
#81 0x00007f1208bb643e in PyRun_SimpleFileExFlags (fp=<optimized out>, filename=<optimized out>, closeit=1, 
    flags=0x7fff6a2fa9e0) at Python/pythonrun.c:935
#82 0x00007f1208bc35a3 in Py_Main (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:599
#83 0x00007f1207e7569d in __libc_start_main (main=0x400620 <main>, argc=3, ubp_av=0x7fff6a2fab08, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff6a2faaf8)
    at libc-start.c:226
#84 0x0000000000400651 in _start ()

For the record, I enabled coredumps and then ran gdb --core core.4522 local/bin/python
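
Frames #66 to #73 of the trace show a weak-reference callback, run by the garbage collector during Py_Finalize, deleting a dictionary entry; the dictionary lookup then compares keys, and that comparison ends up in PARI code. The following plain-Python sketch reproduces only this mechanism. The names Key, Value, cache and eraser are made up for illustration and are not Sage code.

```python
import weakref

comparisons = []   # records every key comparison


class Key(object):
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 0   # force hash collisions so that lookups must call __eq__

    def __eq__(self, other):
        comparisons.append((self.name, other.name))
        return self.name == other.name


class Value(object):
    """Weakly referenceable cached value (hypothetical)."""


cache = {}
k1, k2 = Key("a"), Key("b")
v1, v2 = Value(), Value()

cache[k1] = weakref.ref(v1)


def eraser(ref, key=k2):
    # Weakref callback: remove the dead entry from the cache.
    del cache[key]


cache[k2] = weakref.ref(v2, eraser)

del comparisons[:]   # forget comparisons made while filling the cache
del v2               # in CPython the callback runs as soon as v2 dies
print(comparisons)   # the deletion had to compare keys
print(len(cache))
```

If the comparison code depends on an external library that has already been shut down, the callback crashes exactly as in the trace.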

comment:23 Changed 8 years ago by SimonKing

Thank you! How does one enable coredumps?

comment:24 Changed 8 years ago by SimonKing

If I understand correctly, the coredump says that it occurs while doing c = cmp(self.p, x.p), where x and self are residue fields.

comment:25 Changed 8 years ago by SimonKing

Yep, I just inserted a print statement before and after the "cmp" line. When leaving sage, the first line was printed, and the segfault happened before printing the second line. Hence, the problem occurs when comparing fractional ideals.

comment:26 Changed 8 years ago by vbraun

To enable coredumps (at least with bash):

ulimit -c unlimited
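
For completeness, the same limit can also be raised from inside a Python process with the standard resource module (Unix only). This is a sketch, not something used on this ticket; it raises the soft limit only as far as the hard limit allows and affects the current process and its children.

```python
import resource

# Current soft and hard limits on the size of core files.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

# Raise the soft limit to the hard limit; if the hard limit is
# resource.RLIM_INFINITY this corresponds to "ulimit -c unlimited".
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)
print(new_soft == new_hard)
```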

comment:27 Changed 8 years ago by SimonKing

Aha! A comparison of two sage.libs.pari.gen.gen happens after _unsafe_deallocate_pari_stack is called, which closes pari. That is, of course, bad.

Only I wonder how the order can be changed. Alternatively, it could be tested before comparison whether pari is still alive. But that would result in a slow-down.

comment:28 Changed 8 years ago by SimonKing

I guess one must make sure that there is a strong reference to the (unique?) pari instance until all sage.libs.pari.gen.gen are deallocated.
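
The idea can be sketched in plain Python with CPython's reference counting. Library and Element are hypothetical stand-ins for the PARI instance and its elements, not the actual Sage classes.

```python
events = []   # records the order in which objects are finalized


class Library(object):
    """Stand-in for the object that owns the C library state."""

    def __del__(self):
        events.append("library closed")


class Element(object):
    """Stand-in for an element that needs the library to stay open."""

    def __init__(self, lib, name):
        self._lib = lib   # strong reference keeps the library alive
        self.name = name

    def __del__(self):
        events.append("element %s freed" % self.name)


lib = Library()
x = Element(lib, "x")

del lib          # the library is NOT closed yet: x still references it
print(events)

del x            # now the element goes first, then the library
print(events)
```

The price is one extra pointer per element.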

comment:29 Changed 8 years ago by SimonKing

pari._unsafe_deallocate_pari_stack is called in sage.all.quit_sage. It does not help to move it to the end of quit_sage. I wonder why it is not put into a proper __del__ method of PariInstance? Is it really needed to be in quit_sage??

comment:30 Changed 8 years ago by SimonKing

Yessss! When removing _unsafe_deallocate_pari_stack from quit_sage and renaming it into a __dealloc__ method, then the segfault vanishes!

comment:31 Changed 8 years ago by SimonKing

Too bad. It fixes the segfault of sage -t sage/rings/number_field/number_field_ideal.py, but it doesn't help for sage -t sage/schemes/elliptic_curves/heegner.py.

Why is it always the elliptic curves code that causes trouble for my patches?

comment:32 Changed 8 years ago by SimonKing

I have attached a second patch, that fixes two or three segfaults - which isn't enough.

comment:33 follow-up: Changed 8 years ago by vbraun

  • Cc jdemeyer added

In Python one must not use __dealloc__() to free C resources, at least not unless you are absolutely certain that the Python object does not participate in circular references.

Does it help to do an explicit gc.collect() at the end of quit_sage and only then deallocate Pari? If not we might have to give up clearing the Pari stack...

comment:34 in reply to: ↑ 33 Changed 8 years ago by SimonKing

Replying to vbraun:

In Python one must not use __dealloc__() to free C resources, at least not unless you are absolutely certain that the Python object does not participate in circular references.

Do you mean __del__? If I remember correctly, __dealloc__ is Cython, has nothing to do with the ability of the garbage collector to deal with circular references, and it is what one must have if there are C-resources to free after deleting all Python stuff. So, from all what I know, using __dealloc__ (not __del__!) is a clean solution.

Does it help to do an explicit gc.collect() at the end of quit_sage and only then deallocate Pari? If not we might have to give up clearing the Pari stack...

I didn't try.

comment:35 Changed 8 years ago by vbraun

Yes, you are right: __dealloc__ is ok, __del__ is not.

But the problem seems to be that we finalize Pari before finalizing all Pari elements. Ideally, elements keep their parent alive because they hold a reference but I think GENs are often used in an ad-hoc way in Sage. So moving the Pari finalizer to __dealloc__ just makes it run later, but still gives no guarantees about finalizer ordering.

comment:36 Changed 8 years ago by SimonKing

Pari elements have no parent, if I am not mistaken. Adding a parent means: Create an overhead, namely an additional pointer as part of all Pari elements. I am not sure if the number theorists would like that - one might ask on sage-nt.

comment:37 Changed 8 years ago by vbraun

There is sage.rings.pari_ring, which implements parents and elements. But when Pari is used in the Sage library, it's usually directly via its C API.

Looking at Python's C API, it seems that Py_AtExit() is what we want: A callback for a cleanup function that is run after Python is finalized. In fact anything from quit_sage() that just finalizes a C library should probably be moved there. See http://docs.python.org/c-api/sys.html

comment:38 follow-up: Changed 8 years ago by jdemeyer

Not sure why I was added to "cc". But the newly added doctest in trac12215_segfault_fixes.patch looks bad because there really should be only one running PariInstance, since global variables are used for the PARI stack (this is the fault of PARI, not of Sage).

comment:39 in reply to: ↑ 38 ; follow-up: Changed 8 years ago by SimonKing

Replying to jdemeyer:

But the newly added doctest in trac12215_segfault_fixes.patch looks bad because there really should be only one running PariInstance

How else could one test that __dealloc__ works?

comment:40 in reply to: ↑ 39 ; follow-up: Changed 8 years ago by jdemeyer

Replying to SimonKing:

How else could one test that __dealloc__ works?

By Sage not crashing upon exit. I don't see any other way here.

comment:41 in reply to: ↑ 40 Changed 8 years ago by SimonKing

Replying to jdemeyer:

Replying to SimonKing:

How else could one test that __dealloc__ works?

By Sage not crashing upon exit. I don't see any other way here.

OK. I just thought Sage has a 100% doctest policy.

comment:42 Changed 8 years ago by SimonKing

  • Work issues changed from segfaults for elliptic curves to fix it...

With sage-5.0.prealpha0 plus #11780 plus #715 plus #11521 plus #12290, all tests pass. But if one adds the two patches from here, one gets

        sage -t  -force_lib devel/sage/sage/combinat/combinatorial_algebra.py # 4 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/partition.py # 3 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/kschur.py # 17 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/sfa.py # 284 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/macdonald.py # 107 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/hall_littlewood.py # 61 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/llt.py # 50 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/monomial.py # 16 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/orthotriang.py # 25 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/elementary.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/homogeneous.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/dual.py # 87 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/schur.py # 13 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/ns_macdonald.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/powersum.py # 17 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/classical.py # 9 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/sf/jack.py # 35 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/product_species.py # 1 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/composition_species.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/functorial_composition_species.py # 3 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/generating_series.py # 44 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/library.py # 4 doctests failed
        sage -t  -force_lib devel/sage/sage/combinat/species/species.py # 2 doctests failed
        sage -t  -force_lib devel/sage/sage/libs/pari/gen.pyx # Killed/crashed

Hopefully most of these errors have a common root.

comment:43 Changed 8 years ago by SimonKing

It seems that a good deal of the errors come from the method sage.combinat.sf.sf.SymmetricFunctions.register_isomorphism: It registers a coercion, but this is only possible when no coercion has been established for that object before.

What should one do: Catch the error and not register the coercion? Or wipe the registered coercions by calling sage.structure.parent.Parent.unset_coercions_used?

comment:44 Changed 8 years ago by SimonKing

I have to slightly modify my preceding statement. The error is raised not if there has been established a coercion for that object before, but if there has been a coercion registered between the two objects before. Anyway, the problem remains the same.

comment:45 Changed 8 years ago by SimonKing

Here is a very short example triggering the error:

sage: P = JackPolynomialsP(QQ,1)
sage: P([2,1])^2

Hopefully this is short enough for debugging - I find it quite mysterious, so far.

comment:46 Changed 8 years ago by SimonKing

  • Cc mhansen added

Something else is interesting: The error changes when repeating it.

sage: P = JackPolynomialsP(QQ,1)
sage: p = P([2,1])
sage: p^2
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (56, 0))

ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.

Traceback (most recent call last):
...
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/combinat/sf/sf.pyc in register_isomorphism(self, morphism)
    324         mathematically wrong, as above. Use with care!
    325         """
--> 326         morphism.codomain().register_coercion(morphism)
    327 
    328     _shorthands = set(['e', 'h', 'm', 'p', 's'])

/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/structure/parent.so in sage.structure.parent.Parent.register_coercion (sage/structure/parent.c:11955)()

/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/structure/parent.so in sage.structure.parent.Parent.register_coercion (sage/structure/parent.c:11889)()

AssertionError: coercion from Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis to Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis already registered or discovered
sage: p^2
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (56, 0))

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
...
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/python2.7/site-packages/sage/combinat/sf/sfa.pyc in _from_cache(self, element, cache_function, cache_dict, **subs_dict)
    631             if sum(part) not in cache_dict:
    632                 cache_function(sum(part))
--> 633             for part2, c2 in cache_dict[sum(part)][part].iteritems():
    634                 c3 = c*c2
    635                 if hasattr(c3,'subs'): # c3 may be in the base ring

KeyError: [2, 1]

Cc to the author of sage/combinat/sf/jack.py.

comment:47 Changed 8 years ago by SimonKing

I inserted some print statements into the register_isomorphism method of symmetric functions. I found that, with or without the patch, the isomorphisms are registered both during initialisation of JackPolynomialsP and before raising an element to a power for the first time:

sage: P = JackPolynomialsP(QQ,1)
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Power symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Schur symmetric functions as basis
registering Symmetric Function Algebra over Rational Field, Power symmetric functions as basis  TO  Symmetric Function Algebra over Rational Field, Monomial symmetric functions as basis
sage: p = P([2,1])
sage: p^2
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Elementary symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Homogeneous symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Schur symmetric functions as basis
registering Symmetric Function Algebra over Integer Ring, Power symmetric functions as basis  TO  Symmetric Function Algebra over Integer Ring, Monomial symmetric functions as basis
JackP[2, 2, 1, 1] + JackP[2, 2, 2] + JackP[3, 1, 1, 1] + 2*JackP[3, 2, 1] + JackP[3, 3] + JackP[4, 1, 1] + JackP[4, 2]
sage: p^2
JackP[2, 2, 1, 1] + JackP[2, 2, 2] + JackP[3, 1, 1, 1] + 2*JackP[3, 2, 1] + JackP[3, 3] + JackP[4, 1, 1] + JackP[4, 2]

This gives rise to some questions:

  • Why are the symmetric functions registering the isomorphisms twice, even without my patch?
  • Why is there no error without my patch? There should be, since double-registration of a coercion is illegal!

I guess the best solution would be to address the first question: Registering the same thing twice is a waste of resources anyway.

comment:48 Changed 8 years ago by SimonKing

Sorry, I was mistaken! It is not two times the same! The first time it is over the rational field, the second time over the integer ring. So, forget my previous questions.

comment:49 Changed 8 years ago by SimonKing

Now I understand the problem:

SymmetricFunctions is a subclass of UniqueRepresentation. By my patch, UniqueRepresentation is using weak references. Apparently SymmetricFunctions therefore can be garbage collected, but - and now comes the strange point - the coercion system still recalls that a coercion to them has already been registered.

Anyway, with my patch, SymmetricFunctions(ZZ) and SymmetricFunctions(QQ) are created repeatedly, and that's bad.
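The effect described above can be reproduced in plain Python. The following is only a minimal sketch, not the code from the patch: WeakUnique, its _cache and the created counter are hypothetical names, and the immediate collection after del relies on CPython's reference counting.

```python
import gc
import weakref


class WeakUnique(object):
    """Toy stand-in for a parent whose unique instance is cached weakly."""
    _cache = weakref.WeakValueDictionary()
    created = 0  # counts how often a new object really has to be built

    def __new__(cls, key):
        try:
            return cls._cache[key]      # cache hit while the object is alive
        except KeyError:
            pass
        self = super(WeakUnique, cls).__new__(cls)
        self.key = key
        WeakUnique.created += 1
        cls._cache[key] = self          # only a weak reference is stored
        return self


a = WeakUnique("ZZ")
b = WeakUnique("ZZ")
same_while_alive = a is b   # a strong reference exists, so the cache hits

del a, b                    # drop the last strong references
gc.collect()
c = WeakUnique("ZZ")        # the entry has vanished, so a new object is built
times_created = WeakUnique.created
```

If nothing else holds a reference between two uses, every use pays the full construction cost again, which is the repeated creation seen with SymmetricFunctions.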

comment:50 Changed 8 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues fix it... deleted

I updated the second patch, which should solve the problem!!

First of all: The segfault in the tests of sage/libs/pari/gen.pyx was due to my test for the new dealloc method. Following Jeroen's advice, I removed it and stated in the docs that Sage not crashing at exit is an indirect doctest.

Then, all failures in sage/combinat could be fixed by using a strong cache for SymmetricFunctions(...). So, I simply overrode the __classcall__ method inherited from UniqueRepresentation.
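As an illustration of the idea (a sketch in plain Python under assumed names, not the actual __classcall__ override from the patch): a subclass can wrap the weakly cached lookup inherited from its base class with a strong dictionary, so that its own instances stay alive. WeakUnique, StronglyCached and get are hypothetical.

```python
import gc
import weakref


class WeakUnique(object):
    """Hypothetical base class: instances are cached by weak reference."""
    _weak_cache = weakref.WeakValueDictionary()

    def __init__(self, key):
        self.key = key

    @classmethod
    def get(cls, key):
        cache_key = (cls, key)
        obj = cls._weak_cache.get(cache_key)
        if obj is None:
            obj = cls(key)
            cls._weak_cache[cache_key] = obj
        return obj


class StronglyCached(WeakUnique):
    """Subclass that overrides the lookup to keep its instances alive."""
    _strong_cache = {}

    @classmethod
    def get(cls, key):
        cache_key = (cls, key)
        if cache_key not in cls._strong_cache:
            # Delegate construction to the base class, then pin the result.
            cls._strong_cache[cache_key] = super(StronglyCached, cls).get(key)
        return cls._strong_cache[cache_key]


# Keep only weak references, so the caches alone decide about survival.
w = weakref.ref(WeakUnique.get("QQ"))
s = weakref.ref(StronglyCached.get("QQ"))
gc.collect()
weak_survived = w() is not None
strong_survived = s() is not None
```

The price of the strong cache is the original leak, restricted to the one class that overrides the lookup.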

I just tested that with the new patch all tests in sage/combinat and sage/libs/pari/gen.pyx pass. The others passed even with the old patch version, so that I am confident that they will pass as well (of course, one must try!).

comment:51 Changed 8 years ago by SimonKing

FWIW, make ptest succeeded.

comment:52 Changed 8 years ago by SimonKing

As I said in my previous post, the tests pass with this patch. The tests also pass with the patch from #12313. However, there are three segfaults that occur when both patches are applied. I am having difficulty tracking them down.

comment:53 Changed 8 years ago by SimonKing

  • Cc vbraun added

Cc to Volker, because I expect he has enough knowledge to give me some advice on how I could trace down the following segfault.

With #12313 and the patch from here, sage -t -verbose -force_lib "devel/sage/doc/en/bordeaux_2008/half_integral.rst" segfaults. By inspection of the core file, I found that the segfault occurs during deallocation of a functor.

For debugging, I added a __dealloc__ method to sage.categories.functor.Functor that writes the type and the address of self and of the two cdef attributes __domain and __codomain to some file. The same is done during initialisation of the functor.

And the last lines of the resulting file (before the segfault) are:

Dealloc Functor <type 'sage.structure.coerce_actions.LeftModuleAction'> at 71023056
  Domain <class 'sage.categories.groupoid.Groupoid'> at 75636560
  Codom. <class 'sage.categories.commutative_rings.CommutativeRings'> at 15429144
Dealloc Functor <type 'sage.structure.coerce_actions.LeftModuleAction'> at 71023056
  Domain <type 'NoneType'> at 140661532564960
  Codom. <type 'NoneType'> at 140661532564960

In other words, the functor is deallocated twice, which is a legitimate reason to segfault.

How can I find out why Sage tries to deallocate it twice?

comment:54 follow-ups: Changed 8 years ago by vbraun

Is it actually being finalized twice? To me, it seems that just the malloc bin was reused for a second LeftModuleAction instance. In particular, why would domain and codomain be different in the second destructor call?

comment:55 in reply to: ↑ 54 Changed 8 years ago by SimonKing

Replying to vbraun:

In particular, why would domain and codomain be different in the second destructor call?

Because domain and codomain were deleted the first time. The second time, they already are NoneType.

comment:56 in reply to: ↑ 54 Changed 8 years ago by SimonKing

Replying to vbraun:

Is it actually being finalized twice? To me, it seems that just the malloc bin was reused for a second LeftModuleAction instance.

And I do believe it is the same instance. Namely, if what you say was right, then we should see a call to "init" between the two deallocations (I made both init and dealloc write to the same log file). But the two deallocations followed directly: No initialisation and no other deallocation in between.
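The argument can be phrased as a small check on the log. This is a hedged sketch that assumes the log has already been parsed into (kind, address) pairs; find_double_deallocs and the event format are made up for illustration, and only the address 71023056 is taken from the log above.

```python
def find_double_deallocs(events):
    """Return addresses that are deallocated while not known to be live.

    events is a sequence of (kind, address) pairs with kind being "init"
    or "dealloc". A reused memory address shows up as init/dealloc/init/
    dealloc, whereas a double deallocation shows up as two "dealloc"
    entries with no "init" in between. Objects created before logging
    started would also be reported, so the result is only a hint.
    """
    live = set()
    suspicious = []
    for kind, address in events:
        if kind == "init":
            live.add(address)
        elif kind == "dealloc":
            if address in live:
                live.remove(address)
            else:
                suspicious.append(address)
    return suspicious


# Address reuse: a second object is created at the same address.
reused = [("init", 71023056), ("dealloc", 71023056),
          ("init", 71023056), ("dealloc", 71023056)]
# Double deallocation: no "init" between the two "dealloc" entries.
double = [("init", 71023056), ("dealloc", 71023056), ("dealloc", 71023056)]

reused_result = find_double_deallocs(reused)
double_result = find_double_deallocs(double)
```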

comment:57 Changed 8 years ago by SimonKing

No progress on my side. For my project, it probably means that I have to pick between two evils: Either live with the memleak that would be fixed in #12313, or live with the memleak that would be fixed here. Bad.

comment:58 Changed 8 years ago by SimonKing

  • Milestone changed from sage-5.0 to sage-4.8

Now that's weird:

When I define

    def __dealloc__(self):
        if self.__domain is not None:
            Py_INCREF(self.__domain)
        if self.__codomain is not None:
            Py_INCREF(self.__codomain)

for sage.categories.functor.Functor, then the segfault disappears.

Can this be a solution? It looks weird.

comment:59 Changed 8 years ago by SimonKing

  • Milestone changed from sage-4.8 to sage-5.0

comment:60 Changed 8 years ago by SimonKing

  • Description modified (diff)

I have updated the second patch, which was about fixing segfaults anyway.

As I already stated: I find it weird that the problem is solved by incrementing the reference count of the domain and codomain of an action when the action is deallocated. But it works, i.e., the doctests that used to segfault with #12313 and the old version of the patches run fine with the new patch version.

I need an expert opinion, though, and the full test suite is also to be run.

Concerning memleaks, here is the example from the ticket description.

With #12313 and the patches from here:

sage: import sage.structure.unique_representation
sage: len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)
135
sage: 
sage: for i in range(2,1000):
....:     ring = ZZ.quotient(ZZ(i))
....:     vectorspace = ring^2
....: 
sage: import gc
sage: gc.collect()
16641
sage: len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)
227

With #12313 only:

sage: import sage.structure.unique_representation
sage: len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)
151
sage: 
sage: for i in range(2,1000):
....:     ring = ZZ.quotient(ZZ(i))
....:     vectorspace = ring^2
....: 
sage: import gc
sage: gc.collect()
3805
sage: len(sage.structure.unique_representation.UniqueRepresentation.__classcall__.cache)
5142

So, this is clear progress, and IIRC the patches include tests against at least one memory leak that is fixed. Needs review!
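For comparison, the same kind of measurement can be made in plain Python, without Sage. This is only a sketch of the principle: Ring and fill are hypothetical stand-ins, and the numbers are not related to the cache sizes quoted above.

```python
import gc
import weakref


class Ring(object):
    """Toy object standing in for a parent such as ZZ.quotient(ZZ(i))."""
    def __init__(self, modulus):
        self.modulus = modulus


def fill(cache, n):
    """Create objects for 2..n-1, cache them, keep no other reference."""
    for i in range(2, n):
        cache[i] = Ring(i)


strong_cache = {}
weak_cache = weakref.WeakValueDictionary()
fill(strong_cache, 1000)
fill(weak_cache, 1000)
gc.collect()

strong_size = len(strong_cache)  # every entry is kept alive by the dict
weak_size = len(weak_cache)      # entries vanish with their values
```

The strong dictionary grows with every object ever created, while the weak one only holds what is still referenced elsewhere.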

Apply trac12215_weak_cached_function.patch trac12215_segfault_fixes.patch

comment:61 Changed 8 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Fix two tests

With sage-5.0.prealpha0 plus #11780, #11290, #715, #11521, #12313 and the patches from here, make ptest results in

        sage -t  -force_lib devel/sage/sage/combinat/sf/sf.py # 1 doctests failed
        sage -t  -force_lib devel/sage/sage/categories/category.py # 1 doctests failed

So, it needs work (because all tests pass when the patches from here are not applied), but it should hopefully be easy to fix.

comment:62 Changed 8 years ago by vbraun

I tried the following in cdef class Action:

    def __cinit__(self):
        print 'Action __cinit__ ' + str(id(self))

    def __dealloc__(self):
        print 'Action __dealloc__ ' + str(id(self))

then I do get occasionally reused id (=memory address in CPython), for example

    Action __cinit__ 105376976
    Action __dealloc__ 105376976
    Action __cinit__ 105376976
    Action __dealloc__ 105376976

But I don't see any double finalizers without the object being constructed in-between. I also don't get any segfault in bordeaux_2008/half_integral.rst.
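The reuse of ids is easy to observe in plain Python as well. The following sketch uses a hypothetical empty class called Action (unrelated to the Cython class) and makes no promise about the outcome, since address reuse depends on the allocator.

```python
class Action(object):
    """Empty placeholder class; only its memory address matters here."""
    pass


def address_is_reused(trials=1000):
    """Return True if some freshly created object gets an id seen before."""
    seen = set()
    for _ in range(trials):
        obj = Action()
        addr = id(obj)   # in CPython, id() is the memory address
        del obj          # the object is freed, its address may be recycled
        if addr in seen:
            return True
        seen.add(addr)
    return False


reused = address_is_reused()
```

So an identical address in two log lines does not by itself show that the same object was finalized twice.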

comment:63 follow-up: Changed 8 years ago by vbraun

For the record, I have these patches applied on top of sage-4.8.rc0:

12221_debug.patch
trac_12247_var_construction.patch
9138_flat.patch
trac11900_category_speedup_combined.patch
trac11900_only_fix_singleton_hash.patch
trac11900_doctest.patch
11115_flat.patch
trac_11115_docfix.patch
trac12215_weak_cached_function.patch
trac12215_segfault_fixes.patch

removed the Py_INCREF(self.__domain) and Py_INCREF(self.__codomain) bandaid. Still no segfault.

comment:64 in reply to: ↑ 63 Changed 8 years ago by SimonKing

Replying to vbraun:

For the record, I have these patches applied on top of sage-4.8.rc0:

12221_debug.patch
trac_12247_var_construction.patch
9138_flat.patch
trac11900_category_speedup_combined.patch
trac11900_only_fix_singleton_hash.patch
trac11900_doctest.patch
11115_flat.patch
trac_11115_docfix.patch
trac12215_weak_cached_function.patch
trac12215_segfault_fixes.patch

removed the Py_INCREF(self.__domain) and Py_INCREF(self.__codomain) bandaid. Still no segfault.

Sure. As I stated in some post above, the segfault only results when applying both #12313 (hence, its dependency #715 as well) and the (old) patches from here.

If you only have the (old or new) patches from here or only have #715+#12313 then there is no segfault.

comment:65 follow-up: Changed 8 years ago by vbraun

I ran all doctests and there are a few crashes in functor.so elsewhere. I didn't have to apply any additional patches. It dies with

Action __cinit__ 84546128
Action __dealloc__ 84546128
Action __cinit__ 84546128
Action __dealloc__ 84546128
Action __cinit__ 84628736
Action __cinit__ 84546128
Action __dealloc__ 84546128
Action __dealloc__ 84546128
/home/vbraun/opt/sage-4.8.rc0/local/lib/libcsage.so(print_backtrace+0x31)[0x7fcc0db1adf6]
/home/vbraun/opt/sage-4.8.rc0/local/lib/libcsage.so(sigdie+0x14)[0x7fcc0db1ae28]
/home/vbraun/opt/sage-4.8.rc0/local/lib/libcsage.so(sage_signal_handler+0x20c)[0x7fcc0db1aa76]

It seems that it's just memory corruption that manifests itself by freeing the object twice. But the error is presumably elsewhere. Also the gdb stack trace is completely corrupted.

comment:66 Changed 8 years ago by vbraun

Here is the stack trace:

#0  0x00007ffaadb88511 in __pyx_tp_dealloc_4sage_10categories_7functor_Functor (o=0x63ed250) at sage/categories/functor.c:2845
#1  0x00007ffaad970cc8 in __pyx_tp_dealloc_4sage_10categories_6action_Action (o=0x63ed250) at sage/categories/action.c:5943
#2  0x00007ffaad5485a0 in __pyx_tp_dealloc_4sage_9structure_14coerce_actions_ModuleAction (o=0x63ed250) at sage/structure/coerce_actions.c:7505
#3  0x00007ffabbcf8f0c in type_call (type=<optimized out>, args=0x63e09e0, kwds=0x0) at Objects/typeobject.c:748
#4  0x00007ffabbca27a3 in PyObject_Call (func=0x7ffaad754ec0, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492
#5  0x00007ffaad53ffbb in __pyx_pf_4sage_9structure_14coerce_actions_1detect_element_action (__pyx_self=0x0, __pyx_args=0x63fbb40, __pyx_kwds=0x0)
    at sage/structure/coerce_actions.c:4616
#6  0x00007ffabbca27a3 in PyObject_Call (func=0x2683dd0, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492
#7  0x00007ffaaeb0ea32 in __pyx_f_4sage_9structure_6parent_6Parent_discover_action (__pyx_v_self=0x644ab00, __pyx_v_S=0x6448770, 
    __pyx_v_op=0x7ffab525aea8, __pyx_v_self_on_left=1) at sage/structure/parent.c:16618
#8  0x00007ffaaed48057 in __pyx_f_4sage_9structure_10parent_old_6Parent_get_action_c_impl (__pyx_v_self=0x644ab00, __pyx_v_S=0x6448770, 
    __pyx_v_op=0x7ffab525aea8, __pyx_v_self_on_left=1) at sage/structure/parent_old.c:3312
#9  0x00007ffaaed47ea2 in __pyx_pf_4sage_9structure_10parent_old_6Parent_4get_action_impl (__pyx_v_self=0x644ab00, __pyx_args=0x63fb910, 
    __pyx_kwds=0x0) at sage/structure/parent_old.c:3258
#10 0x00007ffabbca27a3 in PyObject_Call (func=0x636a5a8, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492
#11 0x00007ffaaed46ee7 in __pyx_f_4sage_9structure_10parent_old_6Parent_get_action_c (__pyx_v_self=0x644ab00, __pyx_v_S=0x6448770, 
    __pyx_v_op=0x7ffab525aea8, __pyx_v_self_on_left=1, __pyx_skip_dispatch=0) at sage/structure/parent_old.c:2935
#12 0x00007ffaaed4f19d in __pyx_f_4sage_9structure_10parent_old_6Parent__get_action_ (__pyx_v_self=0x644ab00, __pyx_v_other=0x6448770, 
    __pyx_v_op=0x7ffab525aea8, __pyx_v_self_on_left=1, __pyx_skip_dispatch=0) at sage/structure/parent_old.c:6228
#13 0x00007ffaaeb0b17c in __pyx_f_4sage_9structure_6parent_6Parent_get_action (__pyx_v_self=0x644ab00, __pyx_v_S=0x6448770, __pyx_skip_dispatch=0, 
    __pyx_optional_args=0x7fff38b8e2f0) at sage/structure/parent.c:15635
#14 0x00007ffaae1fa2e6 in __pyx_f_4sage_9structure_6coerce_24CoercionModel_cache_maps_discover_action (__pyx_v_self=0x26286d0, __pyx_v_R=0x644ab00, 
    __pyx_v_S=0x6448770, __pyx_v_op=0x7ffab525aea8, __pyx_skip_dispatch=0) at sage/structure/coerce.c:12473
#15 0x00007ffaae1f6564 in __pyx_f_4sage_9structure_6coerce_24CoercionModel_cache_maps_get_action (__pyx_v_self=0x26286d0, __pyx_v_R=0x644ab00, 
    __pyx_v_S=0x6448770, __pyx_v_op=0x7ffab525aea8, __pyx_skip_dispatch=0) at sage/structure/coerce.c:11424
#16 0x00007ffaae1e64e2 in __pyx_f_4sage_9structure_6coerce_24CoercionModel_cache_maps_bin_op (__pyx_v_self=0x26286d0, __pyx_v_x=0x6354b48, 
    __pyx_v_y=0x63e36b0, __pyx_v_op=0x7ffab525aea8, __pyx_skip_dispatch=0) at sage/structure/coerce.c:6583
#17 0x00007ffaae448f03 in __pyx_pf_4sage_9structure_7element_6Vector_1__mul__ (__pyx_v_left=0x6354b48, __pyx_v_right=0x63e36b0)
    at sage/structure/element.c:16130
#18 0x00007ffabbc9dc5f in binary_op1 (v=0x6354b48, w=0x63e36b0, op_slot=16) at Objects/abstract.c:917
#19 0x00007ffabbca0cc8 in PyNumber_Multiply (v=0x6354b48, w=0x63e36b0) at Objects/abstract.c:1188
#20 0x00007ffa9be33b68 in __pyx_f_4sage_5rings_13residue_field_12ReductionMap__call_ (__pyx_v_self=0x63f10e8, __pyx_v_x=0x6405108, 
    __pyx_skip_dispatch=0) at sage/rings/residue_field.c:8140

Within coercion_model.bin_op() (frame 16) there are calls to Python methods (PyObject_Call), and in there the garbage collector is free to run. I suspect that this is what is happening somewhere...

comment:67 in reply to: ↑ 65 Changed 8 years ago by SimonKing

Replying to vbraun:

I ran all doctests and there are a few crashes in functor.so elsewhere. I didn't have to apply any additional patches.

What exactly do you mean? Do you have the old patches from here applied (i.e., without the new __dealloc__ method), or does the segfault even occur with the new patches?

Is it normal that both you and me see segfaults, and it seems to be analogous problems (namely double deallocation), but we see it in different examples and with different patches (namely, even with the old patches from here, all tests pass for me)?

It dies with ... It seems that it's just memory corruption that manifests itself by freeing the object twice.

So, you can confirm that it is the same object.

But the error is presumably elsewhere. Also the gdb stack trace is completely corrupted.

That sounds like one should write a complete log of all python code executed - according to your suggestion that the error somewhere occurs during a Python method.

comment:68 Changed 8 years ago by SimonKing

Here is some more info on the segfault.

Setting: I have sage-5.0.prealpha0 plus #11780, #11290, #715, #11521, #12313 and the patches from here, removing the __dealloc__ method introduced by the last patch.

The segfault is triggered by doing

sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/libcsage.so(print_backtrace+0x31)[0x7fe047add9c6]
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/libcsage.so(sigdie+0x14)[0x7fe047add9f8]
/home/simon/SAGE/sage-5.0.prealpha0/local/lib/libcsage.so(sage_signal_handler+0x20c)[0x7fe047add646]
/lib64/libpthread.so.0(+0xfd00)[0x7fe04cd80d00]
...

When I reverse the order of the two lines, both computations succeed, but Sage crashes on exit:

sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
[q - 2*q^3 - 2*q^5 + 4*q^7 - q^9 + O(q^10)]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: quit
Exiting Sage (CPU time 0m2.02s, Wall time 0m20.49s).

**********************************************************************

Oops, Sage crashed. We do our best to make it stable, but...

A crash report was automatically generated with the following information:
  - A verbatim copy of the crash traceback.
  - A copy of your input history during this session.
  - Data on your current Sage configuration.

It was left in the file named:
        '/home/simon/.sage/ipython/Sage_crash_report.txt'
If you can email this file to the developers, the information in it will help
them in understanding and correcting the problem.

You can mail it to: sage-support at sage-support@googlegroups.com
with the subject 'Sage Crash Report'.

If you want to do it now, the following command will work (under Unix):
mail -s 'Sage Crash Report' sage-support@googlegroups.com < /home/simon/.sage/ipython/Sage_crash_report.txt

To ensure accurate tracking of this issue, please file a report about it at:
http://trac.sagemath.org/sage_trac

Press enter to exit:

I was tracing all python commands for the first variant of the segfault. The last few lines of the log are as follows:

sage.categories.pushout:__call__:2125         if self.p == other.p:
sage.categories.pushout:__call__:2126             from sage.all import Infinity
sage.categories.pushout:__call__:2127             if self.prec == other.prec:
sage.categories.pushout:__call__:2128                 extras = self.extras.copy()
sage.categories.pushout:__call__:3102     except CoercionException:
sage.categories.pushout:__call__:3104     except (TypeError, ValueError, AttributeError, NotImplementedError), ex:
sage.categories.pushout:__call__:3108         raise CoercionException(ex)
weakref:__call__:49             self = selfref()
weakref:__call__:50             if self is not None:
weakref:__call__:51                 del self.data[wr.key]
sage.rings.power_series_ring:__call__:556         s = "Power Series Ring in %s over %s"%(self.variable_name(), self.base_ring())
sage.rings.power_series_ring:__call__:557         if self.is_sparse():
sage.rings.power_series_ring:__call__:562         return self.__is_sparse
sage.rings.power_series_ring:__call__:559         return s

So, indeed it seems that the problem has something to do with weak references. There is an item of a weak value dictionary deleted right before segfaulting.
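A line-by-line log of this kind can be produced with sys.settrace. The sketch below is an assumption about how such a log might be written, not the tool actually used for the trace above; trace_lines and demo are hypothetical names, and the entry format only imitates "module:function:lineno source".

```python
import linecache
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and return (result, log), one log entry per executed line."""
    log = []

    def tracer(frame, event, arg):
        if event == "line":
            code = frame.f_code
            module = frame.f_globals.get("__name__", "?")
            # The source text may be empty if the file cannot be read.
            source = linecache.getline(code.co_filename, frame.f_lineno).rstrip()
            log.append("%s:%s:%d %s" % (module, code.co_name,
                                        frame.f_lineno, source))
        return tracer  # keep tracing inside the new frame

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(old)  # always restore the previous trace function
    return result, log


def demo(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, log = trace_lines(demo, 3)
```

Only Python-level frames show up in such a log; anything that happens inside compiled Cython or C code is invisible to it.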

To do: Find out what item of what dictionary is deleted, why it is deleted, and how deletion can be prevented.

comment:69 Changed 8 years ago by SimonKing

I was also tracing the deletion of items of weak value dictionaries: I was writing the key to a log file whenever an item was deleted.

Already when starting sage, we see that the same key (and presumably the same value as well) is deleted repeatedly:

...
((<class 'sage.categories.category.JoinCategory'>, (Category of semirings, Category of infinite enumerated sets)), ())
((<class 'sage.categories.groupoid.Groupoid'>, Integer Ring), ())
((<class 'sage.categories.groupoid.Groupoid'>, Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Complex Lazy Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Rational Field), ())

When issuing the first line of the crashing example and repeating it, we see something like

...
((<class 'sage.categories.groupoid.Groupoid'>, Complex Lazy Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Complex Lazy Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Cyclotomic Field of order 4 and degree 2), ())
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Rational Field, 0, 0, False), ())
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Rational Field, 10, 0, False), ())
((5, 0, 'prealpha0'), (Rational Field, 0, False, None))
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Rational Field, 0, 0, False), ())
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Rational Field, 10, 0, False), ())

And at crashing, one has

((<class 'sage.matrix.matrix_space.MatrixSpace'>, Ring of integers modulo 46337, 4, 10, False), ())
((<class 'sage.categories.vector_spaces.VectorSpaces'>, Ring of integers modulo 46337), ())
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Integer Ring, 4, 10, False), ())
((<class 'sage.matrix.matrix_space.MatrixSpace'>, Rational Field, 4, 10, False), ())
((<class 'sage.categories.groupoid.Groupoid'>, Power Series Ring in q over Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Power Series Ring in q over Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Power Series Ring in q over Rational Field), ())
((<class 'sage.categories.groupoid.Groupoid'>, Power Series Ring in q over Integer Ring), ())

Conclusion:

The occurring keys indicate that the deletions occur in UniqueRepresentation. While using weak references for UniqueRepresentation fixes memory leaks, it seems that far too often stuff is removed that would actually still be needed. Certainly it is bad for speed, and it seems that it is also responsible for the segmentation faults.

I am not sure how that problem should best be addressed.
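To see how the same key can be deleted again and again, here is a minimal sketch of a weak-value cache that records every key it drops. LoggingWeakCache and Parent are hypothetical; the real deletion happens inside the callback of weakref.WeakValueDictionary, which this toy class only imitates. The immediate deletion after del relies on CPython's reference counting.

```python
import gc
import weakref


class LoggingWeakCache(object):
    """Hypothetical weak-value cache that records each key it drops."""

    def __init__(self):
        self._refs = {}
        self.deleted_keys = []

    def __setitem__(self, key, value):
        def on_collect(ref, key=key):
            # Drop the entry only if it still points at this dead reference.
            if self._refs.get(key) is ref:
                del self._refs[key]
                self.deleted_keys.append(key)
        self._refs[key] = weakref.ref(value, on_collect)

    def get(self, key):
        ref = self._refs.get(key)
        return None if ref is None else ref()


class Parent(object):
    """Toy object standing in for a cached category or parent."""
    def __init__(self, name):
        self.name = name


cache = LoggingWeakCache()
key = ("Groupoid", "Rational Field")
for _ in range(3):
    p = Parent("Groupoid of Rational Field")
    cache[key] = p
    del p          # last strong reference gone: the callback fires
    gc.collect()

deleted = list(cache.deleted_keys)
```

Each pass through the loop creates the object, loses it, and logs the same key once more, which matches the repeated Groupoid entries in the log above.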

comment:70 Changed 8 years ago by SimonKing

  • Work issues changed from Fix two tests to Fix a coercion problem in sage.combinat.sf.sf

I think I have not properly stated that with the latest patches applied to sage-5.0.prealpha0, the segfault is gone. However, at least when I also have a couple of other tickets applied (#11780, #12290, #715, #11521, #12313, #12357, #12351, #7797), I get one coercion error in sage.combinat.sf.sf.

To be precise, I do not get that error when I only have all the other patches. So, it really seems to be caused by the patches from here. Trying to track it down...

comment:71 Changed 8 years ago by SimonKing

That's odd. The failing test is from the __call__ method in sage.combinat.sf.sf. When I execute things in the command line, I get the following:

sage: Sym = SymmetricFunctions(QQ[x])
sage: p = Sym.p(); s = Sym.s()
sage: P = p[1].parent()
sage: S = s[1].parent()
sage: P.coerce_map_from(S)
Generic morphism:
  From: Symmetric Function Algebra over Univariate Polynomial Ring in x over Rational Field, Schur symmetric functions as basis
  To:   Symmetric Function Algebra over Univariate Polynomial Ring in x over Rational Field, Power symmetric functions as basis
sage: S.coerce_map_from(P)
Generic morphism:
  From: Symmetric Function Algebra over Univariate Polynomial Ring in x over Rational Field, Power symmetric functions as basis
  To:   Symmetric Function Algebra over Univariate Polynomial Ring in x over Rational Field, Schur symmetric functions as basis

However, when the same is executed as a doctest, then there is no coercion map between S and P. Could it be that some other doctest is messing with the coercion maps, and my patch (perhaps in combination with #715 and #11521) reveals it?

comment:72 Changed 8 years ago by SimonKing

  • Dependencies changed from #11115 #11900 to #11115 #11900 #12645
  • Status changed from needs_work to needs_review
  • Work issues Fix a coercion problem in sage.combinat.sf.sf deleted

That's even odder. With #11780, #12290, #715, #11521, #12313, #12357, #12351, #7797 and #12645 (so, adding #12645, which only changes the rst markup in sage/combinat/sf/sf.py), all tests in sage/combinat pass.

Anyway. Since the second patch is in conflict with #12645 anyway, I am rebasing it. Since the doctest error has vanished, I put it back to "needs review", even though I wish I knew what was the reason for the temporary problem.

comment:73 Changed 8 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to coercion in symmetric function algebras

Bad. Meanwhile I work on top of sage-5.0.beta7. This time, it is the first patch that creates a coercion error in sage/combinat/sf/sf.py. Needs work.

comment:74 Changed 8 years ago by SimonKing

Even worse: After applying related tickets (#715, #11521, #12313, #12357) to sage-5.0.beta13, 16 out of 18 hunks fail to apply. So, I need to find out where the problem comes from.

comment:75 Changed 8 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 to #11115 #11900 #12645 #11599
  • Work issues changed from coercion in symmetric function algebras to Rebase wrt #11599. Coercion in symmetric function algebras

It comes from #11599, which fixes the same docstring misformattings that I fix in my patch as well...

comment:76 Changed 8 years ago by SimonKing

Arrgh. With #715, #11521, #12313, #11943, #11935, #12357 and #7797 on top of sage-5.0.beta13, all tests pass. But adding the (rebased) patch from here, I get failures in

        sage -t  -force_lib "devel/sage/sage/structure/coerce_dict.pyx"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/macdonald.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/llt.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/jack.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/kschur.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/hall_littlewood.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/sfa.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/multiplicative.py"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/schur.py"
        sage -t  -force_lib "devel/sage/sage/combinat/species/library.py"
        sage -t  -force_lib "devel/sage/sage/combinat/combinatorial_algebra.py"
        sage -t  -force_lib "devel/sage/sage/categories/homset.py"

That's not good.

comment:77 Changed 8 years ago by SimonKing

Oops, I had only the first of the two patches from here applied. Nevertheless, it doesn't look good.

comment:78 Changed 8 years ago by SimonKing

  • Work issues changed from Rebase wrt #11599. Coercion in symmetric function algebras to Coercion in symmetric function algebras

I have rebased the first patch relative to #11599.

With both patches, one "only" has errors in

        sage -t  -force_lib "devel/sage/sage/structure/coerce_dict.pyx"
        sage -t  -force_lib "devel/sage/sage/combinat/sf/sf.py"
        sage -t  -force_lib "devel/sage/sage/categories/homset.py"

So, it still needs work, but it is less bad than I thought...

Apply trac12215_weak_cached_function.patch trac12215_segfault_fixes.patch

comment:79 Changed 8 years ago by SimonKing

A bit more detail: The tests in coerce_dict.pyx and homset.py fail even if only the first patch is applied. But the tests in sf.py pass if only the first patch is applied.

comment:80 Changed 8 years ago by SimonKing

  • Work issues changed from Coercion in symmetric function algebras to Keep the fix from #12313. Coercion in symmetric function algebras

I tested whether the problem comes from the combination of this ticket with #12357. But it turns out that the following test

        sage: K = GF(1<<55,'t')
        sage: for i in range(50):
        ...     a = K.random_element()
        ...     E = EllipticCurve(j=a)
        ...     P = E.random_point()
        ...     Q = 2*P
        sage: import gc
        sage: n = gc.collect()
        sage: from sage.schemes.elliptic_curves.ell_finite_field import EllipticCurve_finite_field
        sage: LE = [x for x in gc.get_objects() if isinstance(x, EllipticCurve_finite_field)]
        sage: len(LE)    # indirect doctest
        1

still fails. The test has been introduced in #12313. And of course it is not acceptable that #12313 makes a memory leak disappear, but #12215 makes it show up again.
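The leak check in the doctest above boils down to: create many objects, drop all references, run the garbage collector, and count what survives. A minimal plain-Python sketch of that pattern follows; `Curve`, `count_live` and `strong_cache` are hypothetical stand-ins, not Sage code.

```python
import gc


class Curve(object):
    """Hypothetical stand-in for a parent such as an elliptic curve."""

    def __init__(self, j):
        self.j = j


def count_live(cls):
    """Run a full collection and count the tracked instances of cls."""
    gc.collect()
    return sum(1 for x in gc.get_objects() if type(x) is cls)


# A strong cache keeps every object alive, even if nobody else uses it.
strong_cache = {}
for j in range(50):
    strong_cache[j] = Curve(j)
leaked = count_live(Curve)

# Once the strong references are gone, the objects can be collected.
strong_cache.clear()
after_clear = count_live(Curve)
```

With a strong cache in the way, the count stays at the number of objects created; only after the cache lets go does it drop.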

comment:81 Changed 8 years ago by SimonKing

I think I located the problem. By some patch, I had introduced a weak dictionary in sage.structure.factory. But somehow I managed to remove the corresponding hunk from the patch. Now, I need to find out where that has happened...

comment:82 Changed 8 years ago by SimonKing

Aha! It turns out that I introduced the WeakValueDictionary in the first patch from here, but somehow I managed to delete it. Now the leak remains fixed, the patch is updated.
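To illustrate why the missing WeakValueDictionary hunk matters, here is a minimal sketch of a factory that caches its output weakly. `Parent` and `WeakFactory` are made-up names for illustration and do not mirror the actual code in sage.structure.factory.

```python
import gc
import weakref


class Parent(object):
    """Hypothetical object produced by the factory."""

    def __init__(self, key):
        self.key = key


class WeakFactory(object):
    """Return the same object for the same key while it is alive."""

    def __init__(self):
        # Values are held by weak reference only, so an entry vanishes
        # as soon as the last outside reference to the value is gone.
        self._cache = weakref.WeakValueDictionary()

    def __call__(self, key):
        obj = self._cache.get(key)
        if obj is None:
            obj = Parent(key)
            self._cache[key] = obj
        return obj


factory = WeakFactory()
a = factory(5)
same_object = factory(5) is a      # uniqueness while `a` is alive
size_while_alive = len(factory._cache)

del a
gc.collect()
size_after_del = len(factory._cache)
```

With an ordinary dict in place of the WeakValueDictionary, the last length would stay at 1, which is the kind of leak the doctest from #12313 detects.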

Apply trac12215_weak_cached_function.patch trac12215_segfault_fixes.patch

Last edited 8 years ago by SimonKing (previous) (diff)

comment:83 Changed 8 years ago by SimonKing

With the updated version of the first patch (applied on top of #715, #11521, #12313, #11943 and #11935), the tests in sage/structure/coerce_dict and sage/categories/homset pass.

There remains the problem with symmetric functions, but this is due to the second patch...

comment:84 Changed 8 years ago by SimonKing

  • Owner changed from rlm to (none)

What exactly is the problem?

It is

            sage: S = SymmetricFunctions(ZZ)
            sage: S.inject_shorthands()
            doctest:...: RuntimeWarning: redefining global value `e`
            doctest:...: RuntimeWarning: redefining global value `m`
            sage: s[1] + e[2] * p[1,1] + 2*h[3] + m[2,1]
            s[1] - 2*s[1, 1, 1] + s[1, 1, 1, 1] + s[2, 1] + 2*s[2, 1, 1] + s[2, 2] + 2*s[3] + s[3, 1]

The last line fails with an error when doctesting, but works fine when doing the same in an interactive session.

comment:85 Changed 8 years ago by SimonKing

The failure is really strange. If one does

sage: S = SymmetricFunctions(ZZ)
sage: S.inject_shorthands()
sage: e.has_coerce_map_from(m)

on the command line, then one gets the answer "True". Doing the same in a separate doctest, one still gets "True". But doing the same in line 384 of sage.combinat.sf.sf.py, one gets "False". So, there seems to be a nasty difficult-to-debug side effect, which apparently was introduced by the second patch.

comment:86 Changed 8 years ago by SimonKing

The error disappears if one does not override that __classcall__ method of symmetric function algebras. However, by the first ticket, it uses a weak cache, which results in many errors elsewhere...

But if I recall correctly, there has been a recent ticket dealing with coercion for symmetric functions. Perhaps a miracle occurs and the strongly cached custom __classcall__ can be cancelled (count the words that start with "c"...)?

comment:87 Changed 8 years ago by SimonKing

How unfortunate. If I remove the custom (strongly cached) __classcall__ of symmetric function algebras, I get

The following tests failed:


	sage -t  -force_lib "devel/sage/sage/combinat/sf/macdonald.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/llt.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/jack.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/kschur.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/hall_littlewood.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/classical.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/sfa.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/elementary.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/multiplicative.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/schur.py"
	sage -t  -force_lib "devel/sage/sage/combinat/sf/homogeneous.py"
	sage -t  -force_lib "devel/sage/sage/combinat/species/library.py"
	sage -t  -force_lib "devel/sage/sage/combinat/combinatorial_algebra.py"

But if one has a custom strong cache for symmetric function algebras, then one has the single failure in

	sage -t -force_lib "devel/sage/sage/combinat/sf/sf.py"

comment:88 Changed 8 years ago by SimonKing

Hooray! It turns out that one can fix the failing doctest in sf.py by not only providing a strongly cached sf.SymmetricFunctions.__classcall__, but by additionally providing a strongly cached sfa.SymmetricFunctionAlgebra_generic.__classcall__.

It is a bit unfortunate that a strong cache creeps back in, but apparently the assumption of strong caching is extensively used in sage.combinat.sf. Having weakly cached unique representation everywhere except in sage.combinat.sf is at least something...
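One way an assumption of strong caching can show up is sketched below. This is an illustration only, not a diagnosis of sage.combinat.sf; the decorators `weak_cached` and `strong_cached` and the class `Algebra` are hypothetical.

```python
import functools
import gc
import weakref


def weak_cached(func):
    """Cache results by weak reference (sketch, positional args only)."""
    cache = weakref.WeakValueDictionary()

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            value = func(*args)
            cache[args] = value
            return value
    return wrapper


def strong_cached(func):
    """Cache results by strong reference (sketch, positional args only)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper


class Algebra(object):
    """Hypothetical parent that accumulates state after construction."""

    def __init__(self, name):
        self.name = name
        self.registered = []


@weak_cached
def weak_algebra(name):
    return Algebra(name)


@strong_cached
def strong_algebra(name):
    return Algebra(name)


# State attached to a temporary object is lost with a weak cache,
# because the object is collected and later rebuilt from scratch.
weak_algebra("s").registered.append("p -> s")
gc.collect()
weak_state = list(weak_algebra("s").registered)

# With a strong cache the very same object comes back, state included.
strong_algebra("s").registered.append("p -> s")
gc.collect()
strong_state = list(strong_algebra("s").registered)
```

Code that registers something on a parent and then drops the reference only works if the cache keeps the parent alive.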

I am now running the full testsuite with the new modification.

comment:89 Changed 8 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues Keep the fix from #12313. Coercion in symmetric function algebras deleted

Now the second patch has been updated as well. As I have announced, the second patch is now not only introducing a strong custom cache for SymmetricFunctions, but also for the different representations, i.e., for SymmetricFunctionAlgebra_generic. It is certainly not ideal that symmetric functions need to be strongly cached, but I think that there may be a different ticket to resolve this special case.

Anyway, with sage-5.0.beta13 plus #715, #11521, #12313, #11943, #11935 and the two patches from here, all doctests pass. I am confident that they would also pass when one only has 5.0.beta13 plus the two patches from here.

Needs review, then!

comment:90 follow-up: Changed 8 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 #11599 to #11115 #11900 #12645 #11599 #12215
  • Status changed from needs_review to needs_work
  • Work issues set to rebase wrt #12808

There is a conflict with #12808, which has a positive review. Hence, we need to rebase it!

comment:91 in reply to: ↑ 90 Changed 8 years ago by nthiery

Replying to SimonKing:

There is a conflict with #12808, which has a positive review. Hence, we need to rebase it!

(Trivial) rebase done in the updated patch (at the end of the day, I needed it urgently, so I just took over the task).

comment:92 Changed 8 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 #11599 #12215 to #11115 #11900 #12645 #11599 #12808
  • Work issues rebase wrt #12808 deleted

Thank you, Nicolas! I planned to do the change today.

comment:93 Changed 8 years ago by SimonKing

  • Status changed from needs_work to needs_review

... and I guess it needs review, again?

comment:94 Changed 7 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 #11599 #12808 to #11115 #11900 #12645 #11599 #12808 #7980
  • Status changed from needs_review to needs_work
  • Work issues set to Rebase wrt #7980

Anne mentioned on combinat-devel that the patch needs to be rebased because of #7980.

comment:95 Changed 7 years ago by SimonKing

  • Description modified (diff)
  • Status changed from needs_work to needs_review

The patches are now rebased rel #7980. I have not been able to replace my first patch, because trac believes that it isn't my patch but Nicolas'...

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:96 Changed 7 years ago by SimonKing

  • Work issues Rebase wrt #7980 deleted

comment:97 Changed 7 years ago by SimonKing

I don't know how, but I managed to mess up the patch. Now it should work.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:98 Changed 7 years ago by SimonKing

When applying

trac12969_fix_coercion_cache.patch
trac12215_weak_cached_function-sk.patch
trac12215_segfault_fixes.patch
trac_715_combined.patch
trac_11521_homset_weakcache_combined.patch
trac_12313-mono_dict-combined-random-sk.patch

on top of sage-5.2.rc0, all tests on bsd.math pass.

comment:99 Changed 7 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Rebase rel sage-5.3.beta2

Alas. The second patch does not apply to sage-5.3.beta2. Needs work.

comment:100 Changed 7 years ago by SimonKing

Apparently it is a very severe conflict with sage/combinat/sf/sf.py. Bad, apparently that file has totally changed. I can't even recognise where the patch was supposed to be applied.

comment:101 Changed 7 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues Rebase rel sage-5.3.beta2 deleted

Perhaps the changes to sage/combinat/sf/sf.py are not needed? I have replaced the second patch by one that does not touch sf.py (in particular, it does not introduce a strong cache to symmetric function algebras). If that won't work, I can still put the necessary changes into a third patch...

For the moment:

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:102 Changed 7 years ago by SimonKing

Arrgh. The first patch needs two little changes in the documentation. There were two lines in a doc string that started with an indentation, which was a typo.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:103 follow-up: Changed 7 years ago by nbruin

Hi Simon,

I've read the patches to see if there is anything weird or questionable. It looks good for the most part. Just some things you might be able to comment on:

main patch:

sage/categories/category.py

You make some changes to _join, which has the note

It is used for getting a temporary speed-up at trac ticket #11900. But it is supposed to be replaced by a better solution at trac ticket #11943.

and #11943 was merged in 5.1! Is this function still used?

Otherwise: Looks good!

segfault_fixes:

sage/combinat/sf/sf.py

There is still a change to that file, namely, you add "from sage.misc.cachefunc import cached_function"

which is an odd thing to do in one patch by itself. Perhaps it's a change you failed to revert?

INCREF:

scary stuff ... I don't like black magic in code, but you seem to know what you're doing...

Conclusion:

I'd be close to giving a positive review. I haven't run test suites (yet), but you've done a lot of that and the bot should be doing that anyway. The changes themselves look fairly straightforward.

comment:104 in reply to: ↑ 103 Changed 7 years ago by SimonKing

Hi Nils,

Replying to nbruin:

main patch:

sage/categories/category.py

You make some changes to _join, which has the note

It is used for getting a temporary speed-up at trac ticket #11900. But it is supposed to be replaced by a better solution at trac ticket #11943.

and #11943 was merged in 5.1! Is this function still used?

Thank you for spotting it. The plan was to replace it, but apparently it wasn't done. So, I removed the note.

segfault_fixes:

sage/combinat/sf/sf.py

There is still a change to that file, namely, you add "from sage.misc.cachefunc import cached_function"

which is an odd thing to do in one patch by itself. Perhaps it's a change you failed to revert?

I have removed the import. I also removed the strongly cached classcall from sage/combinat/sf/sfa.py: it turns out that a strong cache for symmetric function algebras is not needed any more, after the recent overhaul.

INCREF:

scary stuff ...

I know. I am now testing whether it is still needed.

Anyway, it should now be ready for review. I think, to be on the safe side, we should use #11521 as a dependency, which meanwhile has a positive review (thank you!).

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:105 Changed 7 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 #11599 #12808 #7980 to #11115 #11900 #12645 #11599 #12808 #7980 #11521

comment:106 Changed 7 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Do not INCREF when deallocating

Great! The scary INCREF apparently is not needed!! All tests in sage/schemes and sage/sf pass. The following two examples previously resulted in a segfault, but now they don't:

sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
[q - 2*q^3 - 2*q^5 + 4*q^7 - q^9 + O(q^10)]
sage: quit
Exiting Sage (CPU time 0m0.36s, Wall time 0m32.82s).
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 5, 10)
[q - 2*q^3 - 2*q^5 + 4*q^7 - q^9 + O(q^10)]
sage: half_integral_weight_modform_basis(DirichletGroup(16,QQ).1, 3, 10)
[]
sage: quit
Exiting Sage (CPU time 0m0.29s, Wall time 0m4.68s).

So, I will (a bit later today) update the segfault patch, fortunately removing the odd INCREF.

comment:107 Changed 7 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues Do not INCREF when deallocating deleted

The INCREF is gone!

Note that in the commit message of the old patch version, I mention a segfault that occurs in combination with #12313. But I think - if the segfault still occurs - it should be dealt with there.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:108 follow-up: Changed 7 years ago by nbruin

Are you sure you produced these patches against a clean 5.3beta1? Your attachment:trac12215_segfault_fixes.patch still has a (trivial) hunk for sage/combinat/sf/sfa.py which fails to apply for me. If that were the only problem you could just remove that hunk, but I'm also getting quite some doctest failures along the lines of comment:42 (but the pari test is fine now!)

I confirm that I'm not seeing any of the problems that necessitated the INCREF hack, so you're good for that. I'm doubtful that the combinatorics stuff is fixed, though ...

comment:109 in reply to: ↑ 108 Changed 7 years ago by SimonKing

Replying to nbruin:

Are you sure you produced these patches against a clean 5.3beta1?

I am sure that I produced these patches against 5.3.beta2 (not beta1). I don't know if Jeroen did some last minute changes to beta2; I downloaded it two days ago.

comment:110 Changed 7 years ago by SimonKing

  • Dependencies changed from #11115 #11900 #12645 #11599 #12808 #7980 #11521 to #11115 #11900 #12645 #11599 #12808 #7980 #11521 #5457

I guess #5457 needs to be added as a dependency. It rewrites the symmetric function code, and was merged in beta2.

comment:111 follow-up: Changed 7 years ago by SimonKing

The patchbot reports:

sage -t  -force_lib devel/sage-12215/sage/structure/coerce.pyx
**********************************************************************
File "/opt/patchbot-5.3.beta2/devel/sage-12215/sage/structure/coerce.pyx", line 179:
    sage: cm.get_stats()
Expected:
    ((0, 1.0, 4), (0, 0.25, 1))
Got:
    ((0, 1.0, 4), (0, 0.0, 0))

Since I do not get that error, it seems that the distribution of data into the buckets is machine dependent. Hence, I suggest to mark that test as random.

comment:112 in reply to: ↑ 111 Changed 7 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Fix one doc test

Replying to SimonKing:

Since I do not get that error,

Wrong! I do get the same error. Hence, needs work.

comment:113 Changed 7 years ago by SimonKing

  • Status changed from needs_work to needs_review
  • Work issues Fix one doc test deleted

I fixed the doc test (by modifying the second patch).

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:114 Changed 7 years ago by nbruin

Good! indeed, with the new dependency things apply cleanly.

sage/structure/coerce.pyx

Agreed. Things get placed in buckets based on address. I'd expect that to depend on not just the machine but even on the run!
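A small sketch of why address-based bucketing makes statistics like cm.get_stats() unstable. It assumes CPython, where id() is the memory address, and uses a made-up `bucket_index`; the real hashing in the coercion dictionaries may differ in detail.

```python
class Key(object):
    """Hypothetical key object, compared and hashed by identity."""
    pass


def bucket_index(obj, n_buckets):
    # In CPython, id() is the object's address, so the bucket depends on
    # where the allocator happened to place the object.
    return id(obj) % n_buckets


n_buckets = 7
keys = [Key() for _ in range(4)]
counts = [0] * n_buckets
for k in keys:
    counts[bucket_index(k, n_buckets)] += 1

# The total is fixed, but how the four keys spread over the buckets
# can change from run to run.
total = sum(counts)
```

Only quantities such as the total are stable enough to be asserted in a test; the per-bucket distribution is not.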

sage/combinat/integer_vectors_mod_permgroup.py

That one times out for me too. With --verbose I get that all 271 tests pass but then it hangs on

A workspace appears to have been corrupted... automatically rebuilding (this is harmless).

which explains a timeout. That might just have been my setup, but since one of the bots experienced a similar problem, perhaps this needs attention.

I've seen some startup time failures on sage.math, but the machine is rather loaded. I don't think any of the #5457, #12215 patches are individually responsible for slowdowns.

So at this point, it's just technical bits -- nothing intrinsic about the code proposed here.

comment:115 Changed 7 years ago by SimonKing

When I last changed the second patch, I modified the expected result of some cm.get_stats() test. However, I forgot to mark the result "random". You seem to agree that it is indeed random (depending on memory addresses). Hence, I just changed the second patch accordingly.

Apply trac12215_weak_cached_function-sk.patch trac12215_segfault_fixes.patch

comment:116 Changed 7 years ago by nbruin

Bot's only complaint:

sage -t  -force_lib devel/sage-12215/sage/sat/boolean_polynomials.py # Killed/crashed

I cannot confirm this error on a clean 5.3b2 patched up to #12313. It pretty much has to be some assert failing in polybori, because I think that's the only component called from there.

comment:117 Changed 7 years ago by SimonKing

The log says:

sage -t  -force_lib devel/sage-12215/sage/sat/boolean_polynomials.py
The doctested process was killed by signal 6

Has this anything to do with my patch? What is signal 6?
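On the signal number: on Linux and macOS, signal 6 is SIGABRT, the signal raised by abort(), which would fit the guess in comment:116 that a failing assert is involved. A quick check from Python, assuming a POSIX platform (the numeric value differs on Windows):

```python
import signal

# On Linux and macOS, SIGABRT has the numeric value 6.
sigabrt_number = int(signal.SIGABRT)

# A child process killed by signal N is reported by the subprocess
# module with returncode -N, so death by SIGABRT shows up as -6.
expected_returncode = -sigabrt_number
```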

comment:118 Changed 7 years ago by nbruin

  • Reviewers set to Nils Bruin
  • Status changed from needs_review to positive_review

comment:119 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.3 to sage-5.4

comment:120 Changed 7 years ago by jdemeyer

  • Dependencies changed from #11115 #11900 #12645 #11599 #12808 #7980 #11521 #5457 to #11521

comment:121 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.4 to sage-pending

comment:122 Changed 7 years ago by SimonKing

  • Dependencies changed from #11521 to #13447

There were problems with #715+#11521 on OS X. It is solved by #13447. Hence, moving the dependency...

comment:123 Changed 7 years ago by jdemeyer

Are the dependencies of this ticket correct, does it really depend on #13447?

comment:124 Changed 7 years ago by nbruin

  • Dependencies changed from #13447 to #11521

No patches were changed, only the dependencies. One of the patches on #715 ensures that the problem that #13447 solves doesn't occur, so this ticket should be safe while only depending on #715, #11521.

comment:125 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-pending to sage-5.5

comment:126 Changed 7 years ago by jdemeyer

When applying this on top of sage-5.5.beta0 (not released), I get doctest failures in sage/crypto/mq. I'll investigate further and report back.

comment:127 follow-up: Changed 7 years ago by nbruin

I [edit:] cannot quite confirm that

sage -t  "devel/sage-main/sage/crypto/mq/mpolynomialsystem.py"

on sage-5.5.beta0+#12215 on sage.math does not succeed. In fact, when I run it, I usually get "Connection to sage.math.washington.edu closed by remote host.". A screen session doesn't help either, since that gets killed too.

It seems to hang somewhere around line 971 (that's the last I see from sage -t --verbose), which is in the doctesting of "coefficient_matrix". Running that doctest by itself doesn't cause any problems.

Oops, I had one run now where my connection wasn't closed! I got a " TIMED OUT! PROCESS KILLED!" this time.

And in fact, the "connection closed" thing seems to happen quite a bit, so I don't think I have confirmation that there's really a bug. It may be that sage.math is just flaky.

Last edited 7 years ago by nbruin (previous) (diff)

comment:128 in reply to: ↑ 127 ; follow-up: Changed 7 years ago by jdemeyer

Replying to nbruin:

And in fact, the "connection closed" thing seems to happen quite a bit, so I don't think I have confirmation that there's really a bug. It may be that sage.math is just flaky.

Maybe your ssh program is flaky?

comment:129 Changed 7 years ago by jdemeyer

Rebased patch because it applied with fuzz.

comment:130 Changed 7 years ago by jdemeyer

Irrelevant remark: you might replace trac ticket 12215 in documentation by :trac:`12215`.

comment:131 in reply to: ↑ 128 ; follow-up: Changed 7 years ago by nbruin

OK, the "connection lost" problem was resolved by rm -rf ~/.sage. I don't know what was screwed up, but something there was making *any* sage session very prone to terminating the whole connection.

It seems that line 971 in mpolynomialsystem.py is indeed a problematic doctest. It seems to hang. When I run the doctest in gdb I can interrupt and get a traceback (first bit):

#0  0x00007fb5259748aa in PyObject_Free (p=0x57d0300) at Objects/obmalloc.c:969
#1  0x00007fb525985dcc in PyTuple_ClearFreeList () at Objects/tupleobject.c:916
#2  0x00007fb525a0d5cb in collect (generation=2) at Modules/gcmodule.c:792
#3  0x00007fb525a0d87e in _PyObject_GC_Malloc (basicsize=<value optimized out>) at Modules/gcmodule.c:996
#4  0x00007fb525a0d92d in _PyObject_GC_New (tp=0x7fb525c801a0) at Modules/gcmodule.c:1467
#5  0x00007fb52595e64c in PyList_New (size=0) at Objects/listobject.c:142
#6  0x00007fb51c6c47e5 in __pyx_pw_4sage_9structure_11coerce_dict_10TripleDict_1__init__ (__pyx_v_self=0xf7abc50, 
    __pyx_args=<value optimized out>, __pyx_kwds=<value optimized out>) at sage/structure/coerce_dict.c:1440
#7  0x00007fb5259885a8 in type_call (type=<value optimized out>, args=0xf714d0, kwds=0x0) at Objects/typeobject.c:735
#8  0x00007fb52592c308 in PyObject_Call (func=0x7fb51c8d1620, arg=0xf714d0, kw=0x0) at Objects/abstract.c:2529
#9  0x00007fb51cf2d550 in __pyx_f_4sage_9structure_6parent_6Parent_init_coerce (__pyx_v_self=0x4acacf0, 
    __pyx_optional_args=<value optimized out>) at sage/structure/parent.c:5757
#10 0x00007fb51d17176b in __pyx_f_4sage_9structure_10parent_old_6Parent_init_coerce (__pyx_v_self=0x57d0300, 
    __pyx_optional_args=<value optimized out>) at sage/structure/parent_old.c:1638

so it seems TripleDict is implicated. I've tried it a couple of times and the tracebacks are not completely identical all the time, but the collect (generation=2) always seems to be there. So either the system gets stuck in a loop in which it is spending a large percentage of the time in collect or somehow collect itself gets caught in an infinite loop. I guess instrumenting TripleDict to see what it's chewing on is most likely to resolve this one (anything below seems python library).

comment:132 Changed 7 years ago by nbruin

Of course this is another heisenbug: If I take the doctesting python file that gets produced, ~/.sage/tmp/mpolynomialsystem_*.py and run that through python via

 $ ./sage -sh
 $ python ~/.sage/tmp/mpolynomialsystem_*.py

the doctest succeeds, which is weird, because that is exactly what sage-doctest is supposed to run too.

comment:133 Changed 7 years ago by nbruin

FWIW, with #12313 doctests pass, so perhaps we should just merge them together?

comment:134 Changed 7 years ago by jdemeyer

  • Dependencies changed from #11521 to #11521, merge with #12313

comment:135 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.5 to sage-5.6

comment:136 follow-up: Changed 7 years ago by SimonKing

  • Dependencies changed from #11521, merge with #12313 to #11521, #13741, merge with #12313
  • Status changed from positive_review to needs_work
  • Work issues set to rebase rel #13741

This had a positive review, however I think one can not realistically expect it will soon go in: It depends on other tickets, that have a tendency to uncover nasty problems.

Hence, I suggest to cut it into smaller pieces - one of them being the Pari deallocation, that is now the new dependency #13741. The second patch thus needs to be rebased.

comment:137 in reply to: ↑ 136 Changed 7 years ago by nthiery

  • Status changed from needs_work to needs_review
  • Work issues rebase rel #13741 deleted

Replying to SimonKing:

Hence, I suggest to cut it into smaller pieces - one of them being the Pari deallocation, that is now the new dependency #13741. The second patch thus needs to be rebased.

I just did that rebase for the sage-combinat queue (by removing the two relevant hunks), uploaded the updated patch here and tentatively set this ticket back to needs review.

comment:138 Changed 7 years ago by jpflori

  • Cc jpflori added

Changed 7 years ago by SimonKing

Implement a weak version of cached_function, and use it for UniqueRepresentation. Properly use WeakValueDictionary in UniqueFactory. Combined patch

comment:139 Changed 7 years ago by SimonKing

  • Description modified (diff)

I have combined the two patches into one. I haven't tested it yet, but will do in Sage's debug version.

Apply trac12215_weak_cached_function_combined.patch

comment:140 Changed 7 years ago by SimonKing

FWIW: In sage-5.6.rc0 built with SAGE_DEBUG=yes (see #13864) plus #12215 plus #12313, all doctests pass on my x86_64 openSuse 12.1 laptop with MALLOC_CHECK_=3.

comment:141 in reply to: ↑ 131 Changed 7 years ago by nbruin

  • Status changed from needs_review to positive_review

Replying to nbruin:

It seems that line 971 in mpolynomialsystem.py is indeed a problematic doctest. It seems to hang.

That behaviour is entirely consistent with a double free (and hence a circular freelist) that we solved in #13896. So, back to positive review from me!
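The link between a double free and a hang can be shown with a toy free list. This is a simplified model with hypothetical `Block` and `FreeList` classes, not CPython's actual allocator: freeing the same block twice makes it point to itself, so any code walking the list never reaches the end.

```python
class Block(object):
    """Hypothetical memory block with an intrusive free-list link."""

    def __init__(self, name):
        self.name = name
        self.next_free = None


class FreeList(object):
    """Singly linked list of free blocks; free() pushes at the head."""

    def __init__(self):
        self.head = None

    def free(self, block):
        block.next_free = self.head
        self.head = block

    def walk(self, limit):
        """Follow the list, but give up after `limit` steps."""
        names = []
        node = self.head
        while node is not None and len(names) < limit:
            names.append(node.name)
            node = node.next_free
        return names


fl = FreeList()
a, b = Block("a"), Block("b")
fl.free(a)
fl.free(b)
healthy = fl.walk(10)

# Double free: b is already the head, so b.next_free becomes b itself.
fl.free(b)
corrupted = fl.walk(10)
is_cycle = b.next_free is b
```

Without the step limit, the second walk would spin forever, which matches a process that appears to hang while clearing a free list.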

comment:142 Changed 7 years ago by jdemeyer

  • Dependencies changed from #11521, #13741, merge with #12313 to merge with #12313 and #13378
  • Milestone changed from sage-5.6 to sage-5.7

comment:143 Changed 7 years ago by jdemeyer

  • Dependencies changed from merge with #12313 and #13378 to merge with #12313
  • Milestone changed from sage-5.7 to sage-pending

comment:144 follow-up: Changed 7 years ago by nbruin

  • Dependencies merge with #12313 deleted

We only introduced a codependence on #12313 because of comment 133. In view of comment 141 I suspect the source of the problem noted in comment 131 was actually solved, rather than hidden by merging tickets together.

Hence, I propose that this ticket is considered without dependencies and be considered for merging in sage-5.7 anyway.

comment:145 in reply to: ↑ 144 ; follow-up: Changed 7 years ago by jdemeyer

  • Milestone changed from sage-pending to sage-5.7

Replying to nbruin:

We only introduced a codependence on #12313 because of comment 133. In view of comment 141 I suspect the source of the problem noted in comment 131 was actually solved, rather than hidden by merging tickets together.

Hence, I propose that this ticket is considered without dependencies and be considered for merging in sage-5.7 anyway.

Just to have a second opinion: Simon, do you agree?

comment:146 in reply to: ↑ 145 Changed 7 years ago by SimonKing

Replying to jdemeyer:

Replying to nbruin:

We only introduced a codependence on #12313 because of comment 133. In view of comment 141 I suspect the source of the problem noted in comment 131 was actually solved, rather than hidden by merging tickets together.

Hence, I propose that this ticket is considered without dependencies and be considered for merging in sage-5.7 anyway.

Just to have a second opinion: Simon, do you agree?

Yes. Note also comment:52: It used to be the case that both #12215 and #12313 were fine separately, but problems occurred when they were used together. But in later patch versions or Sage versions, it was observed that having them together makes the Heisenbug magically disappear - and the suggestion to merge them together was based on this observation.

Now it very much seems that the Cython upgrade was enough to fix the crashes. We should of course verify that no problems occur when #12215 is merged without #12313, but I think there is no reason to merge #12215 and #12313 together.

comment:147 follow-up: Changed 7 years ago by jdemeyer

  • Status changed from positive_review to needs_work

With #12215+#13378 but without #12313:

sage -t  -force_lib devel/sage/sage/schemes/elliptic_curves/heegner.py
------------------------------------------------------------------------
/release/merger/sage-5.7.beta0/local/lib/libcsage.so(print_backtrace+0x2b)[0x2b9501a4093d]
/release/merger/sage-5.7.beta0/local/lib/libcsage.so(sigdie+0x34)[0x2b9501a40ae4]
/release/merger/sage-5.7.beta0/local/lib/libcsage.so(sage_signal_handler+0x15b)[0x2b9501a40317]
/lib/libpthread.so.0[0x2b94ffa697d0]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/coerce_dict.so[0x2b9508ca14b6]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/coerce_dict.so[0x2b9508ca1ee6]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_CallFunctionObjArgs+0x161)[0x2b94ff6be631]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_ClearWeakRefs+0x44a)[0x2b94ff728b4a]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/category_object.so[0x2b9508656a38]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff719e01]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff716681]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff716681]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6fd66b]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff719e5c]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/categories/functor.so[0x2b95099ab54e]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6ee447]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6ee447]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/coerce_dict.so[0x2b9508c9a967]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff79d13d]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(_PyObject_GC_Malloc+0xee)[0x2b94ff79d89e]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(_PyObject_GC_New+0xd)[0x2b94ff79d94d]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyDict_New+0xcd)[0x2b94ff6fcffd]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/libs/pari/gen.so[0x2b950b87c3e9]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/libs/pari/gen.so[0x2b950b8df1da]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/libs/pari/gen.so[0x2b950b870eaa]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6ecb3e]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6f156d]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6f1b0b]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff7185a8]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/libs/pari/gen.so[0x2b950b873c3e]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/rings/polynomial/polynomial_rational_flint.so[0x2b9514ecad42]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/rings/polynomial/polynomial_rational_flint.so[0x2b9514ecf271]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff7185a8]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x37bf)[0x2b94ff76103f]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/coerce_maps.so[0x2b950fe80b58]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/parent.so[0x2b9508434031]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/rings/number_field/number_field_element.so[0x2b95183388a5]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff7177dc]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x56)[0x2b94ff75cb26]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6d8b53]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/rings/number_field/number_field_element_quadratic.so[0x2b95185aa867]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff7185a8]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1299)[0x2b94ff75eb19]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/coerce_maps.so[0x2b950fe80b58]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/parent.so[0x2b9508434031]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x56)[0x2b94ff75cb26]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff75a238]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5de2)[0x2b94ff763662]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/rings/residue_field.so[0x2b951cd4a7b5]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/factory.so[0x2b951431ab1d]
/release/merger/sage-5.7.beta0/local/lib/python/site-packages/sage/structure/factory.so[0x2b9514317940]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1299)[0x2b94ff75eb19]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x69c5)[0x2b94ff764245]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x2b94ff765472]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x714e)[0x2b94ff7649ce]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1299)[0x2b94ff75eb19]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1299)[0x2b94ff75eb19]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6e99b9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0[0x2b94ff6cc8bf]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyObject_Call+0x68)[0x2b94ff6bc308]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1299)[0x2b94ff75eb19]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5ae4)[0x2b94ff763364]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x852)[0x2b94ff765352]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x2b94ff765472]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0xc1)[0x2b94ff7891f1]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0x1f9)[0x2b94ff7894c9]
/release/merger/sage-5.7.beta0/local/lib/libpython2.7.so.1.0(Py_Main+0xb15)[0x2b94ff79c115]
/lib/libc.so.6(__libc_start_main+0xf4)[0x2b950031e1f4]
python[0x400679]

comment:148 in reply to: ↑ 147 Changed 7 years ago by SimonKing

Replying to jdemeyer:

With #12215+#13378 but without #12313:

sage -t  -force_lib devel/sage/sage/schemes/elliptic_curves/heegner.py

Ouch. Well, I hope I can reproduce it in Sage-5.6.rc0 debug version.

comment:149 Changed 7 years ago by SimonKing

Fortunately I can confirm it (at least with MALLOC_CHECK_=3). I'm running it now under gdb.

comment:150 Changed 7 years ago by SimonKing

What I get is:

Program received signal SIGSEGV, Segmentation fault.
0x00007fffedeb9f4c in __pyx_pf_4sage_9structure_11coerce_dict_16TripleDictEraser_2__call__ (__pyx_v_self=0x73982d0, __pyx_v_r=0x7fffea2aaf00) at sage/structure/coerce_dict.c:1107
1107      __pyx_t_10 = PyList_GET_ITEM(__pyx_t_1, (__pyx_v_h % PyList_GET_SIZE(__pyx_t_4)));
(gdb) bt
#0  0x00007fffedeb9f4c in __pyx_pf_4sage_9structure_11coerce_dict_16TripleDictEraser_2__call__ (__pyx_v_self=0x73982d0, __pyx_v_r=0x7fffea2aaf00) at sage/structure/coerce_dict.c:1107
#1  0x00007fffedeb9592 in __pyx_pw_4sage_9structure_11coerce_dict_16TripleDictEraser_3__call__ (__pyx_v_self=0x73982d0, __pyx_args=0x75bfd10, __pyx_kwds=0x0) at sage/structure/coerce_dict.c:966
#2  0x00007ffff79be33e in PyObject_Call (func=0x73982d0, arg=0x75bfd10, kw=0x0) at Objects/abstract.c:2529
#3  0x00007ffff79bf059 in PyObject_CallFunctionObjArgs (callable=0x73982d0) at Objects/abstract.c:2760
#4  0x00007ffff7a64194 in handle_callback (ref=0x7fffea2aaf00, callback=0x73982d0) at Objects/weakrefobject.c:881
#5  0x00007ffff7a645e9 in PyObject_ClearWeakRefs (object=0x90c07b0) at Objects/weakrefobject.c:965
#6  0x00007fffee53af5b in __pyx_tp_dealloc_4sage_9structure_15category_object_CategoryObject (o=0x90c07b0) at sage/structure/category_object.c:8990
#7  0x00007fffee7e6fe0 in __pyx_tp_dealloc_4sage_9structure_6parent_Parent (o=0x90c07b0) at sage/structure/parent.c:21519
#8  0x00007fffeea2aa7f in __pyx_tp_dealloc_4sage_9structure_10parent_old_Parent (o=0x90c07b0) at sage/structure/parent_old.c:7261
#9  0x00007fffeec3bca8 in __pyx_tp_dealloc_4sage_9structure_11parent_base_ParentWithBase (o=0x90c07b0) at sage/structure/parent_base.c:1876
#10 0x00007fffead58cbc in __pyx_tp_dealloc_4sage_9structure_11parent_gens_ParentWithGens (o=0x90c07b0) at sage/structure/parent_gens.c:5865
#11 0x00007ffff7a4ce4c in subtype_dealloc (self=0x90c07b0) at Objects/typeobject.c:1014
#12 0x00007ffff7a27be4 in _Py_Dealloc (op=0x90c07b0) at Objects/object.c:2243
#13 0x00007ffff7a480b0 in tupledealloc (op=0x845ab50) at Objects/tupleobject.c:220
#14 0x00007ffff7a27be4 in _Py_Dealloc (op=0x845ab50) at Objects/object.c:2243
#15 0x00007ffff7a480b0 in tupledealloc (op=0x7fffea2a0760) at Objects/tupleobject.c:220
#16 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7fffea2a0760) at Objects/object.c:2243
#17 0x00007ffff7a18e8e in dict_dealloc (mp=0x9052b70) at Objects/dictobject.c:985
#18 0x00007ffff7a27be4 in _Py_Dealloc (op=0x9052b70) at Objects/object.c:2243
#19 0x00007ffff7a4cd73 in subtype_dealloc (self=0x85045a0) at Objects/typeobject.c:999
#20 0x00007ffff7a27be4 in _Py_Dealloc (op=0x85045a0) at Objects/object.c:2243
#21 0x00007fffed102313 in __pyx_tp_dealloc_4sage_10categories_7functor_Functor (o=0x7985950) at sage/categories/functor.c:3209
#22 0x00007fffecee64a1 in __pyx_tp_dealloc_4sage_10categories_6action_Action (o=0x7985950) at sage/categories/action.c:6461
#23 0x00007fffbcd614ea in __pyx_tp_dealloc_4sage_6matrix_6action_MatrixMulAction (o=0x7985950) at sage/matrix/action.c:4724
#24 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7985950) at Objects/object.c:2243
#25 0x00007ffff7a02c8e in list_dealloc (op=0x759fa38) at Objects/listobject.c:309
#26 0x00007ffff7a27be4 in _Py_Dealloc (op=0x759fa38) at Objects/object.c:2243
#27 0x00007ffff7a02c8e in list_dealloc (op=0x7964858) at Objects/listobject.c:309
#28 0x00007ffff7a27be4 in _Py_Dealloc (op=0x7964858) at Objects/object.c:2243
#29 0x00007fffedecec13 in __pyx_tp_clear_4sage_9structure_11coerce_dict_TripleDict (o=0x3cbd8d0) at sage/structure/coerce_dict.c:5921
#30 0x00007ffff7b1378b in delete_garbage (collectable=0x7fffffff30f0, old=0x7ffff7dc1540 <generations+96>) at Modules/gcmodule.c:769
#31 0x00007ffff7b13d04 in collect (generation=2) at Modules/gcmodule.c:930
#32 0x00007ffff7b13f06 in collect_generations () at Modules/gcmodule.c:996
#33 0x00007ffff7b14bcc in _PyObject_GC_Malloc (basicsize=264) at Modules/gcmodule.c:1457
#34 0x00007ffff7b14c04 in _PyObject_GC_New (tp=0x7ffff7d9c5a0 <PyDict_Type>) at Modules/gcmodule.c:1467
#35 0x00007ffff7a16cc7 in PyDict_New () at Objects/dictobject.c:277
#36 0x00007fffeaa83860 in __pyx_f_4sage_4libs_4pari_3gen_12PariInstance_new_ref (__pyx_v_self=0xcceae0, __pyx_v_g=0xa968360, __pyx_v_parent=0xa8d8748) at sage/libs/pari/gen.c:49228
#37 0x00007fffea9fb417 in __pyx_pf_4sage_4libs_4pari_3gen_3gen_80__getitem__ (__pyx_v_self=0xa8d8748, __pyx_v_n=0x61f7f0) at sage/libs/pari/gen.c:8638
#38 0x00007fffea9f63b7 in __pyx_pw_4sage_4libs_4pari_3gen_3gen_81__getitem__ (__pyx_v_self=0xa8d8748, __pyx_v_n=0x61f7f0) at sage/libs/pari/gen.c:7643
#39 0x00007fffeaa9a688 in __pyx_sq_item_4sage_4libs_4pari_3gen_gen (o=0xa8d8748, i=1) at sage/libs/pari/gen.c:55757
#40 0x00007ffff79bcdd7 in PySequence_GetItem (s=0xa8d8748, i=1) at Objects/abstract.c:1989
#41 0x00007ffff7a01934 in iter_iternext (iterator=0xa99c300) at Objects/iterobject.c:58
#42 0x00007ffff7a04abe in listextend (self=0xa7db060, b=0xa8d8748) at Objects/listobject.c:872
#43 0x00007ffff7a08ad9 in list_init (self=0xa7db060, args=0xa7a3920, kw=0x0) at Objects/listobject.c:2458
#44 0x00007ffff7a4c1ad in type_call (type=0x7ffff7d9a3c0 <PyList_Type>, args=0xa7a3920, kwds=0x0) at Objects/typeobject.c:737
#45 0x00007ffff79be33e in PyObject_Call (func=0x7ffff7d9a3c0 <PyList_Type>, arg=0xa7a3920, kw=0x0) at Objects/abstract.c:2529
#46 0x00007fffea9eb6f1 in __pyx_pf_4sage_4libs_4pari_3gen_3gen_12list (__pyx_v_self=0xa8d87d0) at sage/libs/pari/gen.c:4507
#47 0x00007fffea9eb4a0 in __pyx_pw_4sage_4libs_4pari_3gen_3gen_13list (__pyx_v_self=0xa8d87d0, unused=0x0) at sage/libs/pari/gen.c:4455
#48 0x00007ffff7a21156 in PyCFunction_Call (func=0xa85b858, arg=0x7ffff7f90060, kw=0x0) at Objects/methodobject.c:90
#49 0x00007ffff79be33e in PyObject_Call (func=0xa85b858, arg=0x7ffff7f90060, kw=0x0) at Objects/abstract.c:2529
#50 0x00007fffe14ff0aa in __pyx_pf_4sage_5rings_10polynomial_25polynomial_rational_flint_25Polynomial_rational_flint_6__init__ (__pyx_v_self=0xa781258, __pyx_v_parent=0x135f730, __pyx_v_x=0xa8d87d0, 
    __pyx_v_check=0x7ffff7d89ec0 <_Py_TrueStruct>, __pyx_v_is_gen=0x7ffff7d89e80 <_Py_ZeroStruct>, __pyx_v_construct=0x7ffff7d89e80 <_Py_ZeroStruct>)
    at sage/rings/polynomial/polynomial_rational_flint.cpp:5760
#51 0x00007fffe14fc966 in __pyx_pw_4sage_5rings_10polynomial_25polynomial_rational_flint_25Polynomial_rational_flint_7__init__ (__pyx_v_self=0xa781258, __pyx_args=0xa6f97d0, __pyx_kwds=0xa83c050)
    at sage/rings/polynomial/polynomial_rational_flint.cpp:5165
#52 0x00007ffff7a4c1ad in type_call (type=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, args=0xa6f97d0, kwds=0xa83c050)
    at Objects/typeobject.c:737
#53 0x00007ffff79be33e in PyObject_Call (func=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, arg=0xa6f97d0, kw=0xa83c050)
    at Objects/abstract.c:2529
#54 0x00007ffff7ac8282 in ext_do_call (func=0x7fffe174bc20 <__pyx_type_4sage_5rings_10polynomial_25polynomial_rational_flint_Polynomial_rational_flint>, pp_stack=0x7fffffff3a98, flags=2, na=4, nk=1)
    at Python/ceval.c:4334
#55 0x00007ffff7ac1a8b in PyEval_EvalFrameEx (f=0xa663520, throwflag=0) at Python/ceval.c:2705
#56 0x00007ffff7ac420b in PyEval_EvalCodeEx (co=0x7fffe303bb40, globals=0x10a78b0, locals=0x0, args=0xa8acf10, argcount=2, kws=0x0, kwcount=0, defs=0x7fffe17556e8, defcount=4, closure=0x0)
    at Python/ceval.c:3253
#57 0x00007ffff79fd447 in function_call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/funcobject.c:526
#58 0x00007ffff79be33e in PyObject_Call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/abstract.c:2529
#59 0x00007ffff79da359 in instancemethod_call (func=0x7fffdfc76ae0, arg=0xa8acee8, kw=0x0) at Objects/classobject.c:2578
#60 0x00007ffff79be33e in PyObject_Call (func=0x7ffff0cdf060, arg=0x7fffea521990, kw=0x0) at Objects/abstract.c:2529
#61 0x00007fffe6888d3b in __pyx_f_4sage_9structure_11coerce_maps_24DefaultConvertMap_unique__call_ (__pyx_v_self=0x7fffbcf7f3f0, __pyx_v_x=0xa8d87d0, __pyx_skip_dispatch=0)
    at sage/structure/coerce_maps.c:3485
#62 0x00007fffee7acda1 in __pyx_pf_4sage_9structure_6parent_6Parent_28__call__ (__pyx_v_self=0x135f730, __pyx_v_x=0xa8d87d0, __pyx_v_args=0x7ffff7f90060, __pyx_v_kwds=0xa751480)
    at sage/structure/parent.c:7415
#63 0x00007fffee7ac0a4 in __pyx_pw_4sage_9structure_6parent_6Parent_29__call__ (__pyx_v_self=0x135f730, __pyx_args=0xa80fc30, __pyx_kwds=0x0) at sage/structure/parent.c:7096
#64 0x00007ffff79be33e in PyObject_Call (func=0x135f730, arg=0xa80fc30, kw=0x0) at Objects/abstract.c:2529
#65 0x00007fffddd720a3 in __pyx_pf_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_2__init__ (__pyx_v_self=0xa860400, __pyx_v_parent=0x331b730, __pyx_v_f=0xa8d87d0)
    at sage/rings/number_field/number_field_element.cpp:6090
#66 0x00007fffddd6d545 in __pyx_pw_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_3__init__ (__pyx_v_self=0xa860400, __pyx_args=0xa85eab0, __pyx_kwds=0x0)
    at sage/rings/number_field/number_field_element.cpp:5340
#67 0x00007ffff7a595f6 in wrap_init (self=0xa860400, args=0xa85eab0, 
    wrapped=0x7fffddd6d316 <__pyx_pw_4sage_5rings_12number_field_20number_field_element_18NumberFieldElement_3__init__(PyObject*, PyObject*, PyObject*)>, kwds=0x0) at Objects/typeobject.c:4719
#68 0x00007ffff79e2145 in wrapper_call (wp=0xa99c220, args=0xa85eab0, kwds=0x0) at Objects/descrobject.c:998
#69 0x00007ffff79be33e in PyObject_Call (func=0xa99c220, arg=0xa85eab0, kw=0x0) at Objects/abstract.c:2529
#70 0x00007ffff7ac6404 in PyEval_CallObjectWithKeywords (func=0xa99c220, arg=0xa85eab0, kw=0x0) at Python/ceval.c:3890
#71 0x00007ffff79e1194 in wrapperdescr_call (descr=0x7fffde6d9ae0, args=0xa85eab0, kwds=0x0) at Objects/descrobject.c:306
#72 0x00007ffff79be33e in PyObject_Call (func=0x7fffde6d9ae0, arg=0xa881360, kw=0x0) at Objects/abstract.c:2529
#73 0x00007fffddaddacf in __pyx_pf_4sage_5rings_12number_field_30number_field_element_quadratic_28NumberFieldElement_quadratic___init__ (__pyx_v_self=0xa860400, __pyx_v_parent=0x331b730, 
    __pyx_v_f=0xa8d87d0) at sage/rings/number_field/number_field_element_quadratic.cpp:3893
#74 0x00007fffddadbbdb in __pyx_pw_4sage_5rings_12number_field_30number_field_element_quadratic_28NumberFieldElement_quadratic_1__init__ (__pyx_v_self=0xa860400, __pyx_args=0xa85ca38, __pyx_kwds=0x0)
    at sage/rings/number_field/number_field_element_quadratic.cpp:3386
#75 0x00007ffff7a4c1ad in type_call (type=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, args=0xa85ca38, kwds=0x0)
    at Objects/typeobject.c:737
#76 0x00007ffff79be33e in PyObject_Call (func=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, arg=0xa85ca38, kw=0x0)
    at Objects/abstract.c:2529
#77 0x00007ffff7ac7bc3 in do_call (func=0x7fffddd15020 <__pyx_type_4sage_5rings_12number_field_30number_field_element_quadratic_NumberFieldElement_quadratic>, pp_stack=0x7fffffff4a10, na=2, nk=0)
    at Python/ceval.c:4239
#78 0x00007ffff7ac6efc in call_function (pp_stack=0x7fffffff4a10, oparg=2) at Python/ceval.c:4044
#79 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x33b1e00, throwflag=0) at Python/ceval.c:2666
#80 0x00007ffff7ac71f4 in fast_function (func=0x7fffddd47450, pp_stack=0x7fffffff4d90, n=2, na=2, nk=0) at Python/ceval.c:4107
#81 0x00007ffff7ac6ee0 in call_function (pp_stack=0x7fffffff4d90, oparg=1) at Python/ceval.c:4042
#82 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x37a3270, throwflag=0) at Python/ceval.c:2666
#83 0x00007ffff7ac420b in PyEval_EvalCodeEx (co=0x7fffde989ca0, globals=0x1108b20, locals=0x0, args=0xa850f10, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#84 0x00007ffff79fd447 in function_call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/funcobject.c:526
#85 0x00007ffff79be33e in PyObject_Call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/abstract.c:2529
#86 0x00007ffff79da359 in instancemethod_call (func=0x7fffddd43108, arg=0xa850ee8, kw=0x0) at Objects/classobject.c:2578
#87 0x00007ffff79be33e in PyObject_Call (func=0x7fffc2ecc360, arg=0xa70d1b0, kw=0x0) at Objects/abstract.c:2529
#88 0x00007fffe6888d3b in __pyx_f_4sage_9structure_11coerce_maps_24DefaultConvertMap_unique__call_ (__pyx_v_self=0x7fffbcd4b780, __pyx_v_x=0xa8d87d0, __pyx_skip_dispatch=0)
    at sage/structure/coerce_maps.c:3485
#89 0x00007fffee7acda1 in __pyx_pf_4sage_9structure_6parent_6Parent_28__call__ (__pyx_v_self=0x331b730, __pyx_v_x=0xa8d87d0, __pyx_v_args=0x7ffff7f90060, __pyx_v_kwds=0xa9d32c0)
    at sage/structure/parent.c:7415
#90 0x00007fffee7ac0a4 in __pyx_pw_4sage_9structure_6parent_6Parent_29__call__ (__pyx_v_self=0x331b730, __pyx_args=0xa7a3a70, __pyx_kwds=0x0) at sage/structure/parent.c:7096
#91 0x00007ffff79be33e in PyObject_Call (func=0x331b730, arg=0xa7a3a70, kw=0x0) at Objects/abstract.c:2529
#92 0x00007ffff7ac6404 in PyEval_CallObjectWithKeywords (func=0x331b730, arg=0xa7a3a70, kw=0x0) at Python/ceval.c:3890
#93 0x00007ffff7ab2344 in builtin_map (self=0x0, args=0xa8dde70) at Python/bltinmodule.c:1038
#94 0x00007ffff7a210f2 in PyCFunction_Call (func=0x7ffff7f52060, arg=0xa8dde70, kw=0x0) at Objects/methodobject.c:81
#95 0x00007ffff7ac6cdc in call_function (pp_stack=0x7fffffff5900, oparg=2) at Python/ceval.c:4021
#96 0x00007ffff7ac17f9 in PyEval_EvalFrameEx (f=0x3944d10, throwflag=0) at Python/ceval.c:2666
...

I see a couple of familiar names in the backtrace...

Changed 7 years ago by SimonKing

Crash log

comment:151 Changed 7 years ago by SimonKing

Thanks to Volker's enhanced backtraces, running the test in verbose mode and without gdb yields this backtrace, and the crash occurs here (line 6467 of heegner.py)

        sage: E = EllipticCurve('681b')
        sage: I = E.heegner_index(-8); I

Unfortunately, running this in an interactive session works just fine.

comment:152 Changed 7 years ago by SimonKing

Got it, I think.

The crash happens in the last line of the following snippet

  __pyx_t_1 = __pyx_v_self->D->buckets;
  __Pyx_INCREF(__pyx_t_1);
  __pyx_t_4 = __pyx_v_self->D->buckets;
  __Pyx_INCREF(__pyx_t_4);
  __pyx_t_10 = PyList_GET_ITEM(__pyx_t_1, (__pyx_v_h % PyList_GET_SIZE(__pyx_t_4)));

and according to the crash log, we have

        __pyx_t_1 = 0x7f3db87dcb00 <_Py_NoneStruct>

Hence, again, we have the problem that some attributes have already become invalid. I think this was fixed by the second patch from #12313.

Suggestion: In order to keep things modular, the part of the second #12313 patch that applies to TripleDict shall be moved here, so that #12215 remains independent of #12313. And then, the second patch of #12313 should be replaced by something that only takes care of the new MonoDict.

Rationale: #12313 has a problem with a time regression, while #12215 should (hopefully) be fine after installing the fix.
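
To illustrate the failure mode analysed above: a weak-reference callback that reaches into its dictionary may find that `buckets` has already been replaced by None while the dictionary is being cleared. The following is only a minimal pure-Python sketch of such a guard, not the content of the patch. `Key`, `SafeEraser` and `ToyDict` are hypothetical names, the toy dictionary uses one key instead of three, and the torn-down state has to be simulated by hand by setting `buckets = None`.

```python
import weakref


class Key(object):
    """Hypothetical key type; plain instances can be weakly referenced."""


class SafeEraser(object):
    """Weakref callback that removes dead keys from a ToyDict."""

    def __init__(self, D):
        # Strong reference to the dictionary, as in the situation above.
        self.D = D

    def __call__(self, r):
        buckets = self.D.buckets
        if buckets is None:
            # The dictionary is being torn down; nothing is left to erase.
            return
        h = r.key
        bucket = buckets[h % len(buckets)]
        # Drop every entry stored under the dead key.
        bucket[:] = [entry for entry in bucket if entry[0] != h]


class ToyDict(object):
    """Toy stand-in for TripleDict (assumption: one key instead of three)."""

    def __init__(self, nbuckets=7):
        self.buckets = [[] for _ in range(nbuckets)]
        self.eraser = SafeEraser(self)

    def set(self, key, value):
        h = id(key)
        # KeyedRef remembers h, so the callback can locate the bucket
        # even though the key itself is already gone.
        ref = weakref.KeyedRef(key, self.eraser, h)
        self.buckets[h % len(self.buckets)].append((h, ref, value))

    def __len__(self):
        return sum(len(bucket) for bucket in self.buckets)
```

Without the `buckets is None` check, the callback in this sketch would fail as soon as it tried to take the length of `buckets`. In the compiled Cython code the same access is unchecked, so it crashes instead of raising.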

Changed 7 years ago by SimonKing

Safer callback in TripleDictEraser

comment:153 Changed 7 years ago by SimonKing

  • Description modified (diff)
  • Status changed from needs_work to needs_review

The patch's up, and it fixes the crash in heegner.py (tested in sage-5.6.rc0 debug version with MALLOC_CHECK_=3)

Apply trac12215_weak_cached_function_combined.patch trac12215_safe_callback.patch

comment:154 Changed 7 years ago by nbruin

  • Status changed from needs_review to positive_review

Yes, this solves the problem here as well, so positive review.

It looks like the analysis on #12313:226 and the patch that followed from it was based on this ticket. I probably pulled the non-raw patches for #12313 when I tested ... Should we factor out a utility from the Patchbot to pull and apply patches given a ticket number?

Happy to see this work did find some use after all. Again, I believe that in the future, when TripleDictEraser holds a weakref to its dictionary, this won't be necessary anymore, because the weakref will be broken before attributes on the dictionary get erased.
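
The alternative mentioned here can be sketched in pure Python as follows. This is a toy model under stated assumptions, not the Sage implementation: `Key`, `WeakEraser` and `ToyDict` are hypothetical names, and the dictionary uses a single key. The eraser keeps only a weak reference to its dictionary, so once the dictionary is gone the callback sees None and returns without touching any attribute.

```python
import weakref


class Key(object):
    """Hypothetical key type; plain instances can be weakly referenced."""


class WeakEraser(object):
    """Weakref callback that refers to its dictionary only weakly."""

    def __init__(self, D):
        self.D = weakref.ref(D)

    def __call__(self, r):
        D = self.D()
        if D is None:
            # The dictionary no longer exists; nothing to erase.
            return
        h = r.key
        bucket = D.buckets[h % len(D.buckets)]
        # Drop every entry stored under the dead key.
        bucket[:] = [entry for entry in bucket if entry[0] != h]


class ToyDict(object):
    """Toy stand-in for TripleDict (assumption: one key instead of three)."""

    def __init__(self, nbuckets=7):
        self.buckets = [[] for _ in range(nbuckets)]
        self.eraser = WeakEraser(self)

    def set(self, key, value):
        h = id(key)
        # KeyedRef remembers h for use in the callback.
        ref = weakref.KeyedRef(key, self.eraser, h)
        self.buckets[h % len(self.buckets)].append((h, ref, value))

    def __len__(self):
        return sum(len(bucket) for bucket in self.buckets)
```

In this sketch the dictionary and its eraser no longer form a reference cycle, so the dictionary can be freed by reference counting alone.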

That enhanced traceback (including Cython code!) is extremely cool. A big thanks to Volker for making that happen. With that traceback, you only have to stare at the traceback to diagnose this problem.

comment:155 Changed 7 years ago by SimonKing

Just for the record: All tests pass on my openSUSE laptop in the debug version of sage-5.6.rc0+#13878+#13378+#12215, with MALLOC_CHECK_=3.

comment:156 Changed 7 years ago by jdemeyer

  • Merged in set to sage-5.7.beta1
  • Resolution set to fixed
  • Status changed from positive_review to closed