Opened 9 years ago

Closed 8 years ago

#11339 closed defect (fixed)

Refcounting for Singular rings

Reported by: gagern Owned by: drkirkby
Priority: critical Milestone: sage-4.7.2
Component: algebra Keywords: sd31
Cc: fbissey, robertwb, burcin, malb, SimonKing, leif Merged in: sage-4.7.2.alpha4
Authors: Volker Braun, Martin von Gagern Reviewers: François Bissey, Steven Trogdon, Martin Albrecht
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #10903 (for doctests) Stopgaps:

Description (last modified by vbraun)

Ref counting for Singular rings

Python makes no guarantees that destructors are called if circular references are involved. This patch implements an extra Python proxy layer that correctly refcounts Singular rings. In contrast to our earlier code, it will release the singular rings once they are no longer in use. This triggers various issues in Sage/libSingular that are dealt with in the follow-up ticket #10903

Historic discussion

As described for the Sage on Gentoo project, there is a problem in how parts of sage use __deallocate__ in their Cython code. I've actually traced an instance of groebner_strategy.pyx causing segmentation faults under Python 2.7.

What happens is that the garbage collector invokes the tp_clear function for the object. Its implementation is provided by Cython, and one of its effects is that every python reference will be set to None. A bit later on, tp_dealloc is called and invokes the __deallocate__ method. By that time, _parent is None, so accessing _parent._ring is a bad idea, and in this case it turns out to be null.

Others have had similar problems before. There are crude workarounds floating around. I guess a proper solution would be twaking cython to allow custom code in the tp_clear function. In other words, have a "magic" mehod __clear__ similar to the magic __deallocate__. But I'll wait for comments here first, before taking this to cython upstream. Perhaps people with more experience have better solutions to offer. And I won't mind if you decide to take this to Cython devs yourselves.

Apply

Apply:

Attachments (8)

bug11339a.patch (2.6 KB) - added by gagern 9 years ago.
Workaround patch
bug11339a.bundle (1005 bytes) - added by gagern 9 years ago.
Workaround hg bundle
trac_11339_illegal_use_of__deallocate__in_libsingular.patch (2.4 KB) - added by vbraun 8 years ago.
Initial patch
ell_curve_isogeny.segfault.bz2 (1.3 KB) - added by strogdon 8 years ago.
Segmentation fault associated with
ell_curve_isogeny.gdb.bz2 (2.8 KB) - added by strogdon 8 years ago.
ell_curve_isogeny.py debug/backtrace
remove_polys.patch (14.2 KB) - added by vbraun 8 years ago.
Remove some initialization form ell_curve_isogeny.py for debugging.
trac_11339_refcount_singular_rings.patch (11.7 KB) - added by vbraun 8 years ago.
Updated patch
trac_11339_refcount_singular_polynomials.patch (65.3 KB) - added by vbraun 8 years ago.
Updated patch

Download all attachments as: .zip

Change History (103)

comment:1 Changed 9 years ago by fbissey

  • Cc fbissey robertwb added

comment:2 Changed 9 years ago by vbraun

The documentation is fairly clear, you are only allowed to touch C data structures in __dealloc__: http://docs.cython.org/src/userguide/special_methods.html#finalization-method-dealloc

Really, this shows the difference between the C++ and the Python object model. In Python, executing constructors/destructors in derived classes is optional and leaves a lot of freedom to the programmer. Because of the garbage collection you can allocate resources wherever you want. By comparison, C++ is very strict and essentially forces you to use RAII. This is why there is no try/finally in C++. Which is why there is necessarily some tension in Cython between C++ and Python object instantiation/finalization.

Having said that, it seems like the cleanest solution would be if Cython would also call the Python destructor __del__ according to the normal Python rules (why __clear__?). Then the Python and C++ object model could just peacefully coexist. The lifetime of a cdef class would be

__cinit__()    # always called
__init__()     # not necessarily called for derived classes
... do something ...
__del__()      # not necessarily called for derived classes
__dealloc__()  # always called

comment:3 Changed 9 years ago by fbissey

Thanks for your input Volker. I think we should definitely put some work in libs/singular to have some del and dealloc functions. I think the current state of things in there is a bit messy. I remember reading about del in the python doc about garbage collection and wondering why singular_ring_delete in libs/singular/rings.pyx wasn't a del method instead as its intent is clear.

comment:4 Changed 9 years ago by vbraun

Just to clarify, right now Cython extension classes do not call the Python destructor __del__ upon finalization, so Sage can't rely on this mechanism. I'm not quite sure if it is an intentional thing or if the Cython devs just haven't gotten around to implementing it. Maybe we can hack on that on Sage Days 31 since both Robert and Burcin will be around :-)

comment:5 Changed 9 years ago by fbissey

OK, well something as to be done about this if we want to move to python-2.7. I think this problem is probably the biggest issue in the way of #9958 there are a few other things to look at but that's the only one producing a segfault.

comment:6 Changed 9 years ago by gagern

I notice that PyTypeObject has a member tp_del which is not included in the documentation. So perhaps it's as simple as associating a function with that slot. Otoh, perhaps there is a reason that member isn't documented. The single reply to Python issue 4934 doesn't rule out that possibility.

comment:7 Changed 9 years ago by burcin

  • Cc burcin added

Changed 9 years ago by gagern

Workaround patch

Changed 9 years ago by gagern

Workaround hg bundle

comment:8 follow-ups: Changed 8 years ago by fbissey

  • Status changed from new to needs_review

For the record this works with sage-on-gentoo with python-2.7.1 so as far as I am concerned it works as advertised so far. I can do a review against a vanilla sage using python-2.6 later this week provided that I get the all clear to go back to work (we got a couple more earthquakes down here - the magnitude 6 one was scary but very few people got hurt).

What is the "workaround hg bundle" Martin? I can reformat things if needed.

comment:9 in reply to: ↑ 8 Changed 8 years ago by gagern

Replying to fbissey:

What is the "workaround hg bundle" Martin? I can reformat things if needed.

It's what I perceived to be the hg equivalent of a "bzr send": a file which describes a number of hg commits inclusing metadata, so that other hg users can pull from it. See hg manual for unbundle and related documentation.

comment:10 Changed 8 years ago by fbissey

Hi Martin, Steve sent me a note about it earlier this morning. The patch is the only thing necessary here, am I right?

comment:11 in reply to: ↑ 8 Changed 8 years ago by strogdon

Replying to fbissey:

For the record this works with sage-on-gentoo with python-2.7.1 so as far as I am concerned it works as advertised so far. I can do a review against a vanilla sage using python-2.6 later this week provided that I get the all clear to go back to work (we got a couple more earthquakes down here - the magnitude 6 one was scary but very few people got hurt). What is the "workaround hg bundle" Martin? I can reformat things if needed.

This patch works fine for me too on sage-on-gentoo with python-2.7.1. I have applied this patch to the vanilla sage-4.7.1.alpha2 (python-2.6.4.p10) spkg and all long tests pass. I have also applied the patch along with the patches and required spkgs in ticket #9958 to the sage-4.7.1.alpha2 (python-2.7.1) spkg and all the tests that failed without this patch because of segfaults, i.e.

sage -t  -force_lib "devel/sage/sage/schemes/generic/scheme.py"

sage -t  -force_lib "devel/sage/sage/rings/morphism.pyx"    

sage -t  -force_lib "devel/sage/sage/rings/homset.py"

now pass.

comment:12 Changed 8 years ago by fbissey

  • Description modified (diff)
  • Reviewers set to François Bissey, Steven Trogdon
  • Status changed from needs_review to positive_review

I am giving this a positive review on my and Steve's observations. I unfortunately won't be able to run some extra tests before Monday it seems.

comment:13 follow-up: Changed 8 years ago by vbraun

  • Status changed from positive_review to needs_work

As it is, the patch is spectacularly ugly. I agree that it works, but we should investigate alternatives. It is not a particularly pressing issue since Sage is not going to switch to Python-2.7 this week. I'll have a closer look at it with Burcin duing this week at SD31.

comment:14 Changed 8 years ago by fbissey

I am ok with this. I don't plan to push python-2.7.1 in 4.7.1. I would like most of the other pending bits and pieces that will make testing easier to be in 4.7.1 however. It would be nice if at least some testing could be done in 4.7.2 and really land it by the next release.

comment:15 in reply to: ↑ 13 Changed 8 years ago by gagern

Replying to vbraun:

As it is, the patch is spectacularly ugly. [...] we should investigate alternatives.

I agree, that's why I titled the patch "workaround" instead of "solution". I can't think of a way to address this in sage alone, though, as in my opinion a proper solution would entail modifications to cython as discussed above.

comment:16 follow-ups: Changed 8 years ago by vbraun

  • Authors set to Volker Braun, Martin von Gagern
  • Description modified (diff)
  • Keywords sd31 added
  • Status changed from needs_work to needs_review

In the case at hand we don't really need the Cython class GroebnerStrategy._parent, only the C struct GroebnerStrategy._parent._ring is used in __deallocate__(). I've made a minimal patch that adds a cdef ring *_parent_ring data member to GroebnerStrategy to keep track of the ring only. Because it refers to a C struct, it is safe to access it the Cython destructor. Because parents are immutable we don't have to worry about the ring * pointer becoming invalid.

The patch should fix the crash, but I don't have a Sage-on-Python-2.7 install to test it. Can somebody give it a try?

Changed 8 years ago by vbraun

Initial patch

comment:17 in reply to: ↑ 16 Changed 8 years ago by gagern

Replying to vbraun:

I had considered keeping only that C pointer, but decided against it.

Because parents are immutable we don't have to worry about the ring * pointer becoming invalid.

Can you explain this in more detail? Why does the parent object being immutable help in keeping the pointer valid? Do you mean that the parent cython object doesn't change it's ring member? That is correct, but not enough to make the approach safe: if you don't keep a reference to the parent python object around, python might decide to garbage-collect that first, leading to a call to the singula function rDelete. I guess that function at least frees the memory, so if you are unlucky, some thread will overwrite the memory that ring points to with completely different data. Of course, in most cases you'd be lucky, so you wouldn't notice the problem. But unless singular does some reference counting of its own in such a way that a rDelete call on the parent won't have immediate effect, your approach isn't safe.

comment:18 in reply to: ↑ 16 Changed 8 years ago by strogdon

Replying to vbraun:

The patch should fix the crash, but I don't have a Sage-on-Python-2.7 install to test it. Can somebody give it a try?

Here, the new patch doesn't solve the segfault problems when applied to sage-on-gentoo. In testing vanilla Sage with python-2.7.1 I've been patching the Sage spkg and then building. So my sage-main isn't a pristine sage-4.7.1.alpha2 with python-2.7.1. However the last patch that I've applied is the bug11339a patch. So, I cloned Sage to the patchset just prior to this last patch, apply the new patch and issued 'sage -ba'. The 3 tests listed #comment:11 all segfault. Let me know if you think my results are "overly" tainted.

comment:19 in reply to: ↑ 16 Changed 8 years ago by fbissey

Replying to vbraun:

In the case at hand we don't really need the Cython class GroebnerStrategy._parent, only the C struct GroebnerStrategy._parent._ring is used in __deallocate__(). I've made a minimal patch that adds a cdef ring *_parent_ring data member to GroebnerStrategy to keep track of the ring only. Because it refers to a C struct, it is safe to access it the Cython destructor. Because parents are immutable we don't have to worry about the ring * pointer becoming invalid.

The patch should fix the crash, but I don't have a Sage-on-Python-2.7 install to test it. Can somebody give it a try?

Does it depend on other patches that are post 4.7.1.alpha2? The feedback so far is that it doesn't work but I noticed there was some work in singular and rings in alpha3 and alpha4. Would it rely on any of these fixes?

comment:20 Changed 8 years ago by fbissey

hum, after looking at the tickets there are not so many that could have an influence (#11218 and #11186 and they would be long shot). Must have been stuff that I have seen on trac but not yet merged.

comment:21 follow-up: Changed 8 years ago by vbraun

  • Status changed from needs_review to needs_work

By immutable I meant that the ring of the Parent object can't change. But I wasn't aware that the parent is part of the cycle that the garbage collector is trying to break, then it won't work of course.

Actually, if the cycle is Parent <-> GroebnerStrategy? then I don't understand Martin's orginal patch. By manually increasing the refcount of Parent by one, the garbage collector should conclude that an external reference is keeping the cycle alive, right? Hence __dealloc__ would never be run. This fixes the crash, of course, but at the cost of leaking memory?

I'm leaning towards refcounting the libsingular rings in C. This would be rather easy to implement, I think.

Also, Burcin told me that the need to explicitly reset the ring after the groebner strategy is going to go away in the future. But I would wager that Sage is going to transition to Python 3 first ;-)

comment:22 in reply to: ↑ 21 Changed 8 years ago by gagern

Replying to vbraun:

But I wasn't aware that the parent is part of the cycle that the garbage collector is trying to break, then it won't work of course.

Actually, if the cycle is Parent <-> GroebnerStrategy? then I don't understand Martin's orginal patch.

I'm not sure there really is a cycle. I had interpreted the situation in such a way that tp_clear will be called for unreachable objects just in case there might be some kind of cycle between them. So there might be no actual cycle involved at all, or it might be with some other member object. One requirement is that the function does cut all cycles, which should work as the parent shouldn't be referring back to the Gröbner strategy. So by keeping the reference in one direction, we satisfy tp_clear and still manage to finalize things. As a bonus, we ensure their finalization in the correct order.

But I might be completely off target with my above understanding. Someone with intricate knowledge of python, its evolution, as well as cython, might know better.

Until then, someone could add breakpoints to see if the finalizer is executed with my patch in place. Don't have the time just now, but might do it myself in a week or so if you remind me.

comment:23 Changed 8 years ago by vbraun

The Python docs say "The tp_clear member function is used to break reference cycles in cyclic garbage detected by the garbage collector.", so there must have been a cycle involving the GroebnerStrategy instance.

I looked a bit at the source and we definitely have cycles ideal<->GroebnerStrategy. Both refer to the parent ring, and this explains why tp_clear was called on GroebnerStrategy. I haven't found a place where the parent refers back to the cycle. In this situation Martin's patch should work as we can get rid of the cycle while the parent has positive refcount. It could be that Python 2.7 got smarter and deletes the parent together with the cycle while Python 2.6 first deleted the cycle and only then noticed that nothing else refers to the parent any more. This would explain why Martin's patch works but my attempt of blindly accessing the ring C structure fails on Python 2.7.

There is a lot of stuff that could conceivably create a larger cycle involving the parent even though I haven't found one. For one, the parent keeps a reference to the "one" element. Even then, I think it is too dangerous to make the hidden assumption that there are no complete cycles involving the parents. Somebody is bound to trip over this sooner or later.

comment:24 Changed 8 years ago by vbraun

We got this crash in the parent destructor, but look whats in the element destructor:

    def __dealloc__(self):
        # TODO: Warn otherwise!
        # for some mysterious reason, various things may be NULL in some cases
        if self._parent is not <ParentWithBase>None and (<MPolynomialRing_libsingular>self._parent)._ring != NULL and self._poly != NULL:
            p_Delete(&self._poly, (<MPolynomialRing_libsingular>self._parent)._ring)

comment:25 follow-up: Changed 8 years ago by vbraun

  • Cc malb SimonKing added
  • Component changed from porting to algebra
  • Description modified (diff)
  • Status changed from needs_work to needs_review

Ok this should fix it now for good. The first patch introduces refcounting for Singular ring pointers and uses it in GroebnerStrategy, and the second updates MPolynomialRing_libsingular and MPolynomial_libsingular. Please give it a try with Python-2.7...

comment:26 Changed 8 years ago by fbissey

I have boldly pushed it to the sage-on-gentoo tree and can report that it works fine with python-2.7.1 there on 4.7.1.alpha2. It solved the problem we had but another doctest crashed that looked completely unrelated until I looked at the backtrace:

sage -t -long  -force_lib devel/sage-main/sage/crypto/mq/sr.py # Killed/crashed

and here is it says when I add -verbose

Trying:
    from sage.crypto.mq.sr import test_consistency###line 3336:_sage_    >>> from sage.crypto.mq.sr import test_consistency
Expecting nothing
ok
Trying:
    test_consistency(Integer(1))  # long time (80s on sage.math, 2011)###line 3337:_sage_    >>> test_consistency(1)  # long time (80s on sage.math, 2011)
Expecting:
    True
/usr/lib64/libcsage.so(print_backtrace+0x24)[0x7f5259023554]
/usr/lib64/libcsage.so(sigdie+0x1d)[0x7f52590235ed]
/usr/lib64/libcsage.so(sage_signal_handler+0x131)[0x7f5259023761]
/lib64/libpthread.so.0(+0xfee0)[0x7f525ca80ee0]
/usr/lib64/libsingular.so.3(_Z7na_CopyP7snumberP9sip_sring+0x79)[0x7f523cc17d47]
/usr/lib64/libsingular.so.3(_Z6pSubstP8spolyreciS0_+0x535)[0x7f523cc41734]
/usr/lib64/python2.7/site-packages/sage/rings/polynomial/multi_polynomial_libsingular.so(+0x548e9)[0x7f523c5de8e9]
/usr/lib64/libpython2.7.so.1.0(PyCFunction_Call+0x76)[0x7f525cd0e1ab]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x53b6)[0x7f525cd68414]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x3b)[0x7f525cd69c23]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2597)[0x7f525cd655f5]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(+0x6ca97)[0x7f525ccfba97]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(+0x552a9)[0x7f525cce42a9]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4ea0)[0x7f525cd67efe]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(+0x6ca97)[0x7f525ccfba97]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(+0x552a9)[0x7f525cce42a9]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4ea0)[0x7f525cd67efe]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(+0x6ca97)[0x7f525ccfba97]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(+0x552a9)[0x7f525cce42a9]
/usr/lib64/libpython2.7.so.1.0(PyObject_Call+0x75)[0x7f525ccd9539]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4ea0)[0x7f525cd67efe]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4d3a)[0x7f525cd67d98]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85f)[0x7f525cd69b65]
/usr/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x3b)[0x7f525cd69c23]
/usr/lib64/libpython2.7.so.1.0(+0xf3a14)[0x7f525cd82a14]
/usr/lib64/libpython2.7.so.1.0(PyRun_FileExFlags+0xb1)[0x7f525cd83719]
/usr/lib64/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0x33c)[0x7f525cd8421a]
/usr/lib64/libpython2.7.so.1.0(PyRun_AnyFileExFlags+0x6e)[0x7f525cd84ac1]
/usr/lib64/libpython2.7.so.1.0(Py_Main+0xb92)[0x7f525cd9424a]
/usr/bin/python2.7(main+0x7f)[0x4009e3]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f525c70acdd]
/usr/bin/python2.7[0x4008a9]

------------------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------

It should be double checked with a vanilla install in case it just reveals a problem with my libsingular ebuild.

comment:27 follow-up: Changed 8 years ago by vbraun

  • Status changed from needs_review to needs_info

I had similar problems, but I thought I fixed them and it did pass tests on my machine. In my experience, they always go away if we leak the ring (which is essentially what we did before) instead of cleaning up after we are done with it. I have a suspicion that, although almost all take a ring as argument, some Singular functions require currRing to be set. This is why I added the inlined _check_ring() function to each MPolynomial_libsingular method to make sure that we have the right currRing.

I've added a superfluous p_Normalize call to MPolynomial_libsingular.__dealloc__, this is likely the point where you are crashing. You'll get a more useful backtrace if you paste the offending doctest into sage -gdb and then use "bt".

Maybe somebody who is more familiar with the Singular api can chime in and explain precisely where you have to call rChangeCurrRing and where you don't need to.

comment:28 Changed 8 years ago by fbissey

It is one of these annoying one which doesn't seem to happen when you run under gdb. May be I'll have to try a bigger portion of the sequence of commands tested.

comment:29 in reply to: ↑ 25 Changed 8 years ago by strogdon

Replying to vbraun:

Ok this should fix it now for good. The first patch introduces refcounting for Singular ring pointers and uses it in GroebnerStrategy, and the second updates MPolynomialRing_libsingular and MPolynomial_libsingular. Please give it a try with Python-2.7...

OK, I applied both the singular_rings and singular_polynomials patches to the 4.7.1.alpha2 that I have and built Sage as in comment:18 . As noted above this is not a pristine 4.7.1.alpha2, but has other python-2.7 related patches included; tickets #11377, #11244, #9958, #11376 and #10764. There are no segfaults. The testsuite looks like the results when the bug11339a.patch was applied. And I don't get the Killed doctest mentioned by François. Now with just the singular_rings patch I also get no segfaults or crashes. Should I get a crash? I can't see that any of the applied patches could affect things. I will build from scratch, just to make sure.

comment:30 Changed 8 years ago by fbissey

OK so a proper backtrace. Note that the failing test is marked "long".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffd79c2d47 in p_Copy (p=0x1, r=0x7ffff445fc38) at pInline2.h:441
441     pInline2.h: No such file or directory.
        in pInline2.h
(gdb) bt
#0  0x00007fffd79c2d47 in p_Copy (p=0x1, r=0x7ffff445fc38) at pInline2.h:441
#1  na_Copy (p=0x1, r=0x7ffff445fc38) at longalg.cc:1005
#2  0x00007fffd79ec734 in p_Head (p=<value optimized out>, n=<value optimized out>, e=0x7fffd0d832d0) at pInline1.h:152
#3  pSubst (p=<value optimized out>, n=<value optimized out>, e=0x7fffd0d832d0) at polys.cc:970
#4  0x00007fffd7387998 in __pyx_pf_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular_33subs (
    __pyx_v_self=0x4d1b370, __pyx_args=<value optimized out>, __pyx_kwds=<value optimized out>)
    at sage/rings/polynomial/multi_polynomial_libsingular.cpp:19922
#5  0x00007ffff7aac1ab in PyCFunction_Call (func=0x4ce4908, arg=0x4ea41d0, kw=<value optimized out>) at Objects/methodobject.c:85
#6  0x00007ffff7b06414 in ext_do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4322
#7  PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2704
#8  0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x1798b30, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=2, kws=0x5244750, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#9  0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#10 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#11 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#12 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x3cc92b0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=1, kws=0x523abe8, kwcount=0, defs=0x3ccb6a8, defcount=1, closure=0x0)
    at Python/ceval.c:3252
#13 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#14 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#15 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#16 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4e985b0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#17 0x00007ffff7b07c23 in PyEval_EvalCode (co=<value optimized out>, globals=<value optimized out>, locals=<value optimized out>)
    at Python/ceval.c:666
#18 0x00007ffff7b035f5 in exec_statement (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4709
#19 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:1879
#20 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4bddc30, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=5, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#21 0x00007ffff7a99a97 in function_call (func=0x4bafc08, arg=0x552fdd0, kw=0x0) at Objects/funcobject.c:526
#22 0x00007ffff7a77539 in PyObject_Call (func=0x4bafc08, arg=0x552fdd0, kw=0x0) at Objects/abstract.c:2529
#23 0x00007ffff7a822a9 in instancemethod_call (func=0x4bafc08, arg=0x552fdd0, kw=0x0) at Objects/classobject.c:2578
#24 0x00007ffff7a77539 in PyObject_Call (func=0x5450410, arg=0x552fdd0, kw=0x0) at Objects/abstract.c:2529
#25 0x00007ffff7b05efe in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4230
#26 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4035
#27 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#28 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4c38c30, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=5, kws=0x52444f8, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#29 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#30 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#31 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#32 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4bddd30, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=4, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#33 0x00007ffff7a99a97 in function_call (func=0x4bafc80, arg=0x5522050, kw=0x0) at Objects/funcobject.c:526
#34 0x00007ffff7a77539 in PyObject_Call (func=0x4bafc80, arg=0x5522050, kw=0x0) at Objects/abstract.c:2529
#35 0x00007ffff7a822a9 in instancemethod_call (func=0x4bafc80, arg=0x5522050, kw=0x0) at Objects/classobject.c:2578
#36 0x00007ffff7a77539 in PyObject_Call (func=0x5453500, arg=0x5522050, kw=0x0) at Objects/abstract.c:2529
#37 0x00007ffff7b05efe in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4230
#38 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4035
#39 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#40 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x7ffff289de30, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=4, kws=0x5181330, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#41 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#42 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#43 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#44 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4bddeb0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=2, kws=0x4c3b618, kwcount=3, defs=0x4bae108, defcount=3, closure=0x0)
    at Python/ceval.c:3252
#45 0x00007ffff7a99a97 in function_call (func=0x4bafde8, arg=0x4b2e680, kw=0x473aa50) at Objects/funcobject.c:526
#46 0x00007ffff7a77539 in PyObject_Call (func=0x4bafde8, arg=0x4b2e680, kw=0x473aa50) at Objects/abstract.c:2529
#47 0x00007ffff7a822a9 in instancemethod_call (func=0x4bafde8, arg=0x4b2e680, kw=0x473aa50) at Objects/classobject.c:2578
#48 0x00007ffff7a77539 in PyObject_Call (func=0x37d84b0, arg=0x4b2e680, kw=0x473aa50) at Objects/abstract.c:2529
#49 0x00007ffff7b05efe in do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4230
#50 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4035
#51 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#52 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x7ffff7ec41b0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=2, kws=0x4c68d60, kwcount=0, defs=0x49f56a8, defcount=3, closure=0x0)
    at Python/ceval.c:3252
#53 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#54 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#55 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#56 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4bd88b0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=0, kws=0x46212b0, kwcount=10, defs=0x4bb8288, defcount=10, closure=0x0)
    at Python/ceval.c:3252
#57 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#58 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#59 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#60 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x4c38ab0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=1, kws=0x6cae88, kwcount=3, defs=0x4bb8068, defcount=10, closure=0x0)
    at Python/ceval.c:3252
#61 0x00007ffff7b05d98 in fast_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4108
#62 call_function (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4033
#63 PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:2665
#64 0x00007ffff7b07b65 in PyEval_EvalCodeEx (co=0x7ffff7eb7eb0, globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#65 0x00007ffff7b07c23 in PyEval_EvalCode (co=<value optimized out>, globals=<value optimized out>, locals=<value optimized out>)
    at Python/ceval.c:666
#66 0x00007ffff7b20a14 in run_mod (mod=<value optimized out>, filename=<value optimized out>, globals=0x640f60, locals=0x640f60, 
    flags=<value optimized out>, arena=<value optimized out>) at Python/pythonrun.c:1346
#67 0x00007ffff7b21719 in PyRun_FileExFlags (fp=0x6a0530, filename=0x7fffffffddc1 "/home/fbissey/.sage/tmp/.doctest_sr.py", 
    start=257, globals=0x640f60, locals=0x640f60, closeit=1, flags=0x7fffffffd870) at Python/pythonrun.c:1332
#68 0x00007ffff7b2221a in PyRun_SimpleFileExFlags (fp=0x6a0530, filename=<value optimized out>, closeit=1, flags=0x7fffffffd870)
    at Python/pythonrun.c:936
#69 0x00007ffff7b22ac1 in PyRun_AnyFileExFlags (fp=0x6a0530, filename=0x7fffffffddc1 "/home/fbissey/.sage/tmp/.doctest_sr.py", 
    closeit=1, flags=0x7fffffffd870) at Python/pythonrun.c:740
#70 0x00007ffff7b3224a in Py_Main (argc=<value optimized out>, argv=0x7fffffffd9b8) at Modules/main.c:606
#71 0x00000000004009e3 in main (argc=2, argv=0x7fffffffd9b8) at ./Modules/python.c:46

I didn't realize you could do something like

sage -t -long  -force_lib devel/sage-main/sage/crypto/mq/sr.py -gdb

handy...

comment:31 Changed 8 years ago by fbissey

Since we can even add "-verbose" I should add this bit of output:

Trying:
    test_consistency(Integer(1))  # long time (80s on sage.math, 2011)###line 3337:_sage_    >>> test_consistency(1)  # long time (80s on sage.math, 2011)
Expecting:
    True

Program received signal SIGSEGV, Segmentation fault.
0x00007fffd79c2d47 in p_Copy (p=0x1, r=0x7ffff445fc38) at pInline2.h:441
441     pInline2.h: No such file or directory.
        in pInline2.h

So it looks like the segfault is after the test has completed, probably when sage is quitting as happened to Steve when he tried to run the test while disabling the garbage collection (before we had any patches what so ever).

comment:32 Changed 8 years ago by strogdon

I've just completed the testsuite after building vanilla sage-4.7.1.alpha2 from scratch. The patches from tickets #11377, #11244, #9958, #11376 and #10764 along with the patches of this ticket were used to create a patched sage-4.7.1.alpha2 spkg before the build. The results are identical to those reported earlier. There are no segfaults and no Killed/crashed tests.

On my sage-on-gentoo install on x86 in prefix there no segfaults; however, I have one Killed/crashed doctest which I didn't have when using the bug11339a patch.

sage -t -long -force_lib "devel/sage-main/sage/libs/singular/option.pyx" -gdb

Program received signal SIGSEGV, Segmentation fault.
0xb3b16d29 in _nlDelete_NoImm(snumber**) () from /storage/strogdon/gentoo/usr/lib/libsingular.so.3
(gdb) bt
#0  0xb3b16d29 in _nlDelete_NoImm(snumber**) () from /storage/strogdon/gentoo/usr/lib/libsingular.so.3
#1  0xb3bd831d in p_Delete__FieldQ_LengthGeneral_OrdGeneral () from /storage/strogdon/gentoo/usr/lib/libsingular.so.3
#2  0xb3acde01 in id_Delete(sip_sideal**, sip_sring*) () from /storage/strogdon/gentoo/usr/lib/libsingular.so.3
#3  0xb3a70736 in sleftv::CleanUp(sip_sring*) () from /storage/strogdon/gentoo/usr/lib/libsingular.so.3
#4  0xb37f490e in __pyx_f_4sage_4libs_8singular_8function_free_leftv(sleftv*) () from /storage/strogdon/gentoo/usr/lib/python2.7/site-packages/sage/libs/singular/function.so
#5  0xb3805653 in __pyx_f_4sage_4libs_8singular_8function_call_function(__pyx_obj_4sage_4libs_8singular_8function_SingularFunction*, _object*, __pyx_obj_4sage_5rings_10polynomial_28multi_polynomial_libsingular_MPolynomialRing_libsingular*, __pyx_opt_args_4sage_4libs_8singular_8function_call_function*) () from /storage/strogdon/gentoo/usr/lib/python2.7/site-packages/sage/libs/singular/function.so
#6  0xb3806843 in __pyx_pf_4sage_4libs_8singular_8function_16SingularFunction_2__call__(_object*, _object*, _object*) () from /storage/strogdon/gentoo/usr/lib/python2.7/site-packages/sage/libs/singular/function.so
#7  0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#8  0xb7f3a0a4 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#9  0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#10 0xb7f3ca34 in PyEval_EvalCode () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#11 0xb7f3bbb2 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#12 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#13 0xb7ec7c58 in function_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#14 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#15 0xb7eafaae in instancemethod_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#16 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#17 0xb7f3a0a4 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#18 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#19 0xb7f3ab11 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#20 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#21 0xb7ec7c58 in function_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#22 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#23 0xb7eafaae in instancemethod_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#24 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#25 0xb7f3a0a4 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#26 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#27 0xb7f3ab11 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#28 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#29 0xb7ec7d43 in function_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#30 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#31 0xb7eafaae in instancemethod_call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#32 0xb7e9fffb in PyObject_Call () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#33 0xb7f3a0a4 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#34 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#35 0xb7f3ab11 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#36 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#37 0xb7f3ab11 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#38 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#39 0xb7f3ab11 in PyEval_EvalFrameEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#40 0xb7f3c8e1 in PyEval_EvalCodeEx () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#41 0xb7f3ca34 in PyEval_EvalCode () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#42 0xb7f5659c in run_mod () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#43 0xb7f57483 in PyRun_FileExFlags () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#44 0xb7f5815c in PyRun_SimpleFileExFlags () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#45 0xb7f58cf2 in PyRun_AnyFileExFlags () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#46 0xb7f6a35f in Py_Main () from /storage/strogdon/gentoo/usr/lib/libpython2.7.so.1.0
#47 0x0804888b in main ()

comment:33 in reply to: ↑ 27 ; follow-up: Changed 8 years ago by malb

Replying to vbraun:

Hi, first: thanks for cleaning up the mess that no-doubt I made.

I had similar problems, but I thought I fixed them and it did pass tests on my machine. In my experience, they always go away if we leak the ring (which is essentially what we did before) instead of cleaning up after we are done with it.

Ah, this might also explain #5949 then.

I have a suspicion that, although almost all take a ring as argument, some Singular functions require currRing to be set.

The API is as follows: pXXX does assume the currRing to be set, whereas p_XXX does not. However, sometimes there are bugs, i.e. you call p_XXX and it crashes unless currRing is set.

This is why I added the inlined _check_ring() function to each MPolynomial_libsingular method to make sure that we have the right currRing.

TBH, I don't like this blanket approach, did you run into specific problems? If so, I'd add a check to those functions instead of hiding bugs by forcing the ring.

comment:34 in reply to: ↑ 33 Changed 8 years ago by vbraun

Replying to malb:

The API is as follows: pXXX does assume the currRing to be set, whereas p_XXX does not. However, sometimes there are bugs, i.e. you call p_XXX and it crashes unless currRing is set.

I guessed that much, but a documented list of exceptions to that rule might be nice :-)

TBH, I don't like this blanket approach, did you run into specific problems? If so, I'd add a check to those functions instead of hiding bugs by forcing the ring.

In principle I agree, but its probably more productive to first produce a working version and then see if it still works if we remove the blanket rChangeCurrRing. Right now I'm not sure if that is the underlying problem. I didn't find any 100% reliable testcase but in some cases adding the extra ring switch fixed segfaults, but only to reappear in other places.

comment:35 follow-up: Changed 8 years ago by vbraun

  • Status changed from needs_info to needs_review

The updated polynomials patch fixes another place where potentially the wrong singular ring was used, which would explain the data corruption. Please give it a try!

I'm pretty sure that p_Normalize requires currRing to be set. In the polynomials destructor, I have

    ### If you suspect that the poly data was corrupted, then this is a good way to trigger a segfault:
    rChangeCurrRing(self._parent_ring)
    p_Normalize(self._poly, self._parent_ring)

This is useful for debugging because p_Normalize will touch all coefficients of the polynomial. The destructor is typically called long after the actual computations so it is very likely that the currRing has changed by then. If I reverse the order of the two lines, then I get segfaults all over the Sage library. However, because it relies on the finalization order, they are usually in varying places and often disappear if you just paste the doctest into Sage.

comment:36 Changed 8 years ago by fbissey

It seems to have taken care of my problem. Thanks Volker. Now we should wait to see if it did something for Steve.

comment:37 in reply to: ↑ 35 Changed 8 years ago by strogdon

Replying to vbraun:

The updated polynomials patch fixes another place where potentially the wrong singular ring was used, which would explain the data corruption. Please give it a try!

My vanilla sage-4.7.1.alpha2 has no segfaults or Killed/crashed tests with the new patch. The patch seems to cure the doctest failure (option.pyx) I had on my sage-on-gentoo install on x86. However on amd64 (sage-on-gentoo) I have a Segmentation fault with the doctest

sage -t -long  -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_curve_isogeny.py

I'll attach files of the Segmentation fault and debug/backtrace.

Changed 8 years ago by strogdon

Segmentation fault associated with

Changed 8 years ago by strogdon

ell_curve_isogeny.py debug/backtrace

comment:38 Changed 8 years ago by leif

  • Cc leif added

comment:39 Changed 8 years ago by vbraun

  • Status changed from needs_review to needs_info

I built a sage spkg with Singular's polynomials debugging turned on (PDEBUG=2) and now Singular spits out tons of memory allocation errors during Sage startup (turns out Sage does some high-degree polynomial computations during startup). Can somebody who understands Singular better comment on whether those are real errors?

http://www.stp.dias.ie/~vbraun/Sage/spkg/singularDEBUG-3-1-1-4.p9.spkg

Note that its not as easy as turning on SAGE_DEBUG; the Singular spkg-install does set a configure option in that case but its not doing anything. I also had to fix up an instance where an #ifdef PDEBUG replaced a libSingular function by a macro.

comment:41 Changed 8 years ago by vbraun

The omalloc error gets triggered by fraction field conversions. If you take out some stuff from ell_curve_isogeny.py then you can start up Sage just fine. I'll attach a patch for debugging purposes.

The omalloc errors appear with or without my patches on this ticket, by the way.

Changed 8 years ago by vbraun

Remove some initialization form ell_curve_isogeny.py for debugging.

comment:42 Changed 8 years ago by fbissey

Just for the record while I am working on stuff. It seems there are no problems on amd64 all the problems seem to arise on the x86 side. So I upgraded to 4.7.1.alpha3 and ran the long tests on my 2003 x86 box. 16 hours later I have three doctests crashes that may be related (currently setting proper debugging on that box) and i seem to have lost the test.log :P - just great. Anyway I'll have a look at the extra debugging in libsingular. If there is a problem with omalloc could we switch to the system malloc? I think there is an option for that (whether it is working or not is another matter).

comment:43 Changed 8 years ago by fbissey

One was: sage/modular/modform/element.py, now that I have started to include debugging info it seems to behave, may be it was the load.

comment:44 follow-up: Changed 8 years ago by fbissey

OK back to singular problems. On my OS X (10.5.8 32bits) box I got another test crash:

sage -t -long  -force_lib devel/sage-main/sage/rings/polynomial/toy_buchberger.py

and the following backtrace

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x0000001a
0x062ad4e5 in nlNormalize ()
(gdb) bt
#0  0x062ad4e5 in nlNormalize ()
#1  0x062ad78b in nlInt ()
#2  0x06cdcca6 in __pyx_f_4sage_4libs_8singular_8singular_si2sa ()
#3  0x06bb7f07 in __pyx_f_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular__hash_c ()
#4  0x06b5c263 in __pyx_pf_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular_6__hash__ ()
#5  0x00203ee4 in set_add_key ()
#6  0x00203f39 in set_add ()
#7  0x00255e2b in PyEval_EvalFrameEx ()

This is with 4.7.1.alpha4. I'll try again on the linux x86 box this weekend (and not lose the logs this time). It may be the last test I run on that box for a while.

comment:45 in reply to: ↑ 44 Changed 8 years ago by strogdon

Replying to fbissey:

OK back to singular problems. On my OS X (10.5.8 32bits) box I got another test crash: sage -t -long -force_lib devel/sage-main/sage/rings/polynomial/toy_buchberger.py

I get the same failure on amd64 (vanilla sage-4.7.1.alpha4) but with a slightly different backtrace. Here is the beginning of the backtrace:

Program received signal SIGSEGV, Segmentation fault.
nlNormalize (x=@0x7fffffffac68) at longrat.cc:1221
1221    longrat.cc: No such file or directory.
        in longrat.cc
(gdb) bt
#0  nlNormalize (x=@0x7fffffffac68) at longrat.cc:1221
#1  0x00007fffdb2caafe in nlInt (i=@0x7fffffffac68, r=<value optimized out>)
    at longrat.cc:494
#2  0x00007fffda13f4f1 in __pyx_f_4sage_4libs_8singular_8singular_si2sa (
    __pyx_v_n=0x2, __pyx_v__ring=0x7fffdb0a3760, __pyx_v_base=0x4964000)
    at sage/libs/singular/singular.cpp:5508
#3  0x00007fffdabf216a in __pyx_f_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular__hash_c (__pyx_v_self=<value optimized out>)
    at sage/rings/polynomial/multi_polynomial_libsingular.cpp:18417
#4  0x00007fffdabdc66b in __pyx_pf_4sage_5rings_10polynomial_28multi_polynomial_libsingular_23MPolynomial_libsingular_6__hash__ (__pyx_v_self=<value optimized out>)
    at sage/rings/polynomial/multi_polynomial_libsingular.cpp:14965
#5  0x00007ffff7a8dde8 in set_add_key (so=0x483b138, key=0x495b9b0)
    at Objects/setobject.c:386
#6  0x00007ffff7a8de39 in set_add (so=<value optimized out>, 
    key=<value optimized out>) at Objects/setobject.c:1837
#7  0x00007ffff7af0fa4 in call_function (oparg=<value optimized out>, 
    pp_stack=0x7fffffffae50) at Python/ceval.c:4000
#8  PyEval_EvalFrameEx (f=<value optimized out>, throwflag=<value optimized out>)
    at Python/ceval.c:2665
#9  0x00007ffff7af1824 in fast_function (nk=<value optimized out>, 
    na=<value optimized out>, n=<value optimized out>, pp_stack=0x7fffffffafc0, 
    func=0x152da28) at Python/ceval.c:4098

I get no crashes when running the testsuite on x86.

comment:46 Changed 8 years ago by fbissey

yes you have more details. Traces with gdb on OS X is just difficult. But I don't get that one on any of my two amd64 boxes. I will try that particular test on x86.

comment:47 Changed 8 years ago by fbissey

i get the failure in sage/schemes/elliptic_curves/ell_curve_isogeny.py on my x86 box same backtrace as comment:35 - sage/modular/modform/element.py still seem to play up, it crashes when I run sage -testall but not when I run it individually. Note that I don't parallel testing on that machine. Wait a minute it crashed after I added -long.

sage -t -long -force_lib "devel/sage/sage/modular/modform/element.py"
The doctested process was killed by signal 8
         [62.7 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t -long -force_lib "devel/sage/sage/modular/modform/element.py" # Killed/crashed
Total time for all tests: 62.7 seconds

Unfortunately it disappear again when in gdb:

francois@vrooom ~ $ sage -t -long -force_lib "devel/sage/sage/modular/modform/element.py" -gdb
sage -t -long -force_lib -gdb "devel/sage/sage/modular/modform/element.py"
********************************************************************************
Type r at the (gdb) prompt to run the doctests.
Type bt if there is a crash to see a traceback.
********************************************************************************
GNU gdb (Gentoo 7.2 p1) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/python /home/francois/.sage/tmp/.doctest_element.py
process 27984 is executing new program: /usr/bin/python2.7
[Thread debugging using libthread_db enabled]
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/gcc/i686-pc-linux-gnu/4.5.2/libstdc++.so.6.0.14-gdb.py", line 59, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named libstdcxx.v6.printers

Program exited normally.

comment:48 Changed 8 years ago by fbissey

Disregard sage/modular/modform/element.py. Our old friend ATLAS-3.9.xx is responsible for that one. With -verbose:

Trying:
    f = ModularForms(Gamma1(Integer(3)), Integer(7)).gen(0)###line 1016:_sage_    >>> f = ModularForms(Gamma1(3), 7).0
Expecting nothing
ok
Trying:
    f*f###line 1017:_sage_    >>> f*f
Expecting:
    q^2 - 54*q^4 + 128*q^5 + O(q^6)
/usr/lib/libcsage.so(print_backtrace+0x40)[0xb6e35acf]
/usr/lib/libcsage.so(sigdie+0x22)[0xb6e35b92]
/usr/lib/libcsage.so(sage_signal_handler+0x1a3)[0xb6e35d84]
[0xb77cc400]
/usr/lib/libgmp.so.3(__gmpz_set_d+0x49)[0xb689a2b9]
/usr/lib/libiml.so.0(ChineseRemainder+0x1af)[0xb3055b1f]

------------------------------------------------------------------------
Unhandled SIGFPE: An unhandled floating point exception occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
The doctested process was killed by signal 8

comment:49 follow-up: Changed 8 years ago by vbraun

Something is wrong with atlas-3.9.x and linbox.

I built libSingular with debugging turned on, but that shows lots and lots of errors. I'm not sure if all of them are actual bugs, but I think its likely that they are the origin of the segfaults. See https://groups.google.com/d/msg/libsingular-devel/cfzQEgjxB8o/dvb2Efc-7pIJ

comment:50 in reply to: ↑ 49 Changed 8 years ago by fbissey

Replying to vbraun:

Something is wrong with atlas-3.9.x and linbox.

I know about that. I have known about a similar problem with iml for longer https://github.com/cschwan/sage-on-gentoo/issues/3. Both iml and linbox fail after calling a routine called (or including the words) ChineseRemainder? which uses results from calls to cblas_dgemv and cblas_dgemm but this comment really belongs to #10508.

I built libSingular with debugging turned on, but that shows lots and lots of errors. I'm not sure if all of them are actual bugs, but I think its likely that they are the origin of the segfaults. See https://groups.google.com/d/msg/libsingular-devel/cfzQEgjxB8o/dvb2Efc-7pIJ

I know this is a slow time of the year in the northern emisphere (starting semester two tomorrow down under) but nothing since June 27th? May be the thread should be bumped or something.

comment:51 Changed 8 years ago by fbissey

Still got the failure schemes/elliptic_curves/ell_curve_isogeny.py on amd64 with 4.7.2_alpha0. That's the only killed doctest so far on this set up.

comment:52 follow-up: Changed 8 years ago by strogdon

Here's an update since 4.7.2.alpha1 has been released. refcount_singular_polynomials.patch applies just fine but there is fuzz with refcount_singular_rings.patch

applying /storage/sage/patches/trac_11339_refcount_singular_rings.patch

patching file sage/libs/singular/ring.pyx

Hunk #1 succeeded at 57 with fuzz 2 (offset 5 lines).

I get the following failures (killed/crashed):

x86:

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py

amd64:

sage -t -long -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_number_field.py

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/sr.py

comment:53 in reply to: ↑ 52 ; follow-up: Changed 8 years ago by fbissey

Replying to strogdon:

Here's an update since 4.7.2.alpha1 has been released. refcount_singular_polynomials.patch applies just fine but there is fuzz with refcount_singular_rings.patch

applying /storage/sage/patches/trac_11339_refcount_singular_rings.patch

patching file sage/libs/singular/ring.pyx

Hunk #1 succeeded at 57 with fuzz 2 (offset 5 lines).

I get the following failures (killed/crashed):

x86:

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py

amd64:

sage -t -long -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_number_field.py

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py

sage -t -long -force_lib devel/sage-main/sage/crypto/mq/sr.py

Vanilla or sage-on-gentoo (sog) variety? So far for sog I still only have the sage/schemes/elliptic_curves/ell_curve_isogeny.py crash but nothing else on amd64.

My linux-x86 box is "decommissioned", I'll look on x86 OS X next.

Haven't had time to try vanilla.

comment:54 in reply to: ↑ 53 Changed 8 years ago by strogdon

Replying to fbissey:

Replying to strogdon:

Here's an update since 4.7.2.alpha1 has been released. refcount_singular_polynomials.patch applies just fine but there is fuzz with refcount_singular_rings.patch applying /storage/sage/patches/trac_11339_refcount_singular_rings.patch patching file sage/libs/singular/ring.pyx Hunk #1 succeeded at 57 with fuzz 2 (offset 5 lines). I get the following failures (killed/crashed): x86: sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py amd64: sage -t -long -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_number_field.py sage -t -long -force_lib devel/sage-main/sage/crypto/mq/mpolynomialsystem.py sage -t -long -force_lib devel/sage-main/sage/crypto/mq/sr.py

Vanilla or sage-on-gentoo (sog) variety? So far for sog I still only have the sage/schemes/elliptic_curves/ell_curve_isogeny.py crash but nothing else on amd64. My linux-x86 box is "decommissioned", I'll look on x86 OS X next. Haven't had time to try vanilla.

This is for vanilla Sage. I've just started on sage-on-gentoo builds.

comment:55 Changed 8 years ago by fbissey

trac_11339_refcount_singular_polynomials.patch doesn't apply on top of 4.7.2.alpha2, I believe the conflict is with #7654 but I could be wrong.

comment:56 follow-up: Changed 8 years ago by leif

  • Status changed from needs_info to needs_review
  • Work issues set to Rebase on Sage 4.7.2.alpha2.
leif@californication:~/Sage/sage-4.7.2.alpha2-gcc-4.5.1/devel/sage-9958-11339$ ../../sage -hg import -v ~/Sage/patches/trac_11339_refcount_singular_polynomials.patch 
applying /home/leif/Sage/patches/trac_11339_refcount_singular_polynomials.patch
patching file sage/rings/polynomial/multi_polynomial_libsingular.pxd
patching file sage/rings/polynomial/multi_polynomial_libsingular.pyx
Hunk #13 FAILED at 727
Hunk #14 FAILED at 741
Hunk #15 FAILED at 763
Hunk #16 FAILED at 773
Hunk #17 FAILED at 825
Hunk #18 succeeded at 910 with fuzz 2 (offset 72 lines).
Hunk #19 FAILED at 875
Hunk #20 succeeded at 961 (offset 75 lines).
Hunk #21 succeeded at 980 (offset 75 lines).
Hunk #22 succeeded at 1006 (offset 75 lines).
...

(The first patch still applies, with offsets.)

I'm not going to fix that...

comment:57 Changed 8 years ago by leif

  • Status changed from needs_review to needs_work

Trac, why can't I switch from "needs info" to "needs work"?!?

comment:58 in reply to: ↑ 56 Changed 8 years ago by leif

Replying to leif:

leif@californication:~/Sage/sage-4.7.2.alpha2-gcc-4.5.1/devel/sage-9958-11339$ ../../sage -hg import -v ~/Sage/patches/trac_11339_refcount_singular_polynomials.patch 
applying /home/leif/Sage/patches/trac_11339_refcount_singular_polynomials.patch
patching file sage/rings/polynomial/multi_polynomial_libsingular.pxd
patching file sage/rings/polynomial/multi_polynomial_libsingular.pyx
Hunk #13 FAILED at 727
Hunk #14 FAILED at 741
Hunk #15 FAILED at 763
Hunk #16 FAILED at 773
Hunk #17 FAILED at 825
Hunk #18 succeeded at 910 with fuzz 2 (offset 72 lines).
Hunk #19 FAILED at 875
...

That's due to #7654. I just wonder why, according to trac, Martin attached (or updated) the patch 11 days ago, but:

leif@californication:~/Sage/sage-4.7.2.alpha2-gcc-4.5.1/devel/sage-9958-11339$ ../../sage -hg log sage/rings/polynomial/multi_polynomial_libsingular.pyx
changeset:   16029:f8df02b7396c
user:        Martin Albrecht <martinralbrecht@googlemail.com>
date:        Sat Jun 25 15:51:22 2011 +0100
summary:     Trac #7654: #7654 fix MPolynomial_libsingular.__call__ to accept more inputs 
and fail gracefully

comment:59 follow-up: Changed 8 years ago by leif

I've verified that (except for the meta data, including the dates, and one line number) the patch attached to #7654 and the one Jeroen merged into Sage 4.7.2.alpha2 are the same. Still weird...

comment:60 in reply to: ↑ 59 Changed 8 years ago by fbissey

Replying to leif:

I've verified that (except for the meta data, including the dates, and one line number) the patch attached to #7654 and the one Jeroen merged into Sage 4.7.2.alpha2 are the same. Still weird...

The line number is weird, otherwise doesn't Jeroen merge script add stuff to make sure there is proper metadata in the patch?

Note also that we are playing a "wack a mole" game here. You fix things in one place and suddenly you get a new failure in another. So refreshing Volker's patch is just the beginning....

Changed 8 years ago by vbraun

Updated patch

Changed 8 years ago by vbraun

Updated patch

comment:61 Changed 8 years ago by vbraun

I've rebased the patches on top of sage-4.7.2.alpha2. There are still a few segfaults from deleting rings, but at least we can reproduce them with the patches ;-)

comment:62 Changed 8 years ago by leif

  • Priority changed from major to critical

Something for SD34? :P

comment:63 Changed 8 years ago by strogdon

Aside from the segfaults, with the new patches I get the following additional failurers. The failures "seem" to be associated with the patches in that I don't get them when I use Martin's patches.

amd64:

sage -t -long  -force_lib devel/sage-trans_16037.16051/sage/misc/sageinspect.py
sage -t -long  -force_lib devel/sage-trans_16037.16051/sage/schemes/plane_conics/con_field.py
sage -t -long  -force_lib devel/sage-trans_16037.16051/sage/schemes/hyperelliptic_curves/hyperelliptic_finite_field.py

x86:

sage -t -long  -force_lib devel/sage-trans_16037.16051/sage/misc/sageinspect.py
sage -t -long  -force_lib devel/sage-trans_16037.16051/sage/schemes/plane_conics/con_field.py

Here is a brief summary of the failures.

sageinspect.py (partial output given):

Expected:
    (['cdef class MPolynomialRing_libsingular(MPolynomialRing_generic):\n',
      "    def __init__(self, base_ring, n, names, order='degrevlex'):\n",
    ...
      '          M.append(new_MP(self, p_Copy(tempvector,_ring)))\n',
      '          return M\n'], ...)
Got:
    (['cdef class MPolynomialRing_libsingular(MPolynomialRing_generic):\n', '\n', '    def
 __cinit__(self):\n', '

con_field.py:

File "/storage/sage/sage-4.7.2.alpha2/devel/sage-trans_16037.16051/sage/schemes/plane_conics/con_field.py", line 234:
    sage: C.is_smooth()
Expected:
    True
Got:
    False

hyperelliptic_curves/hyperelliptic_finite_field.py:

File "/storage/sage/sage-4.7.2.alpha2/devel/sage-trans_16037.16051/sage/schemes/hyperelliptic_curves/hyperelliptic_finite_field.py", line 269:
    sage: set(C._points_cache_sqrt()) == set(C._points_cache_sqrt(brute_force=True))
Exception raised:
    Traceback (most recent call last):
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_5[5]>", line 1, in <module>
        set(C._points_cache_sqrt()) == set(C._points_cache_sqrt(brute_force=True))###line 269:
    sage: set(C._points_cache_sqrt()) == set(C._points_cache_sqrt(brute_force=True))
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/hyperelliptic_curves/hyperelliptic_finite_field.py", line 296, in _points_cache_sqrt
        points.append(self.point([x, -y, one], check=True))
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/generic/scheme.py", line 256, in point
        return self._point_class(self, v, check=check)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/generic/algebraic_scheme.py", line 528, in _point_class
        return self.__A._point_class(*args, **kwds)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/generic/projective_space.py", line 561, in _point_class
        return morphism.SchemeMorphism_projective_coordinates_field(*args, **kwds)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/generic/morphism.py", line 676, in __init__
        X.codomain()._check_satisfies_equations(v)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/generic/algebraic_scheme.py", line 902, in _check_satisfies_equations
        raise TypeError, "Coordinates %s do not define a point on %s"%(coords,self)
    TypeError: Coordinates [2, 2, 1] do not define a point on Hyperelliptic Curve over Finite Field of size 7 defined by y^2 = x^3 + x^2 + 6

comment:64 Changed 8 years ago by fbissey

I have seen the error on sageinspect.py before. Did you send it to me privately? I cannot seem to find where it was mentioned in inbox or my logs. You get cython code as output while the test is expecting c++ code to be returned.

comment:65 follow-up: Changed 8 years ago by fbissey

Still about sageinspect.py, this is stuff added in #11298, did you apply #11734 to your branch?

My own list of stuff that seem singular related is:

	sage -t -long  -force_lib devel/sage-main/sage/crypto/mq/sr.py # Killed/crashed
	sage -t -long  -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_curve_isogeny.py # Killed/crashed
	sage -t -long  -force_lib devel/sage-main/sage/schemes/elliptic_curves/ell_generic.py # Killed/crashed
	sage -t -long  -force_lib devel/sage-main/sage/rings/polynomial/multi_polynomial_ideal.py # Killed/crashed
	sage -t -long  -force_lib devel/sage-main/sage/schemes/generic/algebraic_scheme.py # Killed/crashed
        sage -t -long  -force_lib devel/sage-main/sage/schemes/plane_conics/con_field.py # 1 doctests failed

comment:66 Changed 8 years ago by fbissey

Forgot

	sage -t -long  -force_lib devel/sage-main/sage/schemes/plane_conics/con_finite_field.py # Killed/crashed

comment:67 in reply to: ↑ 65 ; follow-up: Changed 8 years ago by strogdon

Replying to fbissey:

Still about sageinspect.py, this is stuff added in #11298, did you apply #11734 to your branch? My own list of stuff that seem singular related is:

No I hadn't applied that patch to my branch. However, after doing so sageinspect.py fails the same way on both x86 and amd64. Now here on sage-on-gentoo, where #11734 has been applied, sageinspect.py fails on amd64 but passed on x86 with the new refcount patches. This probably doesn't belong with this ticket even though there appears to be some connection.

comment:68 in reply to: ↑ 67 Changed 8 years ago by strogdon

Replying to strogdon:

Replying to fbissey:

Still about sageinspect.py, this is stuff added in #11298, did you apply #11734 to your branch? My own list of stuff that seem singular related is:

No I hadn't applied that patch to my branch. However, after doing so sageinspect.py fails the same way on both x86 and amd64. Now here on sage-on-gentoo, where #11734 has been applied, sageinspect.py fails on amd64 but passed on x86 with the new refcount patches. This probably doesn't belong with this ticket even though there appears to be some connection.

I was mistaken, sageinspect.py also fails here on x86 [sage-on-gentoo] with the new refcount patches and the patch from #11734.

comment:69 Changed 8 years ago by fbissey

I pushed Volker's latest patches for 4.7.2.alpha2 only 2 hours ago, before that I had reverted to Martin's initial patch. So mistake was easy.

comment:70 Changed 8 years ago by malb

I fixed (most?) of the issues discussed here over at #10903. Unfortunately, I mixed my fixes up with the general update to Singular 3-1-4 (not yet released). In any case, the two tickets are effectively merged. My bad!

comment:71 follow-up: Changed 8 years ago by malb

Patches look good though, so I'll give this a positive review once we fixed all the bugs uncovered by them in #10902.

comment:72 in reply to: ↑ 71 ; follow-up: Changed 8 years ago by malb

Replying to malb:

Patches look good though, so I'll give this a positive review once we fixed all the bugs uncovered by them in #10902.

#10903!

comment:73 in reply to: ↑ 72 Changed 8 years ago by strogdon

Replying to malb:

Replying to malb:

Patches look good though, so I'll give this a positive review once we fixed all the bugs uncovered by them in #10902.

#10903!

With the singular SPKG and the patches (both of them) from #10903 I get no segfaults for the long tests on both x86 and amd64. However, there is one test on x86 that usually takes a very long time that now fails. It may be singular-related.

sage -t -long  -force_lib devel/sage-singular.16054/sage/schemes/elliptic_curves/ell_rational_field.py
  ***   Warning: new stack size = 1003360 (0.957 Mbytes).
  ***   Warning: new stack size = 1003360 (0.957 Mbytes).
Saturation index bound = 265
WARNING: saturation at primes p > 97 will not be done;
points may be unsaturated at primes between 97 and index bound
Failed to saturate MW basis at primes [ ]
Saturation index bound = 265
WARNING: saturation at primes p > 199 will not be done;
points may be unsaturated at primes between 199 and index bound
Failed to saturate MW basis at primes [ ]
Saturation index bound = 265
WARNING: saturation at primes p > 97 will not be done;
points may be unsaturated at primes between 97 and index bound
Failed to saturate MW basis at primes [ ]
Saturation index bound = 265
WARNING: saturation at primes p > 199 will not be done;
points may be unsaturated at primes between 199 and index bound
Failed to saturate MW basis at primes [ ]
After 10 attempts at enlargement, giving up!
--points not proved 2-saturated,
2-saturation failed!
Failed to saturate MW basis at primes [ 2 ]
2 After 10 attempts at enlargement, giving up!
--points not proved 2-saturated,
2-saturation failed!
Failed to saturate MW basis at primes [ 2 ]
2 **********************************************************************
File "/storage/sage/sage-4.7.2.alpha2/devel/sage-singular.16054/sage/schemes/elliptic_curves/ell_rational_field.py", line 3884:
    sage: E1.isogeny_class(algorithm="sage")
Exception raised:
    Traceback (most recent call last):
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/storage/sage/sage-4.7.2.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_80[16]>", line 1, in <module>
        E1.isogeny_class(algorithm="sage")###line 3884:
    sage: E1.isogeny_class(algorithm="sage")
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_rational_field.py", line 3954, in isogeny_class
        isogs = E.isogenies_prime_degree(l_list)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_rational_field.py", line 4054, in isogenies_prime_degree
        return isogenies_sporadic_Q(self)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_curve_isogeny.py", line 4179, in isogenies_sporadic_Q
        return isogenies_sporadic_Q(E, Integer(163))
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_curve_isogeny.py", line 4162, in isogenies_sporadic_Q
        isog = Ew.isogeny(kernel=ker, degree=l, model="minimal", check=False)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_field.py", line 806, in isogeny
        return EllipticCurveIsogeny(self, kernel, codomain, degree, model, check=check)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_curve_isogeny.py", line 914, in __init__
        self.__init_from_kernel_polynomial(kernel, degree)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_curve_isogeny.py", line 2069, in __init_from_kernel_polynomial
        (phi, omega, v, w, n, d) = self.__init_odd_kernel_polynomial(E, psi)
      File "/storage/sage/sage-4.7.2.alpha2/local/lib/python/site-packages/sage/schemes/elliptic_curves/ell_curve_isogeny.py", line 2257, in __init_odd_kernel_polynomial
        psi_coeffs = psi.univariate_polynomial().list()
      File "multi_polynomial_libsingular.pyx", line 3369, in sage.rings.polynomial.multi_polynomial_libsingular.MPolynomial_libsingular.univariate_polynomial (sage/rings/polynomial/multi_polynomial_libsingular.cpp:22055)
    IndexError: list assignment index out of range

comment:74 Changed 8 years ago by malb

It looks singular related but it doesn't immediately ring a bell. I'd guess some strategically placed rChangeCurrRing would fix the issue, but the problem is to find where to put it.

comment:75 Changed 8 years ago by vbraun

  • Description modified (diff)
  • Status changed from needs_work to needs_review
  • Summary changed from Illegal use of __deallocate__ in cython (pyx) code to Refcounting for Cython rings

I am setting this bug to Needs Review. The two patches here will correctly refcount the Singular rings, and the reviewer should focus on that aspect. They also make Sage crash because of issues in the libSingular interface. The latter will be dealt with in #10903. Please report any crashes in Sage on #10903, and not here.

comment:76 Changed 8 years ago by vbraun

  • Summary changed from Refcounting for Cython rings to Refcounting for Singular rings

comment:77 Changed 8 years ago by malb

  • Reviewers changed from François Bissey, Steven Trogdon to François Bissey, Steven Trogdon, Martin Albrecht
  • Status changed from needs_review to positive_review

I read the patch and we're reasonably certain we caught all bugs these patches uncovered (not introduced!), so: positive review.

comment:78 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_work
  • Work issues changed from Rebase on Sage 4.7.2.alpha2. to Conflict with #11856

This seems to conflict with #11856.

comment:79 follow-up: Changed 8 years ago by SimonKing

  • Status changed from needs_work to positive_review

It seems that it will be easier to rebase my patch from #11892. So, I reinstate the positive review and will change #11892.

comment:80 in reply to: ↑ 79 Changed 8 years ago by SimonKing

Replying to SimonKing:

It seems that it will be easier to rebase my patch from #11892. So, I reinstate the positive review and will change #11892.

Sorry, I meant to change #11856...

comment:81 Changed 8 years ago by jdemeyer

  • Work issues Conflict with #11856 deleted

comment:82 Changed 8 years ago by SimonKing

For the record: The conflicts with #11856 and also another conflict with #11068 were caused by cosmetic changes introduced by the first patch. Both are resolved now.

comment:83 Changed 8 years ago by SimonKing

It is amazing how many patches of mine fail for trivial reasons because of this ticket. E.g., introspection: One of my examples for introspection of code fails, because __cinint__ was introduced. So, in addition to #11856 and #11068, I'll have to change at least a third patch as well.

comment:84 Changed 8 years ago by SimonKing

  • Status changed from positive_review to needs_work

You should run the tests with your patches applied on top of sage-4.7.2.alpha3. At least with the alpha3-prerelease, I get some doctest errors, for example sage -t "devel/sage-main/sage/misc/sageinspect.py".

So, it seems it would have been better to change this patch, not #11856 and #11068. Too bad.

comment:85 follow-ups: Changed 8 years ago by vbraun

  • Status changed from needs_work to needs_review

The doctest failure in sage/misc/sageinspect (would have been nice if it would have used its own module as a doctest example btw) is fixed in #10903. There are no doctest failures if you use sage-4.7.2.alpha3 + #11339 + #10903. Really the two tickets should be one but thats not how it panned out historically.

comment:86 in reply to: ↑ 85 Changed 8 years ago by leif

Replying to vbraun:

The doctest failure in sage/misc/sageinspect (would have been nice if it would have used its own module as a doctest example btw) is fixed in #10903. There are no doctest failures if you use sage-4.7.2.alpha3 + #11339 + #10903. Really the two tickets should be one but thats not how it panned out historically.

So we have a cyclic dependency now...

Somehow matches the ticket's title.

comment:87 Changed 8 years ago by SimonKing

Question to the release manager: How are the odds that both #11339/#10903 and #11856 (which both fix critical bugs) will be merged into sage-4.7.2?

If #11339/#10903 are ready to be merged then I guess one can easily modify #11856 so that it matches (it is just a different line break in one hunk). And #11068 was shifted to sage-4.7.3 anyway, so, it can wait.

comment:88 in reply to: ↑ 85 Changed 8 years ago by SimonKing

Replying to vbraun:

The doctest failure in sage/misc/sageinspect (would have been nice if it would have used its own module as a doctest example btw) ...

The point was to have a cdef class, but sageinspect is pure Python, so it wouldn't have been a good example.

comment:89 follow-up: Changed 8 years ago by vbraun

Jeroen said that Singular will go into sage-4.7.2 on sage-release. This should be #11339 + #10903 + #11856, preferably in that order.

comment:90 in reply to: ↑ 89 ; follow-up: Changed 8 years ago by jdemeyer

Replying to vbraun:

Jeroen said that Singular will go into sage-4.7.2 on sage-release. This should be #11339 + #10903 + #11856, preferably in that order.

Exactly. Just make sure you clearly state the order in which patches have to be applied. The "cyclic" dependency is not really a problem as these tickets will be merged together anyway.

comment:91 in reply to: ↑ 90 Changed 8 years ago by leif

Replying to jdemeyer:

The "cyclic" dependency is not really a problem as these tickets will be merged together anyway.

I'll implement merging strongly connected components... ;-)

comment:92 Changed 8 years ago by SimonKing

  • Status changed from needs_review to positive_review

Great, #11339 together with #10903 really seem to fix it. So, let it be a positive review, then. I will rebase #11856 now, and #11068 can wait a little.

comment:93 Changed 8 years ago by SimonKing

For the record: I made #11856 depend on #10903. So, it should all be good now...

comment:94 Changed 8 years ago by jdemeyer

  • Dependencies set to #10903 (for doctests)

comment:95 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.7.2.alpha4
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.