Opened 10 years ago
Closed 6 years ago
#11814 closed defect (fixed)
Catch and fix the segmentation fault in dlx_solver
Reported by:  jdemeyer  Owned by:  sagecombinat 

Priority:  critical  Milestone:  sage6.8 
Component:  combinatorics  Keywords:  
Cc:  slabbe  Merged in:  
Authors:  Sébastien Labbé  Reviewers:  Jeroen Demeyer 
Report Upstream:  N/A  Work issues:  
Branch:  dcdfd06 (Commits, GitHub, GitLab)  Commit:  dcdfd06c1e924a72a49cf0c13929930ddcb61e08 
Dependencies:  Stopgaps: 
Description (last modified by )
On a 64bit Gentoo Linux system running sage4.7.2.alpha2:
sage: from sage.combinat.matrices.dancing_links import dlx_solver sage: rows = [] sage: x = dlx_solver(rows) sage: x.search() /usr/local/src/sage4.7.2.alpha2/local/lib/libcsage.so(print_backtrace+0x31)[0x7f6c37015e02] /usr/local/src/sage4.7.2.alpha2/local/lib/libcsage.so(sigdie+0x14)[0x7f6c37015e34] /usr/local/src/sage4.7.2.alpha2/local/lib/libcsage.so(sage_signal_handler+0x20c)[0x7f6c37015a82] /lib/libpthread.so.0(+0xf400)[0x7f6c3c26e400] /usr/local/src/sage4.7.2.alpha2/local/lib/python2.6/sitepackages/sage/combinat/matrices/dancing_links.so(+0x738b)[0x7f6c1612f38b] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x4c32)[0x7f6c3c55d842] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCode+0x32)[0x7f6c3c55f912] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x4ccb)[0x7f6c3c55d8db] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5364)[0x7f6c3c55df74] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5364)[0x7f6c3c55df74] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5a8a)[0x7f6c3c55e69a] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5364)[0x7f6c3c55df74] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5364)[0x7f6c3c55df74] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x5364)[0x7f6c3c55df74] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x879)[0x7f6c3c55f819] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyEval_EvalCode+0x32)[0x7f6c3c55f912] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyRun_FileExFlags+0xb0)[0x7f6c3c581d20] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(PyRun_SimpleFileExFlags+0xdf)[0x7f6c3c58275f] /usr/local/src/sage4.7.2.alpha2/local/lib/libpython2.6.so.1.0(Py_Main+0xb23)[0x7f6c3c58fa23] /lib/libc.so.6(__libc_start_main+0xe6)[0x7f6c3b897ba6] python[0x4006e9]  Unhandled SIGSEGV: A segmentation fault occurred in Sage. This probably occurred because a *compiled* component of Sage has a bug in it and is not properly wrapped with sig_on(), sig_off(). You might want to run Sage under gdb with 'sage gdb' to debug this. Sage will now terminate. 
Note that this behaviour is actually documented in sage/combinat/tiling.py
:
def is_suitable(self): r""" Return whether the volume of the box is equal to sum of the volume of the polyominoes and the number of rows sent to the DLX solver is larger than zero. If these conditions are not verified, then the problem is not suitable in the sense that there are no solution. .. NOTE:: The DLX solver throws a Segmentation Fault when the number of rows is zero:: sage: from sage.combinat.matrices.dancing_links import dlx_solver sage: rows = [] sage: x = dlx_solver(rows) sage: x.search() # not tested BOOM !!!
For sageongentoo, there is an assertion failure:
sage: x = dlx_solver(rows) python2.7: sage/combinat/matrices/dancing_links_c.h:217: void dancing_links::setup_columns(): Assertion `nr_columns > 0' failed. /usr/lib64/libcsage.so(print_backtrace+0x24)[0x7fea73a945f7] /usr/lib64/libcsage.so(sigdie+0x1d)[0x7fea73a94687] /usr/lib64/libcsage.so(sage_signal_handler+0x157)[0x7fea73a94822] /lib64/libpthread.so.0(+0xfee0)[0x7fea79206ee0] /lib64/libc.so.6(gsignal+0x35)[0x7fea78ea5ee5] /lib64/libc.so.6(abort+0x186)[0x7fea78ea7896] /lib64/libc.so.6(__assert_fail+0xf5)[0x7fea78e9e7a5] /usr/lib64/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x855d)[0x7fea4e73255d]
Additionally, we can segfault with a different kind of trivial problem,
sage: from sage.combinat.matrices.dancing_links import dlx_solver sage: d = dlx([[]]) sage: d.search() *boom!*
And even adding rows can be a problem... this doesn't always go boom
sage: d = dlx_solver([[0,1],[2]]) sage: d.add_rows([[0,1],[1]]) *boom!*
Actually... *why the hell* would we ever support the following:
sage: d = dlx_solver(anything) sage: d.search() sage: d.add_rows(anything else)
Change History (43)
comment:1 Changed 10 years ago by
comment:2 Changed 10 years ago by
 Description modified (diff)
comment:3 Changed 9 years ago by
 Cc slabbe added
comment:4 Changed 8 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:5 Changed 8 years ago by
Amusingly I found this in add_rows
The following example would crash in Sage's debug version from :trac:`13864` prior to the fix from :trac:`13822`:: sage: from sage.combinat.matrices.dancing_links import dlx_solver sage: x = dlx_solver([]) # indirect doctest sage: x.get_solution() []
comment:6 Changed 8 years ago by
 Description modified (diff)
comment:7 Changed 7 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:8 Changed 7 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:9 Changed 7 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:10 followup: ↓ 12 Changed 6 years ago by
 Branch set to u/slabbe/11814
 Commit set to 8ee8dd81d654d6d68baa335b15251a2427c3dc8e
I just pushed a branch (based on 6.6.rc2) where the segmentation fault is catch using sig_on and sig_off.
Do we want to also solve the segmentation fault itself in this ticket?
New commits:
8ee8dd8  Trac #11814: Catch the segmentation fault in dancing links

comment:11 Changed 6 years ago by
 Commit changed from 8ee8dd81d654d6d68baa335b15251a2427c3dc8e to 25741dfc1c52709faa59042024248824247275f4
Branch pushed to git repo; I updated commit sha1. New commits:
25741df  Trac #11814: Merge with 6.8.rc1

comment:12 in reply to: ↑ 10 ; followup: ↓ 16 Changed 6 years ago by
Replying to slabbe:
I just pushed a branch (based on 6.6.rc2) where the segmentation fault is catch using sig_on and sig_off.
Do we want to also solve the segmentation fault itself in this ticket?
First of all, I would absolutely never doctest segfaults unless you know that it's safe to do so.
I don't mind just adding the sig_on
/sig_off
in this ticket provided that you create a new ticket for the segmentation fault itself.
comment:13 Changed 6 years ago by
Also: add your author name please.
comment:14 Changed 6 years ago by
 Commit changed from 25741dfc1c52709faa59042024248824247275f4 to 35f074c4a0a48af5583582e25fe76d0483ca7778
Branch pushed to git repo; I updated commit sha1. New commits:
35f074c  Trac #11814: Added not tested flag for segmentation fault

comment:15 Changed 6 years ago by
 Status changed from new to needs_review
 Summary changed from Segmentation fault in dlx_solver to Catch the segmentation fault in dlx_solver
comment:16 in reply to: ↑ 12 Changed 6 years ago by
comment:17 Changed 6 years ago by
 Status changed from needs_review to needs_work
comment:18 followup: ↓ 20 Changed 6 years ago by
Why needs_work?
comment:19 Changed 6 years ago by
 Commit changed from 35f074c4a0a48af5583582e25fe76d0483ca7778 to 549575955408f4ca4c8c24341c6c8149d14e4a1d
Branch pushed to git repo; I updated commit sha1. New commits:
5495759  Trac #11814: Fix segmentation Fault raised by an assert

comment:20 in reply to: ↑ 18 Changed 6 years ago by
comment:21 Changed 6 years ago by
 Status changed from needs_work to needs_review
comment:22 Changed 6 years ago by
I also removed the dumps
method which was broken and useless:
sage: x = dlx_solver([]) sage: x.dumps() Traceback (most ... ... AttributeError: 'list' object has no attribute 'dumps'
I was about to remove make_dlxwrapper
from the sage namespace, but decided it belongs to another ticket.
comment:23 Changed 6 years ago by
 Summary changed from Catch the segmentation fault in dlx_solver to Catch and fix the segmentation fault in dlx_solver
comment:24 Changed 6 years ago by
 Milestone changed from sage6.4 to sage6.8
 Reviewers set to Jeroen Demeyer
 Status changed from needs_review to needs_work
The initialization of self.rows
is a pure Pythonlevel initialization, so I recommend to move it to __init__
instead of __cinit__
.
comment:25 Changed 6 years ago by
 Branch changed from u/slabbe/11814 to u/jdemeyer/11814
comment:26 Changed 6 years ago by
 Commit changed from 549575955408f4ca4c8c24341c6c8149d14e4a1d to 392b62c70d342679c41799e0b45857c537ad8a8e
 Status changed from needs_work to needs_review
I took the opportunity to fix the initialization even more and really clean up the whole file.
New commits:
392b62c  Simplify dancing links code

comment:27 Changed 6 years ago by
 Branch changed from u/jdemeyer/11814 to u/slabbe/11814
 Commit changed from 392b62c70d342679c41799e0b45857c537ad8a8e to 3a3dd8be8ff50d04bc84dc60401ea9af11871af6
Great! I was feeling those Py_INCREF
were not necessary and old fashion. You were the good guy to clean/update that part. I removed the cimport Integer
which was unused.
There is one doctest failing in combinat/tiling.py
because of the __str__ > __repr__
change. I did not fix the doctest because I wonder if showing a list of list is really what we want for the __repr__
? Maybe a repr saying Dancing links of 34 rows
would be better together with a method rows
returning the rows?
New commits:
3a3dd8b  Removed cimport Integer from dancing links

comment:28 Changed 6 years ago by
Note: I started to fix the repr thing. Will push something soon.
comment:29 followup: ↓ 31 Changed 6 years ago by
I would prefer to see the rows in __repr__
, so maybe Dancing links of 2 rows [0,1,2], [0,2]
.
Note: the only reason I changed it is because it looked strange to me that the class had a __str__
method but not a __repr__
.
comment:30 Changed 6 years ago by
 Commit changed from 3a3dd8be8ff50d04bc84dc60401ea9af11871af6 to 56c74ed227fb8def88644a7e2371bd33701d8f28
Branch pushed to git repo; I updated commit sha1. New commits:
56c74ed  Better __repr__ for dancing links, added rows method

comment:31 in reply to: ↑ 29 Changed 6 years ago by
With the last commit we now have:
sage: from sage.games.quantumino import QuantuminoSolver sage: q = QuantuminoSolver(0) sage: T = q.tiling_solver() sage: T.dlx_solver() Dancing links solver for 96 columns and 5484 rows
I think it is better like this instead of printing the whole list of rows.
comment:32 Changed 6 years ago by
There is a doctest failure in src/sage/games/quantumino.py
related to the printing.
comment:33 Changed 6 years ago by
 Commit changed from 56c74ed227fb8def88644a7e2371bd33701d8f28 to e2b82a4b7b035d84955c3c15a41cd740fc050eaf
Branch pushed to git repo; I updated commit sha1. New commits:
e2b82a4  Fix doctest in games/quantumino.py

comment:34 Changed 6 years ago by
Just to say that I agree with the changes made by Jeroen in this ticket.
comment:35 Changed 6 years ago by
 Status changed from needs_review to positive_review
comment:36 Changed 6 years ago by
 Status changed from positive_review to needs_work
When building with SAGE_DEBUG=yes
:
sage: from sage.combinat.matrices.dancing_links import dlx_solver ## line 163 ## sage: x = dlx_solver([]) # indirect doctest ## line 164 ## python: /mnt/disk/home/buildslavesage/slave/sage_git/build/src/build/cythonized/sage/combinat/matrices/dancing_links_c.h:217: void dancing_links::setup_columns(): Assertion `nr_columns > 0' failed.  /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/ext/interrupt/interrupt.so(+0x3747)[0x7f012e49a747] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/ext/interrupt/interrupt.so(+0x3902)[0x7f012e49a902] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/ext/interrupt/interrupt.so(+0x3257)[0x7f012e49a257] /lib64/libpthread.so.0(+0x10430)[0x7f013a8b8430] /lib64/libc.so.6(gsignal+0x38)[0x7f0139e0d9c8] /lib64/libc.so.6(abort+0x16a)[0x7f0139e0f65a] /lib64/libc.so.6(+0x2d187)[0x7f0139e06187] /lib64/libc.so.6(+0x2d232)[0x7f0139e06232] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(_ZN13dancing_links13setup_columnsEv+0x36)[0x7f00f9ec3fe8] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(_ZN13dancing_links8add_rowsESt6vectorIS0_IiSaIiEESaIS2_EE+0x153)[0x7f00f9ec4335] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x1301a)[0x7f00f9ebf01a] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x121c9)[0x7f00f9ebe1c9] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x161b6)[0x7f00f9ec21b6] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x16378)[0x7f00f9ec2378] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x1091e)[0x7f00f9ebc91e] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x106dd)[0x7f00f9ebc6dd] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(+0xdf3ed)[0x7f013aba33ed] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x160bb)[0x7f00f9ec20bb] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x13c13)[0x7f00f9ebfc13] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/combinat/matrices/dancing_links.so(+0x13ac4)[0x7f00f9ebfac4] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(+0x15a60b)[0x7f013ac1e60b] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7c85)[0x7f013ac19290] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x113f)[0x7f013ac1bd59] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x41)[0x7f013ac115e6] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(+0x15d369)[0x7f013ac21369] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x432d)[0x7f013ac15938] /mnt/disk/home/buildslavesage/slave/sage_git/build/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x113f)[0x7f013ac1bd59]
comment:37 Changed 6 years ago by
Sébastien, I won't have time in the coming month to look at this. If other people want to fix and to review, go ahead...
comment:38 Changed 6 years ago by
No problem Jeroen!
So how can I reproduce the problem? Like this?
make distclean export SAGE_DEBUG="yes" make sage t src/sage/combinat/matrices/dancing_links.pyx
or is there an easier way? Can I just recompile some packages (python and cython and gcc?) packages with SAGE_DEBUG=yes
? Does sage f <package>
considers the SAGE_DEBUG
flag?
comment:39 Changed 6 years ago by
You need to compile Python and everything that links to it, the only sure way is starting with make distclean imho.
comment:40 Changed 6 years ago by
 Commit changed from e2b82a4b7b035d84955c3c15a41cd740fc050eaf to dcdfd06c1e924a72a49cf0c13929930ddcb61e08
Branch pushed to git repo; I updated commit sha1. New commits:
dcdfd06  Trac #11814: Fix failing Assertion nr_columns > 0 in setup_columns()

comment:41 Changed 6 years ago by
 Status changed from needs_work to needs_review
comment:42 Changed 6 years ago by
 Status changed from needs_review to positive_review
comment:43 Changed 6 years ago by
 Branch changed from u/slabbe/11814 to dcdfd06c1e924a72a49cf0c13929930ddcb61e08
 Resolution set to fixed
 Status changed from positive_review to closed
A bit more information on the platform, or did you try this on different systems?
On Ubuntu 10.04.3 x86_64 I get: