Opened 5 years ago
Closed 4 years ago
#21628 closed defect (duplicate)
AbstractLinearCode.canonical_representative not properly wrapped in sig_on and sig_off
Reported by: | jsrn | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | coding theory | Keywords: | cython, linear_code |
Cc: | tfeulner | Merged in: | |
Authors: | Reviewers: | Johan Rosenkilde, Jeroen Demeyer | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/jsrn/21628_canonical_representative_interrupting (Commits, GitHub, GitLab) | Commit: | 40269b97a50f39e55cc6413a6b49eecafcb565ff |
Dependencies: | #21651 | Stopgaps: |
Description
sage: C = LinearCode(random_matrix(GF(47), 12, 43)) sage: C.canonical_representative() <wait half an eternity or try to use C-c>
Pressing C-c
either seems to have no effect, throws an exception which is not caught and computation continues, or segfaults.
This is very unfortunate because this is a function commonly tried but interrupted due to long run-time.
Change History (17)
comment:1 Changed 5 years ago by
- Branch set to u/jsrn/21628_canonical_representative_interrupting
comment:2 Changed 5 years ago by
- Commit set to 40269b97a50f39e55cc6413a6b49eecafcb565ff
- Status changed from new to needs_review
comment:3 follow-up: ↓ 4 Changed 5 years ago by
- Status changed from needs_review to needs_work
The call to PartitionRefinementLinearCode
is a Python call, so sig_on()
is inappropriate. sig_on()
is only meant to wrap calls to external libraries that you have no control over.
In this case, you should use sig_check()
in the Cython code instead.
comment:4 in reply to: ↑ 3 ; follow-ups: ↓ 5 ↓ 6 Changed 5 years ago by
Replying to jdemeyer:
The call to
PartitionRefinementLinearCode
is a Python call, sosig_on()
is inappropriate.sig_on()
is only meant to wrap calls to external libraries that you have no control over.In this case, you should use
sig_check()
in the Cython code instead.
Thanks. Of course you're the mastermind behind cysignals, but the documentation of cysignals *does* say: "You should put sig_on() before and sig_off() after any Cython code which could potentially take a long time.". Which is why I thought it would be OK to do that.
Further, removing that sig_on/sig_off
pair, I get the following segfault when interrupting:
------------------------------------------------------------------------ Unhandled SIGSEGV: A segmentation fault occurred. This probably occurred because a *compiled* module has a bug in it and is not properly wrapped with sig_on(), sig_off(). Python will now terminate. ------------------------------------------------------------------------
Further leading me to believe sig_on/sig_off
was the way to go.
I've tried to look closer at PartitionRefinementLinearCode
: it quickly calls into sage/groups/perm_gps/partn_ref2/refinement_generic.pyx
which is a very complicated, mutually recursive backtracking algorithm, and it's not very clear what the "expensive loop" is. Exactly why is it not OK to use sig_on/sig_off
in such a case?
Also, it's not clear to me why Sage segfaults on C-c
since all the code I can see is just Cython code. If this is expected behaviour, why is the following message printed if sig_on/sig_off
is not the right way to solve such an issue?
Best, Johan
comment:5 in reply to: ↑ 4 ; follow-up: ↓ 7 Changed 5 years ago by
Well, the cysignals documentation also has a warning box with
The code inside
sig_on()
should be pure C or Cython code. If you call any Python code or manipulate any Python object (even something trivial likex = []
), an interrupt can mess up Python’s internal state. When in doubt, try to usesig_check()
instead.
comment:6 in reply to: ↑ 4 Changed 5 years ago by
Replying to jsrn:
Further, removing that
sig_on/sig_off
pair, I get the following segfault when interrupting:------------------------------------------------------------------------ Unhandled SIGSEGV: A segmentation fault occurred. This probably occurred because a *compiled* module has a bug in it and is not properly wrapped with sig_on(), sig_off(). Python will now terminate. ------------------------------------------------------------------------
That's probably an independent problem. I strongly suggest that you first fix this segfault before moving on. It is pointless to build on top of a broken foundation.
Do you have a GDB backtrace (if you have gdb installed, such a backtrace will be shown automatically) or a way for me to reproduce that segfault?
comment:7 in reply to: ↑ 5 Changed 5 years ago by
Replying to jdemeyer:
Well, the cysignals documentation also has a warning box with
The code inside
sig_on()
should be pure C or Cython code. If you call any Python code or manipulate any Python object (even something trivial likex = []
), an interrupt can mess up Python’s internal state. When in doubt, try to usesig_check()
instead.
Haha, good point :-)
That's probably an independent problem. I strongly suggest that you first fix this segfault before moving on. It is pointless to build on top of a broken foundation.
Do you have a GDB backtrace (if you have gdb installed, such a backtrace will be shown automatically) or a way for me to reproduce that segfault?
Hmm, OK. What I run to produce the segfault is the following:
sage: C = LinearCode(random_matrix(GF(47), 25, 35)) sage: C.canonical_representative() < Hit C-c and watch the fireworks >
comment:8 follow-up: ↓ 9 Changed 5 years ago by
The backtrace from gdb is:
/home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/cysignals/signals.so(+0x43c5)[0x7f872e4e73c5] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/cysignals/signals.so(+0x4415)[0x7f872e4e7415] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/cysignals/signals.so(+0x6f07)[0x7f872e4e9f07] /usr/lib/libpthread.so.0(+0x11080)[0x7f87369d8080] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/codecan.so(+0xcec0)[0x7f856f699ec0] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/groups/perm_gps/partn_ref2/refinement_generic.so(+0x10150)[0x7f857b071150] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/codecan.so(+0x2caf1)[0x7f856f6b9af1] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(+0xbe6b5)[0x7f8736ca26b5] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/autgroup_can_label.so(+0x5c90)[0x7f856f8f8c90] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/autgroup_can_label.so(+0x2a74f)[0x7f856f91d74f] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(+0x6403c)[0x7f8736c4803c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7f8736ce69d7] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyInstance_New+0x6c)[0x7f8736c4fe1c] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/autgroup_can_label.so(+0x5c90)[0x7f856f8f8c90] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/coding/codecan/autgroup_can_label.so(+0x34d19)[0x7f856f927d19] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(+0x6403c)[0x7f8736c4803c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7f8736ce69d7] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyInstance_New+0x6c)[0x7f8736c4fe1c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3adf)[0x7f8736ceaa7f] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(+0x8390d)[0x7f8736c6790d] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/misc/cachefunc.so(+0xf563)[0x7f872ce17563] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/misc/cachefunc.so(+0xaeb0)[0x7f872ce12eb0] /home/jsrn/local/sage/sage_devel/local/lib/python2.7/site-packages/sage/misc/cachefunc.so(+0x1f946)[0x7f872ce27946] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f8736c36e53] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3adf)[0x7f8736ceaa7f] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7f8736cedc39] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5d60)[0x7f8736cecd00] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5075)[0x7f8736cec015] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x8bc)[0x7f8736cedb3c] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7f8736cedc39] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0x8a)[0x7f8736d1227a] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0xd5)[0x7f8736d13655] /home/jsrn/local/sage/sage_devel/local/lib/libpython2.7.so.1.0(Py_Main+0xc6f)[0x7f8736d29daf] /usr/lib/libc.so.6(__libc_start_main+0xf1)[0x7f8735f3e291] python(_start+0x2a)[0x40067a]
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 10 Changed 5 years ago by
Replying to jsrn:
The backtrace from gdb is
No, it's not :-)
That's the backtrace from glibc. The backtrace from gdb should be below that.
comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 5 years ago by
comment:11 in reply to: ↑ 10 Changed 5 years ago by
Replying to jsrn:
Ah ok. It says "Attaching gdb to process id 10318." and then "Saved trace to <blah>". But that log file is empty.
Probably you don't have gdb or it's the wrong version...
comment:12 Changed 5 years ago by
I do have gdb since gdb launches when I type gdb. It's not hard to test.
How do I know if it's "the right version"? It's 7.11.1. My gcc is 6.1.1. Presumably they fit together since they're in my package manager?
comment:13 Changed 5 years ago by
- Dependencies set to #21651
comment:14 Changed 5 years ago by
I created #21651 for the existing segfault issue.
comment:15 Changed 4 years ago by
As far as I can tell, this issue is completely fixed by #21651.
comment:16 Changed 4 years ago by
- Milestone changed from sage-7.4 to sage-duplicate/invalid/wontfix
Yes, I think so. I tried some days ago to make it crash again, but was unable. I wanted to get back to it and try harder, but never got around to it. Since you also tried since then, it's probably true that the issue got fixed.
Let's close it.
comment:17 Changed 4 years ago by
- Resolution set to duplicate
- Reviewers set to Johan Rosenkilde, Jeroen Demeyer
- Status changed from needs_work to closed
I've added calls to
sig_check
in the constructorLinearCodeAutGroupCanLabel
(which might call itself recursively), as well assig_on/off
the call toPartitionRefinementLinearCode
. The latter is, AFAICT, the main work-horse of the algorithm. Both are Cython, but there's more C-like code inPartitionRefinementLinearCode
(incodecan.pyx
), so this might be why it used to behave so badly when interrupted.I'm not exactly sure how the interruption works, but it now seems to me to be responsive and not crash. I also don't know if there might be memory leaks left after the interruption :-S
New commits:
Insert sig_on/off around expensive check and sig_check on recursive calls