Opened 8 years ago
Closed 6 years ago
#12690 closed defect (wontfix)
Signal handling doesn't properly handle OpenMP code
| Reported by: | ohanar | Owned by: | jdemeyer |
|---|---|---|---|
| Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
| Component: | c_lib | Keywords: | |
| Cc: | | Merged in: | |
| Authors: | | Reviewers: | Jeroen Demeyer |
| Report Upstream: | N/A | Work issues: | |
| Branch: | | Commit: | |
| Dependencies: | | Stopgaps: | |
Description (last modified by )
I was playing around with the new prange functionality in Cython and came across this when I tried to interrupt the execution:
sage: time prime_pi(10**12)
37607912018
Time: CPU 1.25 s, Wall: 0.80 s
sage: time prime_pi(10**12)
^C/home/sage/5.0.beta8/local/lib/libcsage.so(print_backtrace+0x31)[0x7f9a6f5c0996]
/home/sage/5.0.beta8/local/lib/libcsage.so(sigdie+0x14)[0x7f9a6f5c09c8]
/home/sage/5.0.beta8/local/lib/libcsage.so(sage_signal_handler+0x17d)[0x7f9a6f5c0587]
/lib64/libpthread.so.0(+0xfb80)[0x7f9a74c7bb80]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyErr_Restore+0x4c)[0x7f9a74f8a99c]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyErr_SetString+0x27)[0x7f9a74f8aad7]
/home/sage/5.0.beta8/local/lib/libcsage.so(sage_signal_handler+0xed)[0x7f9a6f5c04f7]
/lib64/libpthread.so.0(+0xfb80)[0x7f9a74c7bb80]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyErr_Restore+0x4c)[0x7f9a74f8a99c]
/home/sage/5.0.beta8/local/lib/libcsage.so(sage_interrupt_handler+0x48)[0x7f9a6f5c03b0]
/lib64/libpthread.so.0(+0xfb80)[0x7f9a74c7bb80]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x4099)[0x7f9a4e185099]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x3fe8)[0x7f9a4e184fe8]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x5095)[0x7f9a4e186095]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x58cd)[0x7f9a4e1868cd]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x7509)[0x7f9a4e188509]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x8c38)[0x7f9a4e189c38]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x7f9a74ed6b53]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(+0x4dc2b)[0x7f9a74ed6c2b]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyObject_CallMethod+0xc1)[0x7f9a74ed6f41]
/home/sage/5.0.beta8/local/lib/libpynac-0.2.so.3(_ZNK5GiNaC8function4evalEi+0x4ce)[0x7f9a52c15e3e]
/home/sage/5.0.beta8/local/lib/libpynac-0.2.so.3(_ZN5GiNaC2ex20construct_from_basicERKNS_5basicE+0x6e)[0x7f9a52c03b4e]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/symbolic/function.so(_Z16g_function_eval1jRKN5GiNaC2exEb+0xb7)[0x7f9a521da557]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/symbolic/function.so(+0x21def)[0x7f9a521e1def]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x7f9a74ed6b53]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7f9a74f740e7]
/home/sage/5.0.beta8/local/lib/python2.7/site-packages/sage/functions/prime_pi.so(+0x939f)[0x7f9a4e18a39f]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x7f9a74ed6b53]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3fcd)[0x7f9a74f7869d]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f9a74f7b782]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x57bf)[0x7f9a74f79e8f]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x51f1)[0x7f9a74f798c1]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x51f1)[0x7f9a74f798c1]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5dc1)[0x7f9a74f7a491]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x51f1)[0x7f9a74f798c1]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x51f1)[0x7f9a74f798c1]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x51f1)[0x7f9a74f798c1]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x855)[0x7f9a74f7b645]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f9a74f7b782]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0xb0)[0x7f9a74f9db20]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0xdf)[0x7f9a74f9e5bf]
/home/sage/5.0.beta8/local/lib/libpython2.7.so.1.0(Py_Main+0xb85)[0x7f9a74fb1275]
/lib64/libc.so.6(__libc_start_main+0xed)[0x7f9a74281fad]
python[0x4006f1]
------------------------------------------------------------------------
An error occured during signal handling.
This probably occurred because a *compiled* component of Sage has a bug in it and is not properly wrapped with sig_on(), sig_off().
You might want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
/home/sage/5.0.beta8/spkg/bin/sage: line 308: 8428 Segmentation fault (core dumped) sage-ipython "$@" -i
I've attached a patch for the sage library that demonstrates the bug.
Attachments (2)
Change History (8)
Changed 8 years ago by
comment:1 Changed 8 years ago by
- Description modified (diff)
comment:2 Changed 7 years ago by
- Description modified (diff)
comment:3 Changed 6 years ago by
- Milestone changed from sage-5.11 to sage-5.12
comment:4 Changed 6 years ago by
- Milestone changed from sage-5.13 to sage-duplicate/invalid/wontfix
- Status changed from new to needs_review
- Summary changed from signal handling doesn't properly handle multithreaded code to Signal handling doesn't properly handle OpenMP code
Proposal: close as wontfix.
The problem is that Cython uses OpenMP. Even if I could fix the signal handling code, it's not at all clear how to make OpenMP deal with this. The following makes me think that a fix is impossible:
- http://stackoverflow.com/questions/3914264/openmp-is-there-a-way-for-a-thread-to-terminate-all-other-parallel-threads
- http://stackoverflow.com/questions/8482651/how-do-i-conditionally-terminate-a-parallel-region-in-openmp
OpenMP simply isn't designed for this. With pthreads, it might be possible as it has clear specifications about how threads interact with signals.
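To illustrate what "clear specifications" buys you with pthreads, here is a minimal Python sketch (not taken from Sage) that uses the POSIX calls exposed by the standard `signal` module. It blocks a signal in every thread and lets one dedicated thread collect it with `sigwait()`. The function name `route_signal_to_waiter` and the choice of `SIGUSR1` are assumptions made for the demo. The signal is sent with `pthread_kill` to keep the demo self-contained, whereas a real Ctrl-C is process-directed and may be delivered to any thread that does not block it. POSIX only.

```python
import signal
import threading


def route_signal_to_waiter(sig=signal.SIGUSR1):
    """Deliver `sig` to one dedicated thread; hypothetical demo helper."""
    received = []

    # Block the signal in the calling thread. Threads created afterwards
    # inherit this mask, so the signal cannot interrupt them asynchronously.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        def waiter():
            # sigwait() synchronously accepts a pending blocked signal.
            received.append(signal.sigwait({sig}))

        t = threading.Thread(target=waiter)
        t.start()
        # Thread-directed signal: it stays pending for `t` until sigwait().
        signal.pthread_kill(t.ident, sig)
        t.join(timeout=5)
    finally:
        # Restore whatever mask the caller had before the demo.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received
```

OpenMP offers no comparable, specified way to say which thread of a parallel region receives a signal, which is the gap this comment points at.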
comment:5 Changed 6 years ago by
Using sig_check() inside the loop does work:
    from cython.parallel import prange
    include 'ext/interrupt.pxi'

    def dumb_function():
        cdef int i,x
        while True:
            for i in prange(1<<30,nogil=True):
                with gil:
                    sig_check()
                x += i
        return x
From Cython's point of view, sig_check() simply raises an exception when an interrupt has occurred, which is explicitly allowed by http://docs.cython.org/src/userguide/parallelism.html#breaking-out-of-loops
The worst-case overhead is one loop iteration per thread, which is optimal given the limitations of OpenMP.
We should probably declare sig_check() as nogil though, see #15352.
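The polling idea behind this workaround can be mimicked in plain Python. The sketch below is only an analogy, not Sage's or Cython's implementation: `interruptible_sum` and the `interrupted` event are hypothetical names, with the event standing in for the pending-interrupt flag that sig_check() tests. Each worker checks the flag once per iteration, so after an interrupt every thread does at most one more iteration before stopping.

```python
import threading


def interruptible_sum(n, interrupted, num_threads=4):
    """Sum range(n) across threads, polling `interrupted` once per iteration."""
    partial = [0] * num_threads

    def worker(slot):
        total = 0
        # Thread `slot` takes every num_threads-th index (a static schedule).
        for i in range(slot, n, num_threads):
            if interrupted.is_set():  # plays the role of sig_check()
                return
            total += i
        partial[slot] = total

    threads = [threading.Thread(target=worker, args=(s,))
               for s in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Only the calling thread raises, and only after every worker has stopped.
    if interrupted.is_set():
        raise KeyboardInterrupt
    return sum(partial)
```

With the flag clear, `interruptible_sum(1000, threading.Event())` returns the ordinary sum. With the flag set, the workers bail out on their next check and the caller sees a `KeyboardInterrupt`.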
comment:6 Changed 6 years ago by
- Resolution set to wontfix
- Reviewers set to Jeroen Demeyer
- Status changed from needs_review to closed
prime_pi that causes the interrupt error