#9678 closed task (fixed)
Rewrite interrupt handling
Reported by: | jdemeyer | Owned by: | jdemeyer |
---|---|---|---|
Priority: | blocker | Milestone: | sage-4.7 |
Component: | c_lib | Keywords: | interrupt, error, c, cython |
Cc: | leif | Merged in: | sage-4.7.alpha1 |
Authors: | Jeroen Demeyer | Reviewers: | Volker Braun |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
There are lots of things to be improved in the interrupt handling routines in c_lib/src/interrupt.c and c_lib/include/interrupt.h.
Major changes planned:
- DONE: make
sig_on()
have function syntax so that we can declare itcdef int sig_on() except 0
. See #10115 for the syntax changes. - DONE: do not save signals in
sigsetjmp()
(by giving a second argument of 0 instead of 1). This speeds up a sig_on/sig_off loop from 382 clock cycles to 30 clock cycles on a Core(TM)2 Duo CPU T5870 @ 2.00GHz running Linux 2.6.34 glibc 2.11.2. - DONE: using
sigaction()
instead ofsignal()
since that has more well-defined semantics. - DONE: handle SIGINT differently from other signals (other signals are urgent and cannot be ignored. SIGINT on the other hand does not need to handled immediately, but we have to be careful for race conditions).
- DONE: allow
sig_on()
andsig_off()
to be nested. - DONE: implement
sig_retry()
for retrying failed computations (this is useful for PARI, see #10018). - DONE: have an interface for more general errors which are not signals. This can then be used by PARI, NTL and possibly other C libraries (various tickets).
- DONE: clean up old, unused code.
- DONE: testing interrupt handling: #10030.
- DONE: fix breakage because of this patch: #10061.
- DONE: documentation: #10109.
- DONE: Block interrupts during malloc: #10258.
- DONE: eliminate race condition when a SIGINT arrives before
sig_on()
or duringsig_on()
.
Other potentially related tickets:
- #800 (make _sig_on and _sig_off faster when stacked)
- #9640 (Change PARI error catching mechanism)
- #9564 (libsingular exponentiation can not be interrupted)
- #7879 (Remove unnecessary signal handling for low prec mpfr operations) --- hopefully
sig_on()
andsig_off()
can be made very fast such that this shouldn't be an issue anymore. - #7794 (
PolynomialRing_integral_domain
ignores Ctrl-C and segfaults) - #5313 (patch singular so that when it runs out of memory the error message says "singular" in it)
- #7702 (Handle interrupts better in the notebook)
- #3423 (Make Pari error messages more informative)
- #10126 (Fix error handing in Matrix_rational_dense._invert_pari())
- #10818 (EclLib? should allow signals to make LISP code interruptable)
Attachments (1)
Change History (47)
comment:1 Changed 12 years ago by
- Component changed from misc to c_lib
- Owner changed from jason to tba
comment:2 Changed 12 years ago by
- Cc leif added
comment:3 Changed 12 years ago by
- Description modified (diff)
comment:4 Changed 12 years ago by
- Description modified (diff)
comment:5 Changed 12 years ago by
- Description modified (diff)
comment:6 Changed 12 years ago by
- Description modified (diff)
comment:7 Changed 12 years ago by
- Description modified (diff)
comment:8 Changed 12 years ago by
- Description modified (diff)
comment:9 Changed 12 years ago by
- Description modified (diff)
comment:10 Changed 12 years ago by
- Description modified (diff)
comment:11 Changed 12 years ago by
- Description modified (diff)
comment:12 Changed 12 years ago by
- Description modified (diff)
comment:13 Changed 12 years ago by
- Description modified (diff)
comment:14 Changed 12 years ago by
- Description modified (diff)
comment:15 Changed 12 years ago by
- Description modified (diff)
comment:16 Changed 12 years ago by
- Description modified (diff)
comment:17 Changed 12 years ago by
- Description modified (diff)
- Status changed from new to needs_work
comment:18 Changed 12 years ago by
- Description modified (diff)
comment:19 Changed 12 years ago by
- Description modified (diff)
comment:20 Changed 12 years ago by
- Description modified (diff)
comment:21 Changed 12 years ago by
- Description modified (diff)
comment:22 Changed 12 years ago by
- Description modified (diff)
comment:23 Changed 12 years ago by
- Description modified (diff)
- Owner changed from tba to jdemeyer
comment:24 Changed 11 years ago by
- Description modified (diff)
comment:25 Changed 11 years ago by
- Description modified (diff)
comment:26 Changed 11 years ago by
- Description modified (diff)
comment:27 Changed 11 years ago by
- Description modified (diff)
comment:28 Changed 11 years ago by
- Milestone changed from sage-feature to sage-4.6.2
- Status changed from needs_work to needs_review
comment:29 Changed 11 years ago by
- Description modified (diff)
Changed 11 years ago by
comment:30 follow-up: ↓ 32 Changed 11 years ago by
- Status changed from needs_review to positive_review
Definite improvement! We should put it in the very next Sage-4.6.2.alpha to give it as much exposure as possible since it touches a couple of core C files.
comment:31 Changed 11 years ago by
- Reviewers set to Volker Braun
comment:32 in reply to: ↑ 30 Changed 11 years ago by
Replying to vbraun:
Definite improvement! We should put it in the very next Sage-4.6.2.alpha to give it as much exposure as possible since it touches a couple of core C files.
Alternatively, I am considering making 4.6.2 a pretty straightforward release and then putting various "big" tickets such as this one, #9433, #10572 in the next release which would then be sage-4.7.
comment:33 Changed 11 years ago by
- Milestone changed from sage-4.6.2 to sage-4.7
comment:34 Changed 11 years ago by
- Priority changed from major to blocker
comment:35 Changed 11 years ago by
I'm trying out Sage-4.6.2.rc0 in Fedora 14 i386 inside VirtualBox
, and it doesn't catch the SIGILL. Bactrace is below. Everything else works fine. Also, no such issue on my native Fedora 14 x86_64.
I'm not sure who is to blame here, could be something with the virtual machine. I did write a SIGILL handler in Perl and that works, so VirtualBox
is at least in principle able to have guests raise and handle SIGILL. Though there is also some forking involved...
The only thing thats special about SIGILL that comes to mind is that it might not be automatically be reset to SIG_DFL upon entrance. But I don't see how that could cause this. Here is my testcase:
[vbraun@localhost ~]$ sage -gdb ---------------------------------------------------------------------- | Sage Version 4.6.2.rc0, Release Date: 2011-02-18 | | Type notebook() for the GUI, and license() for information. | ---------------------------------------------------------------------- ********************************************************************** * * * Warning: this is a prerelease version, and it may be unstable. * * * ********************************************************************** /home/vbraun/Sage/sage/local/bin/sage-ipython GNU gdb (GDB) Fedora (7.2-41.fc14) Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i686-redhat-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /mnt/sage/vbraun/sage-4.6.2.rc0/local/bin/python...done. [Thread debugging using libthread_db enabled] Python 2.6.4 (r264:75706, Feb 19 2011, 16:56:24) [GCC 4.5.1 20100924 (Red Hat 4.5.1-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. Traceback (most recent call last): File "/usr/share/gdb/auto-load/usr/lib/libstdc++.so.6.0.14-gdb.py", line 59, in <module> from libstdcxx.v6.printers import register_libstdcxx_printers File "/usr/lib/../share/gcc-4.5.1/python/libstdcxx/v6/printers.py", line 19, in <module> import itertools ImportError: No module named itertools Detaching after fork from child process 10247.
So far, so good. Now run:
sage: import sage.tests.interrupt sage: sage.tests.interrupt.test_signal_ill() Detaching after fork from child process 10248. Program received signal SIGILL, Illegal instruction. 0x05dac6b1 in _sig_on_prejmp (__pyx_self=<value optimized out>, __pyx_args=0xb7fa702c, __pyx_kwds=0x0) at /home/vbraun/Sage/sage/local/include/csage/interrupt.h:192 192 _signals.sig_on_count++; Missing separate debuginfos, use: debuginfo-install expat-2.0.1-10.fc13.i686 fontconfig-2.8.0-2.fc14.i686 glibc-2.13-1.i686 keyutils-libs-1.2-6.fc12.i686 krb5-libs-1.8.2-8.fc14.i686 libcom_err-1.41.12-6.fc14.i686 libgcc-4.5.1-4.fc14.i686 libgfortran-4.5.1-4.fc14.i686 libselinux-2.0.96-6.fc14.1.i686 libstdc++-4.5.1-4.fc14.i686 ncurses-libs-5.7-9.20100703.fc14.i686 nss-softokn-freebl-3.12.9-2.fc14.i686 openssl-1.0.0d-1.fc14.i686 (gdb) l 187 fprintf(stderr, "sig_on (counter = %i) at %s:%i\n", _signals.sig_on_count+1, file, line); 188 fflush(stderr); 189 #endif 190 if (_signals.sig_on_count > 0) 191 { 192 _signals.sig_on_count++; 193 return 1; 194 } 195 196 /* At this point, _signals.sig_on_count == 0 */ (gdb) bt #0 0x05dac6b1 in _sig_on_prejmp (__pyx_self=<value optimized out>, __pyx_args=0xb7fa702c, __pyx_kwds=0x0) at /home/vbraun/Sage/sage/local/include/csage/interrupt.h:192 #1 __pyx_pf_4sage_5tests_9interrupt_test_signal_ill (__pyx_self=<value optimized out>, __pyx_args=0xb7fa702c, __pyx_kwds=0x0) at sage/tests/interrupt.c:2878 #2 0x0017c398 in PyCFunction_Call (func=0xaad686c, arg=0xb7fa702c, kw=0x0) at Objects/methodobject.c:85 #3 0x001de9dd in call_function (f=0x820c4d4, throwflag=0) at Python/ceval.c:3706 #4 PyEval_EvalFrameEx (f=0x820c4d4, throwflag=0) at Python/ceval.c:2389 #5 0x001e06e7 in PyEval_EvalCodeEx (co=0xaa5fd58, globals=0x83cf3e4, locals=0x83cf3e4, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968 #6 0x001e0803 in PyEval_EvalCode (co=0xaa5fd58, globals=0x83cf3e4, locals=0x83cf3e4) at Python/ceval.c:522 #7 0x001deccb in exec_statement (f=0x81630ec, throwflag=0) at Python/ceval.c:4401 #8 PyEval_EvalFrameEx (f=0x81630ec, throwflag=0) at Python/ceval.c:1717 #9 0x001e06e7 in PyEval_EvalCodeEx (co=0x837aa88, globals=0x8370714, locals=0x0, args=0x8162c08, argcount=2, kws=0x8162c10, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968 #10 0x001dea92 in fast_function (f=0x8162abc, throwflag=0) at Python/ceval.c:3802
comment:36 Changed 11 years ago by
Volker, do you have strace
installed? If so, you could try
$ mkdir trace $ strace -ff -o trace/sage ./sage -c 'import sage.tests.interrupt; sage.tests.interrupt.test_signal_ill()'
and send me the trace/
directory.
And maybe do the same for test_signal_bus()
to compare with.
comment:37 Changed 11 years ago by
Sure, no problem. I did strace ill, bus, and fpe and uploaded the trace/ directory to http://www.stp.dias.ie/~vbraun/trace.tar.bz2
comment:38 Changed 11 years ago by
Hold your horses! Its working now, and SIGILL is caught correctly. I did rebuild the sage library in the meantime, so I probably miscompiled something before.
comment:39 Changed 11 years ago by
In any case, based on your trace, it looks like everything is working correctly. I see the following:
--- SIGILL (Illegal instruction) @ 0 (0) --- rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 write(2, "Traceback (most recent call last"..., 35) = 35 write(2, " File \"/home/vbraun/Sage/sage/l"..., 74) = 74
So, the program receives SIGILL
and prints a Traceback, which is exactly what should happen.
I cannot understand why at some point it did not work for you...
comment:40 follow-up: ↓ 41 Changed 11 years ago by
After many more rebuilds I found the problem: On i386 (and only i386) the Parma Polyhedra library (PPL, #10039) tests if SSE is available. It does this during the library initialization by installing a SIGILL handler and then issuing SSE instructions. It then resets the handler to signal(SIGILL, SIG_DFL)
. Clearly, this deactivates Sage's SIGILL handler.
Since we don't use PPL's floating point arithmetics I'll make sure to configure it in a way that it does not test for SSE availability.
comment:41 in reply to: ↑ 40 Changed 11 years ago by
Replying to vbraun:
After many more rebuilds I found the problem: On i386 (and only i386) the Parma Polyhedra library (PPL, #10039) tests if SSE is available. It does this during the library initialization by installing a SIGILL handler and then issuing SSE instructions. It then resets the handler to
signal(SIGILL, SIG_DFL)
. Clearly, this deactivates Sage's SIGILL handler.
I consider this to be a bug in PPL. Using sigaction()
or signal()
, it is possible to save an existing signal handler and then restore it afterwards.
comment:42 Changed 11 years ago by
I agree and will report it upstream.
comment:43 Changed 11 years ago by
- Merged in set to sage-4.7.alpha1
- Resolution set to fixed
- Status changed from positive_review to closed
comment:44 Changed 11 years ago by
- Description modified (diff)
comment:45 Changed 11 years ago by
- Description modified (diff)
comment:46 Changed 7 years ago by
- Description modified (diff)
Author of what? ;-)