Opened 5 years ago
Closed 7 months ago
#22766 closed defect (invalid)
Trying completion list of maxima_lib crashes Sage
Reported by: | rws | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | sage-duplicate/invalid/wontfix |
Component: | interfaces | Keywords: | |
Cc: | jdemeyer | Merged in: | |
Authors: | | Reviewers: | Dima Pasechnik |
Report Upstream: | N/A | Work issues: | |
Branch: | | Commit: | |
Dependencies: | #23956 | Stopgaps: | |
Description (last modified by )
If Maxima's commands list is not stored, then initialising Maxima/ECL and then hitting TAB after maxima_lib
crashes Sage, as shown below. Other similar crashes may be triggered, see e.g. #23956.
The reason for these crashes is the design of tab completion in IPython 5+ using prompt_toolkit, which uses Python threading, and does tab completion in a separate thread.
$ rm -f ~/.sage/maxima_commandlist_cache.sobj
$ sage
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 8.0.beta0, Release Date: 2017-03-30               │
│ Type "notebook()" for the browser-based notebook interface.        │
│ Type "help()" for help.                                            │
└────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Warning: this is a prerelease version, and it may be unstable.     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
sage: from sage.interfaces.maxima_lib import maxima_lib
sage: maxima_lib.
Building Maxima command completion list (this takes a few seconds only the first time you do it).
To force rebuild later, delete /home/ralf/.sage//maxima_commandlist_cache.sobj.
A
;;;
;;; Stack overflow.
;;; Jumping to the outermost toplevel prompt
;;; Internal or unrecoverable error in:
;;;
;;; No frame to jump to
;;; Aborting ECL
;;;
;;; ECL C Backtrace
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_dump_c_backtrace+0x26) [0x7f012]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_internal_error+0x3f) [0x7f0127]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(FEerror+0) [0x7f0127706f20]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x1adb1a) [0x7f012772eb1a]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x12b3c5) [0x7f01276ac3c5]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_funcall+0x70) [0x7f01276e7c90]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_serror+0xd9) [0x7f0127708299]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_cs_overflow+0xac) [0x7f012772e]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x12b3c5) [0x7f01276ac3c5]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_funcall+0x70) [0x7f01276e7c90]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_serror+0xd9) [0x7f0127708299]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_cs_overflow+0xac) [0x7f012772e]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_interpret+0x1d67) [0x7f01276ea]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_apply+0x145) [0x7f01276e7e95]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0xcf1d)]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x15511]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x15d6b]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x16d47]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(+0xc0c7f) [0x7f038ea92c7f]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0xcad8)]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43) [0x7f038e]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x56da) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(+0x87ecc) [0x7f038ea59ecc]
Aborted (core dumped)
Change History (44)
comment:1 Changed 5 years ago by
comment:2 Changed 4 years ago by
comment:3 follow-up: ↓ 4 Changed 4 years ago by
On OS X (10.12.6, Xcode 8.3.3, Sage 8.1.beta6), I don't get this crash, and I also don't see the message "Building Maxima command completion list ...": I just immediately get a list of completions.
comment:4 in reply to: ↑ 3 Changed 4 years ago by
Replying to jhpalmieri:
On OS X (10.12.6, Xcode 8.3.3, Sage 8.1.beta6), I don't get this crash, and I also don't see the message "Building Maxima command completion list ...": I just immediately get a list of completions.
It's because you already have this list built and stored somewhere in ~/.sage, I suppose. Try moving it out of the way first.
comment:5 follow-up: ↓ 6 Changed 4 years ago by
Okay, I get the crash after deleting .sage/maxima_commandlist_cache.sobj. Do you know why starting Sage and then doing maxima.<TAB> doesn't trigger the crash, but instead rebuilds this file?
comment:6 in reply to: ↑ 5 Changed 4 years ago by
Replying to jhpalmieri:
Okay, I get the crash after deleting .sage/maxima_commandlist_cache.sobj. Do you know why starting Sage and then doing maxima.<TAB> doesn't trigger the crash, but instead rebuilds this file?
Yes, it is because in this case everything happens in the same thread (the one of the tab completion).
comment:7 follow-up: ↓ 18 Changed 4 years ago by
Making maxima_lib "thread-safe" would consist of *locking* it to one thread. The signal management switching that happens upon entering/exiting ecllib makes it fundamentally incompatible with multi-threading, because signal handlers are process-specific, not thread-specific.
Sage installs special signal handlers (for SIGINT, for instance), and so does ECL. If ECL runs with multi-threading, ECL even goes further with signal handling (it makes a dedicated signal handling thread), and it uses signals to synchronize threads for critical GC operations.
If you want to get ecllib to a state where it can safely be used in a multi-threaded environment, I think one would have to unify the signal management of sage and ecl.
The result would not actually make maxima_lib threadsafe, because maxima itself is rather fundamentally not thread-safe.
comment:8 Changed 4 years ago by
Removing dependencies seems so much more promising, see https://trac.sagemath.org/wiki/symbolics/maxima
comment:9 Changed 4 years ago by
Here is another scenario with two threads, only leading to an abort, not to a segfault.
Here I tab-complete from sage.libs.ecl import to force initialisation in a non-main thread.
sage: from sage.libs.ecl import
 at init_ecl
 thread id <Thread(Thread-32, started 139989416191744)>
 active threads [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>, <Thread(Thread-32, started 139989416191744)>]
sage: from sage.libs.ecl import *
sage: from sage.interfaces.maxima_lib import *
 at ecl_eval
 thread id <_MainThread(MainThread, started 140000333952768)>
 active threads [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>]
 at ecl_safe_eval
 thread id <_MainThread(MainThread, started 140000333952768)>
 active threads [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>]
 at ecl_eval
 thread id <_MainThread(MainThread, started 140000333952768)>
 active threads [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>]
 at ecl_safe_eval
 thread id <_MainThread(MainThread, started 140000333952768)>
 active threads [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>]
Collecting from unknown thread
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-5e6d4a068396> in <module>()
----> 1 from sage.interfaces.maxima_lib import *

/home/dima/Sage/sage-dev/local/lib/python2.7/site-packages/sage/interfaces/maxima_lib.py in <module>()
    102 ## i.e. loading it into ECL
    103 ecl_eval("(setf *load-verbose* NIL)")
--> 104 ecl_eval("(require 'maxima)")
    105 ecl_eval("(in-package :maxima)")
    106 ecl_eval("(setq $nolabels t))")

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10977)()
   1328
   1329 #convenience routine to more easily evaluate strings
-> 1330 cpdef EclObject ecl_eval(bytes s):
   1331     """
   1332     Read and evaluate string in Lisp and return the result

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10916)()
   1344     cdef cl_object o
   1345     o=ecl_safe_read_string(s)
-> 1346     o=ecl_safe_eval(o)
   1347     return ecl_wrap(o)
   1348

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_safe_eval (build/cythonized/sage/libs/ecl.c:5710)()
    343     report_threading_status("ecl_safe_eval")
    344     cdef cl_object s
--> 345     ecl_sig_on()
    346     cl_funcall(2,safe_eval_clobj,form)
    347     ecl_sig_off()

RuntimeError: Aborted
sage:
with the following prints inserted into ecl.pyx:
diff --git a/src/sage/libs/ecl.pyx b/src/sage/libs/ecl.pyx
index 20e937876d..879d405d78 100644
--- a/src/sage/libs/ecl.pyx
+++ b/src/sage/libs/ecl.pyx
@@ -240,6 +240,7 @@ def init_ecl():
     cdef sigaction_t sage_action[32]
     cdef int i

+    report_threading_status("init_ecl")
     if ecl_has_booted:
         raise RuntimeError("ECL is already initialized")

@@ -339,6 +340,7 @@ cdef cl_object ecl_safe_eval(cl_object form) except NULL:
         ...
         RuntimeError: ECL says: Console interrupt.
     """
+    report_threading_status("ecl_safe_eval")
     cdef cl_object s
     ecl_sig_on()
     cl_funcall(2,safe_eval_clobj,form)
@@ -1318,6 +1320,12 @@ cdef EclObject ecl_wrap(cl_object o):
     obj.set_obj(o)
     return obj

+cpdef report_threading_status(s):
+    import threading
+    print("\n at ", s)
+    print("\n thread id ", threading.current_thread(), "\n")
+    print(" active threads ", threading.enumerate(), "\n")
+
 #convenience routine to more easily evaluate strings
 cpdef EclObject ecl_eval(bytes s):
     """
@@ -1332,6 +1340,7 @@ cpdef EclObject ecl_eval(bytes s):
         <ECL: (1 1 2 3 5 8 13)>

     """
+    report_threading_status("ecl_eval")
     cdef cl_object o
     o=ecl_safe_read_string(s)
     o=ecl_safe_eval(o)
comment:10 Changed 4 years ago by
I think what we see here is ECL being initialised in a thread (number 32) that later is shut down, and then the maxima_lib import breaks, as ECL isn't available to run.
It seems that indeed we must make sure that ECL is always started in the main thread, which does not disappear.
comment:11 Changed 4 years ago by
- Milestone changed from sage-8.0 to sage-8.1
- Priority changed from critical to blocker
A part of the relevant discussion is on #23956, which I'll close as duplicate.
comment:12 Changed 4 years ago by
- Dependencies set to #23956
comment:13 Changed 4 years ago by
- Description modified (diff)
comment:14 Changed 4 years ago by
- Cc jdemeyer added
I wonder whether any other extension (apart from ECL/Maxima) is affected by this issue.
comment:15 follow-up: ↓ 16 Changed 4 years ago by
Reiterating from #23956:
The effect of ecl_sig_on and ecl_sig_off is NOT thread-local. Thus during the clock time that ecl_sig_on is active (i.e., that ecl code is being executed), signals that are supposed to be handled by the sage signal handler will be handled in the wrong way.
That means it is NOT safe to execute sage code in a thread parallel to a thread that is executing ecl code (properly).
So, if we allow for multiple threads in sage, we'd strictly have to halt all the other threads upon executing ecl_sig_on, and start them again when the corresponding ecl_sig_off is entered.
That, or cross your fingers no signals destined for python arrive during that time period.
In addition, we're running ECL with threading support on their end *disabled*. I would be surprised if, with that configuration, it is still possible to have multiple threads configured to be able to execute ECL (ECL cares a lot about knowing which threads might be executing ECL code, because they need to be stopped during critical GC events. I expect that all of that is turned off when threading support is turned off).
Given that IPython apparently runs tab completion in a separate thread, I think the most straightforward way of solving the immediate problem here is to ensure that no ecl code is run upon tab completion. That can be done by building the completion cache at build time, rather than on demand.
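A rough sketch of that separation, assuming a pickle file as the cache format. The function names, the file name and `fake_backend_commands` are all hypothetical, and this is not how Sage stores maxima_commandlist_cache.sobj. The point is only that the completion path reads a file and never calls into the backend.

```python
import os
import pickle
import tempfile


def build_completion_cache(path, compute_commands):
    """Build step: compute the command list once and store it on disk."""
    with open(path, "wb") as f:
        pickle.dump(sorted(compute_commands()), f)


def completions_from_cache(path, prefix=""):
    """Completion step: only read the file, never call into the backend."""
    try:
        with open(path, "rb") as f:
            commands = pickle.load(f)
    except (OSError, pickle.UnpicklingError, EOFError):
        return []  # no usable cache: offer nothing rather than start the backend
    return [c for c in commands if c.startswith(prefix)]


def fake_backend_commands():
    # Hypothetical stand-in for asking Maxima for its command names.
    return ["integrate", "diff", "expand", "factor"]


cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "commandlist_cache.pickle")

missing = completions_from_cache(cache_path, "ex")    # cache not built yet
build_completion_cache(cache_path, fake_backend_commands)
found = completions_from_cache(cache_path, "ex")      # served from the file
```

The trade-off is that a missing cache yields no completions at all, which seems preferable to starting ECL from whatever thread happens to run the completer.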
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 4 years ago by
Replying to nbruin:
The effect of ecl_sig_on and ecl_sig_off is NOT thread-local. Thus during the clock time that ecl_sig_on is active (i.e., that ecl code is being executed), signals that are supposed to be handled by the sage signal handler will be handled in the wrong way.
That means it is NOT safe to execute sage code in a thread parallel to a thread that is executing ecl code (properly).
Right, I think I finally understand your point about signals---sorry for being thick.
It's even worse, I think - apart from signals, ecllib does non-thread-safe things to global variables...
It's known that in such a case the GIL does not suffice; you also need a lock from Python's threading module:
lock=threading.Lock()
with lock:
    <do unsafe (non-atomic) stuff here>
That is, we potentially might still get hit by many threads here, even if something seemingly innocent happens.
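For concreteness, here is a small runnable version of the lock pattern, using a counter as the shared state. `LockedCounter` and `hammer` are made-up names for illustration and have nothing to do with ecllib itself.

```python
import threading


class LockedCounter:
    """A counter whose read-modify-write update is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the updates would be lost.
        with self._lock:
            current = self.value
            self.value = current + 1


def hammer(counter, n_threads=4, n_steps=1000):
    """Increment the counter from several threads and wait for them all."""
    def work():
        for _ in range(n_steps):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


counter = LockedCounter()
total = hammer(counter)
```

A lock like this only protects code that actually acquires it, so it would not help against a library that touches its globals behind Python's back.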
To me it looks like disabling threads in tab completion is a more robust solution, and it will also make sure that other extensions are safe and sound in this respect, not only ECL/Maxima.
comment:17 in reply to: ↑ 16 Changed 4 years ago by
Replying to dimpase:
It's even worse, I think - apart from signals, ecllib does non-thread-safe things to global variables...
After initialization that should pretty much be limited to the modifications that are made to the ECL doubly linked list *SAGE-LIST-OF-OBJECTS*. The modifications run in ECL whenever an EclObject is made or deleted (so that should lock, probably). Otherwise I think the signal stuff is the main obstruction to thread-safety.
maximalib is a different issue: maxima is just not thread-safe in its design at all. So I don't think it's worth investing in making ecllib thread-safe (and the signals are a real obstruction), because our main application doesn't allow it anyway.
comment:18 in reply to: ↑ 7 Changed 4 years ago by
Replying to nbruin:
It uses signals to synchronize threads for critical GC operations.
It seems that ecl_sig_on() changes SIGINT, SIGBUS and SIGSEGV. Does it really use one of those standard signals to deal with GC operations? Because if none of those 3 signals are involved, the issue can't be signal handlers.
It is true that signals and threads generally do not mix well. Signal handlers are set on the level of the process, not threads.
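CPython makes the same point visible at the Python level: `signal.signal()` may only be called from the main thread and raises `ValueError` elsewhere. A small sketch follows (the helper name `try_install_handler_in_worker` is made up); it says nothing about what ECL or Sage do in C, where no such check exists.

```python
import signal
import threading


def try_install_handler_in_worker():
    """Try to set a SIGINT handler from a non-main thread and report the result."""
    outcome = {}

    def target():
        try:
            signal.signal(signal.SIGINT, signal.default_int_handler)
            outcome["installed"] = True
        except ValueError as exc:
            # CPython refuses: handlers belong to the process, and Python
            # only lets the main thread change them.
            outcome["installed"] = False
            outcome["error"] = str(exc)

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return outcome


outcome = try_install_handler_in_worker()
```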
comment:19 Changed 4 years ago by
- Description modified (diff)
comment:20 Changed 4 years ago by
I don't think that signal handling has anything to do with this bug here. I don't see any signals being raised in a strace dump and also the error message from ECL says "Stack overflow".
comment:21 Changed 4 years ago by
Wait a minute... the "stack overflow" reminds me of a very similar issue that affected PARI/GP: #17773
comment:22 follow-up: ↓ 23 Changed 4 years ago by
Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage. Thus if another thread does something to signals, then GC and thus ECL might go belly up.
"Stack overflow" might be due to GC being given data to work on from another thread it does not know about.
The more I think about it the more inclined I become towards disabling tab completion in a separate thread.
comment:23 in reply to: ↑ 22 ; follow-up: ↓ 24 Changed 4 years ago by
Replying to dimpase:
Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage.
I understand what you are saying but I don't believe that this has anything to do with this ticket.
comment:24 in reply to: ↑ 23 ; follow-up: ↓ 25 Changed 4 years ago by
Replying to jdemeyer:
Replying to dimpase:
Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage.
I understand what you are saying but I don't believe that this has anything to do with this ticket.
As Nils explains, Maxima is not thread-safe, and thus invoking it from a non-main thread (e.g. from the tab-completion one) is prone to errors. Thus invoking ECL from a non-main thread does not need to be allowed. Assuming this, the signals issue indeed has nothing to do with this ticket, at least if limited to the ECL/Maxima scope.
comment:25 in reply to: ↑ 24 Changed 4 years ago by
Right. There are two issues:
- The signal switching that Sage does for ECL is not thread-safe.
- The ECL check for "stack overflow" is broken if run in a different thread.
This ticket is about the second issue. It can be fixed independently.
comment:26 follow-up: ↓ 27 Changed 4 years ago by
It turns out that both issues are actually relevant. After fixing the second issue (in plain IPython, not Sage):
In [1]: import sage.all; from sage.interfaces.maxima_lib import maxima_lib
In [2]: maxima_lib.
Building Maxima command completion list (this takes a few seconds only the first time you do it).
To force rebuild later, delete /home/jdemeyer/.sage//maxima_commandlist_cache.sobj.
AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPower failure
The Power failure is clearly due to an ECL signal. When fixing also this, it still doesn't work:
Building Maxima command completion list (this takes a few seconds only the first time you do it).
To force rebuild later, delete /home/jdemeyer/.sage//maxima_commandlist_cache.sobj.
AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoCollecting from unknown thread
So it seems that running ECL only in the main thread is the only solution.
comment:27 in reply to: ↑ 26 ; follow-up: ↓ 28 Changed 4 years ago by
Replying to jdemeyer:
So it seems that running ECL only in the main thread is the only solution.
ECL does have facilities for registering/de-registering threads, ecl_import_current_thread and ecl_release_current_thread, only available if you build it with --enable-threads, and with somewhat unclear usage rules. Not even sure if they are compatible with our gc version, or whether they would work at all in our setting - I tried with gc-7.6.0 from #23700, it didn't work - perhaps due to the signals trouble you mention?
IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.
comment:28 in reply to: ↑ 27 ; follow-up: ↓ 33 Changed 4 years ago by
Replying to dimpase:
IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.
In that case: perhaps downgrade from blocker and solve later? The issue is a serious one, but the symptoms seem to be easily avoided (it's a rather specific tab completion).
Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage. That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)
comment:29 Changed 4 years ago by
I was just pointed to a resolved IPython issue I had missed, which reverts the reliance on prompt_toolkit and brings back the single-threaded behaviour of IPython: https://github.com/ipython/ipython/issues/10364/#issuecomment-300829008
There is another reason for avoiding prompt_toolkit - it does multi-threaded importing of Python modules, and given how fragile Sage is in its dependencies handling, this is something to avoid, unless we want more mysterious crashes to happen.
comment:30 follow-up: ↓ 31 Changed 4 years ago by
By the way, FriCAS is another (soon to be optional (see #23847), currently experimental) Sage package dependent on ECL in a substantial way.
comment:31 in reply to: ↑ 30 Changed 4 years ago by
Replying to dimpase:
By the way, FriCAS is another (soon to be optional (see #23847), currently experimental) Sage package dependent on ECL in a substantial way.
That shouldn't interact with the issue here at all, as long as FriCAS runs via a proper expect interface. If people start running FriCAS in ecllib, I expect bigger trouble, because I don't expect that one can run maxima and fricas in the same lisp without special measures -- both are legacy applications that were originally designed to have the world (or at least their process) to themselves.
Getting rid of multi-threading in IPython sounds like a very good idea.
comment:32 Changed 4 years ago by
Here is one way to try the old new IPython prompt---this gets rid of crashes for me.
This is only an IPython hack - I don't know how to force Sage's IPython to switch to this.
One can do the following (I also removed ~/.sage/ for good measure, not sure if this is needed; also not sure if it really needs 5.5.0, it also seems to work with the IPython 5.0 that we ship):
$ ./sage --pip install git+https://github.com/ipython/ipython/@5.5.0
$ ./sage --pip install rlipython
$ ./sage --ipython
once at the IPython prompt, type
import rlipython; rlipython.install()
and quit. Then IPython will use readline for completion, as in the good old days of version 4.x.
To test that this fixes the bug: start IPython as ./sage --ipython, and run
from sage.all import *
from sage.interfaces.maxima_lib import maxima_lib
maxima_lib.
(notice the 1st import---needed to initialise Sage).
comment:33 in reply to: ↑ 28 ; follow-up: ↓ 34 Changed 4 years ago by
Replying to nbruin:
Replying to dimpase:
IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.
In that case: perhaps downgrade from blocker and solve later? The issue is a serious one, but the symptoms seem to be easily avoided (it's a rather specific tab completion).
Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage. That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)
One thing I've been investigating--the relevance of which I'm not sure--is that we compile libgc with threading support but ECL without. The implications of this are complicated enough that I don't fully understand yet, but it makes me wonder if this can lead to bugs (this is possibly related to #23973).
comment:34 in reply to: ↑ 33 Changed 4 years ago by
Replying to embray:
Replying to nbruin:
Replying to dimpase:
...
Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage.
IMHO one takes care of this in ecl.pyx:
ecl_set_option(ECL_OPT_SIGNAL_HANDLING_THREAD, 0)
cl_boot(1, argv)
making sure that signals are not handled in a separate thread, no?
That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)
One thing I've been investigating--the relevance of which I'm not sure--is that we compile libgc with threading support but ECL without. The implications of this are complicated enough that I don't fully understand yet, but it makes me wonder if this can lead to bugs (this is possibly related to #23973).
Mind you, I came to this ticket via #23956 via #22679; on the latter I had a lot of trouble with threads (yes, in docbuilding with -jx, x>1 too), until finding out that GC folks have not supplied a complete multithreading interface for FreeBSD---now fixed in https://github.com/ivmai/bdwgc/issues/180 and only then realising that some multithreading-related segfaults happen on Linux too :-)
I am not sure what "multithreading for GC" really means; it can be any combination of 2 things:
1) GC using threads to speed itself up
2) GC properly handles the situation of being initialised/called from a multithreaded application.
IMHO 1) is disabled by --disable-parallel-mark (in perhaps more recent than 7.2f versions...).
comment:35 Changed 4 years ago by
- Priority changed from blocker to critical
I don't think that this should be a blocker issue. It's an annoying bug which crashes Sage, but it's not very likely to appear since the TAB completion is cached.
comment:36 Changed 4 years ago by
I already mentioned in comment 9 above that even with the TAB cache present, one can get an annoying runtime error; to replicate,
sage: from sage.libs.ecl import <TAB>
and choose something from the list that pops up, then the following input leads to a runtime error.
sage: from sage.interfaces.maxima_lib import *
Collecting from unknown thread
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-5e6d4a068396> in <module>()
----> 1 from sage.interfaces.maxima_lib import *

/home/dima/Sage/sage-dev/local/lib/python2.7/site-packages/sage/interfaces/maxima_lib.py in <module>()
    102 ## i.e. loading it into ECL
    103 ecl_eval("(setf *load-verbose* NIL)")
--> 104 ecl_eval("(require 'maxima)")
    105 ecl_eval("(in-package :maxima)")
    106 ecl_eval("(setq $nolabels t))")

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10787)()
   1320
   1321 #convenience routine to more easily evaluate strings
-> 1322 cpdef EclObject ecl_eval(bytes s):
   1323     """
   1324     Read and evaluate string in Lisp and return the result

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10726)()
   1335     cdef cl_object o
   1336     o=ecl_safe_read_string(s)
-> 1337     o=ecl_safe_eval(o)
   1338     return ecl_wrap(o)
   1339

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_safe_eval (build/cythonized/sage/libs/ecl.c:5716)()
    341     """
    342     cdef cl_object s
--> 343     ecl_sig_on()
    344     cl_funcall(2,safe_eval_clobj,form)
    345     ecl_sig_off()

RuntimeError: Aborted
On the positive side, ipython folks are apparently going to disable tab completion in a separate thread, once a version of prompt_toolkit with the relevant option is released.
comment:37 follow-up: ↓ 38 Changed 4 years ago by
It appears that this has nuked the GAP pexpect interface, too: hitting Tab at
sage: gap.
leads to
Warning: this should never happen
printed ABOVE the line
sage: gap.
and Sage becomes unresponsive and has to be killed. (this is with Sage 8.2.beta5)
comment:38 in reply to: ↑ 37 Changed 4 years ago by
Replying to dimpase:
It appears that this has nuked GAP pexpect interface, too: hitting Tab at
sage: gap.
leads to
Warning: this should never happen
printed ABOVE the line
sage: gap.
and Sage becomes irresponsive and has to be killed. (this is with Sage 8.2.beta5)
This doesn't happen to me with Sage 8.2.beta6, or maybe I don't understand the necessary steps. If I run Sage and then immediately run "gap.<TAB>", it works fine (OS X 10.13.3). Am I missing some aspect of triggering this?
comment:39 follow-up: ↓ 41 Changed 4 years ago by
Confirmed that gap. works on OpenSuSE with beta6. But maxima_lib. as in the ticket description still crashes.
comment:40 Changed 4 years ago by
Moreover if I rm -f ~/.sage/giac_commandlist_cache.sobj then giac.<TAB> does not crash. Probably the peculiar properties of ECL make maxima a special case.
comment:41 in reply to: ↑ 39 Changed 4 years ago by
Replying to rws:
Confirmed that gap. works on OpenSuSE with beta6. But maxima_lib. as in the ticket description still crashes.
Sorry, it appears that in the case of gap. I have been barking up the wrong tree.
comment:42 Changed 10 months ago by
- Milestone changed from sage-8.1 to sage-duplicate/invalid/wontfix
- Status changed from new to needs_review
all is good in Sage 9.3.rc1
comment:43 Changed 10 months ago by
- Reviewers set to Dima Pasechnik
- Status changed from needs_review to positive_review
comment:44 Changed 7 months ago by
- Resolution set to invalid
- Status changed from positive_review to closed
I think the crash happens in line 281 in /home/ralf/sage/local/var/tmp/sage/build/ecl-16.1.2.p2/src/src/c/interpreter.d.