Opened 5 years ago

Closed 7 months ago

#22766 closed defect (invalid)

Trying completion list of maxima_lib crashes Sage

Reported by: rws
Owned by:
Priority: critical
Milestone: sage-duplicate/invalid/wontfix
Component: interfaces
Keywords:
Cc: jdemeyer
Merged in:
Authors:
Reviewers: Dima Pasechnik
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies: #23956
Stopgaps:


Description (last modified by jdemeyer)

If Maxima's command list is not cached, then initialising Maxima/ECL and pressing TAB after typing maxima_lib. crashes Sage, as shown below. Other similar crashes may be triggered; see e.g. #23956.

The reason for these crashes is the design of tab completion in IPython 5+ using prompt_toolkit, which uses Python threading, and does tab completion in a separate thread.

$ rm -f ~/.sage/maxima_commandlist_cache.sobj
$ sage
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 8.0.beta0, Release Date: 2017-03-30               │
│ Type "notebook()" for the browser-based notebook interface.        │
│ Type "help()" for help.                                            │
└────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Warning: this is a prerelease version, and it may be unstable.     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
sage: from sage.interfaces.maxima_lib import maxima_lib
sage: maxima_lib.
Building Maxima command completion list (this takes
a few seconds only the first time you do it).
To force rebuild later, delete /home/ralf/.sage//maxima_commandlist_cache.sobj.
A
;;;
;;; Stack overflow.
;;; Jumping to the outermost toplevel prompt
;;;


Internal or unrecoverable error in:

;;;
;;; No frame to jump to
;;; Aborting ECL
;;;

;;; ECL C Backtrace
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_dump_c_backtrace+0x26) [0x7f012]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_internal_error+0x3f) [0x7f0127]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(FEerror+0) [0x7f0127706f20]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x1adb1a) [0x7f012772eb1a]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x12b3c5) [0x7f01276ac3c5]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_funcall+0x70) [0x7f01276e7c90]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_serror+0xd9) [0x7f0127708299]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_cs_overflow+0xac) [0x7f012772e]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(+0x12b3c5) [0x7f01276ac3c5]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_funcall+0x70) [0x7f01276e7c90]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(si_serror+0xd9) [0x7f0127708299]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_cs_overflow+0xac) [0x7f012772e]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(ecl_interpret+0x1d67) [0x7f01276ea]
;;; /home/ralf/sage/local/lib/libecl.so.16.1(cl_apply+0x145) [0x7f01276e7e95]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0xcf1d)]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x15511]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x15d6b]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0x16d47]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(+0xc0c7f) [0x7f038ea92c7f]
;;; /home/ralf/sage/local/lib/python2.7/site-packages/sage/libs/ecl.so(+0xcad8)]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyObject_Call+0x43) [0x7f038e]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x56da) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8020) [0]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x81c) [0x7]
;;; /home/ralf/sage/local/lib/libpython2.7.so.1.0(+0x87ecc) [0x7f038ea59ecc]
Aborted (core dumped)

Change History (44)

comment:1 Changed 5 years ago by rws

I think the crash happens in line 281 in /home/ralf/sage/local/var/tmp/sage/build/ecl-16.1.2.p2/src/src/c/interpreter.d.

comment:2 Changed 4 years ago by dimpase

This happens because, since IPython 5.*, tab completion runs in a different thread. Here you initialise Maxima in the main thread, but then run Maxima (by triggering tab completion) in a separate thread.

See also #23700 and #23956 for a different way to trigger the same bug.
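
The situation described in this comment can be sketched in plain Python, without Sage, IPython or ECL. The Backend class and run_in_worker helper below are hypothetical stand-ins: the first remembers which thread initialised it, the second runs a callback in a fresh thread the way a threaded completer might.

```python
import threading


class Backend:
    """Hypothetical stand-in for a library that remembers its initialising thread."""

    def __init__(self):
        self.init_thread = threading.current_thread()

    def called_from_init_thread(self):
        # True only when the caller is the thread that created this object.
        return threading.current_thread() is self.init_thread


def run_in_worker(func):
    """Run func in a fresh thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(func()))
    worker.start()
    worker.join()
    return result[0]


backend = Backend()  # initialised in the calling thread
same_thread = backend.called_from_init_thread()
from_worker = run_in_worker(backend.called_from_init_thread)
print(same_thread, from_worker)
```

A library that silently assumes "the thread that initialised me is the thread that calls me" has no way of noticing the second case unless it checks explicitly.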

comment:3 follow-up: Changed 4 years ago by jhpalmieri

On OS X (10.12.6, Xcode 8.3.3, Sage 8.1.beta6), I don't get this crash, and I also don't see the message "Building Maxima command completion list ...": I just immediately get a list of completions.

comment:4 in reply to: ↑ 3 Changed 4 years ago by dimpase

Replying to jhpalmieri:

On OS X (10.12.6, Xcode 8.3.3, Sage 8.1.beta6), I don't get this crash, and I also don't see the message "Building Maxima command completion list ...": I just immediately get a list of completions.

It's because you already have this list built and stored somewhere in ~/.sage, I suppose. Try moving it out of the way first.

comment:5 follow-up: Changed 4 years ago by jhpalmieri

Okay, I get the crash after deleting .sage/maxima_commandlist_cache.sobj. Do you know why starting Sage and then doing maxima.<TAB> doesn't trigger the crash, but instead rebuilds this file?

comment:6 in reply to: ↑ 5 Changed 4 years ago by dimpase

Replying to jhpalmieri:

Okay, I get the crash after deleting .sage/maxima_commandlist_cache.sobj. Do you know why starting Sage and then doing maxima.<TAB> doesn't trigger the crash, but instead rebuilds this file?

Yes, it is because in this case everything happens in the same thread (the one of the tab completion).

comment:7 follow-up: Changed 4 years ago by nbruin

Making maxima_lib "thread-safe" would consist of *locking* it to one thread. The signal management switching that happens upon entering/exiting ecllib makes it fundamentally incompatible with multi-threading, because signal handlers are process-specific, not thread-specific.

Sage installs special signal handlers (for SIGINT, for instance), and so does ECL. If ECL runs with multi-threading, ECL even goes further with signal handling (it makes a dedicated signal handling thread), and it uses signals to synchronize threads for critical GC operations.

If you want to get ecllib to a state where it can safely be used in a multi-threaded environment, I think one would have to unify the signal management of sage and ecl.

The result would not actually make maxima_lib threadsafe, because maxima itself is rather fundamentally not thread-safe.
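
The point that signal handlers belong to the process rather than to a thread can be observed from plain Python. This is only an illustration using CPython's signal module, not the C-level switching done in ecllib; the two handler names are hypothetical. It has to be run from the main thread, because CPython only allows signal.signal there.

```python
import signal
import threading


def sage_like_handler(signum, frame):
    """Hypothetical handler standing in for the one Sage installs."""


def ecl_like_handler(signum, frame):
    """Hypothetical handler standing in for the one the library installs."""


previous = signal.signal(signal.SIGINT, sage_like_handler)
seen_by_worker = []


def worker():
    # The worker sees whatever handler the process currently has installed.
    seen_by_worker.append(signal.getsignal(signal.SIGINT))
    try:
        # CPython refuses to change handlers outside the main thread.
        signal.signal(signal.SIGINT, sage_like_handler)
    except ValueError as exc:
        seen_by_worker.append(exc)


# Swap handlers in the main thread, as happens when entering the library.
signal.signal(signal.SIGINT, ecl_like_handler)
t = threading.Thread(target=worker)
t.start()
t.join()

# Restore the original handler (None means it was not installed from Python).
if previous is not None:
    signal.signal(signal.SIGINT, previous)
print(seen_by_worker)
```

The worker observes the swapped handler even though it never asked for it, which is the sense in which the switch is not thread-local.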

comment:8 Changed 4 years ago by rws

Removing dependencies seems so much more promising, see https://trac.sagemath.org/wiki/symbolics/maxima

comment:9 Changed 4 years ago by dimpase

Here is another scenario with two threads, which only leads to an abort, not to a segfault. Here I tab-complete from sage.libs.ecl import to force initialisation in a non-main thread.

sage: from sage.libs.ecl import                        
 at  init_ecl
 thread id  <Thread(Thread-32, started 139989416191744)> 
 active threads  [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>, <Thread(Thread-32, started 139989416191744)>] 
sage: from sage.libs.ecl import *
sage: from sage.interfaces.maxima_lib import *
 at  ecl_eval
 thread id  <_MainThread(MainThread, started 140000333952768)> 
 active threads  [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>] 
 at  ecl_safe_eval
 thread id  <_MainThread(MainThread, started 140000333952768)> 
 active threads  [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>] 
 at  ecl_eval
 thread id  <_MainThread(MainThread, started 140000333952768)> 
 active threads  [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>] 
 at  ecl_safe_eval
 thread id  <_MainThread(MainThread, started 140000333952768)> 
 active threads  [<_MainThread(MainThread, started 140000333952768)>, <HistorySavingThread(IPythonHistorySavingThread, started 140000088241920)>] 
Collecting from unknown thread
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-5e6d4a068396> in <module>()
----> 1 from sage.interfaces.maxima_lib import *

/home/dima/Sage/sage-dev/local/lib/python2.7/site-packages/sage/interfaces/maxima_lib.py in <module>()
    102 ## i.e. loading it into ECL
    103 ecl_eval("(setf *load-verbose* NIL)")
--> 104 ecl_eval("(require 'maxima)")
    105 ecl_eval("(in-package :maxima)")
    106 ecl_eval("(setq $nolabels t))")

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10977)()
   1328 
   1329 #convenience routine to more easily evaluate strings
-> 1330 cpdef EclObject ecl_eval(bytes s):
   1331     """
   1332     Read and evaluate string in Lisp and return the result

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10916)()
   1344     cdef cl_object o
   1345     o=ecl_safe_read_string(s)
-> 1346     o=ecl_safe_eval(o)
   1347     return ecl_wrap(o)
   1348 

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_safe_eval (build/cythonized/sage/libs/ecl.c:5710)()
    343     report_threading_status("ecl_safe_eval")
    344     cdef cl_object s
--> 345     ecl_sig_on()
    346     cl_funcall(2,safe_eval_clobj,form)
    347     ecl_sig_off()

RuntimeError: Aborted
sage: 

with the following prints inserted into ecl.pyx:

diff --git a/src/sage/libs/ecl.pyx b/src/sage/libs/ecl.pyx
index 20e937876d..879d405d78 100644
--- a/src/sage/libs/ecl.pyx
+++ b/src/sage/libs/ecl.pyx
@@ -240,6 +240,7 @@ def init_ecl():
     cdef sigaction_t sage_action[32]
     cdef int i
 
+    report_threading_status("init_ecl")
     if ecl_has_booted:
         raise RuntimeError("ECL is already initialized")
 
@@ -339,6 +340,7 @@ cdef cl_object ecl_safe_eval(cl_object form) except NULL:
         ...
         RuntimeError: ECL says: Console interrupt.
     """
+    report_threading_status("ecl_safe_eval")
     cdef cl_object s
     ecl_sig_on()
     cl_funcall(2,safe_eval_clobj,form)
@@ -1318,6 +1320,12 @@ cdef EclObject ecl_wrap(cl_object o):
     obj.set_obj(o)
     return obj
 
+cpdef report_threading_status(s):
+    import threading
+    print("\n at ", s)
+    print("\n thread id ", threading.current_thread(), "\n")
+    print(" active threads ", threading.enumerate(), "\n")
+
 #convenience routine to more easily evaluate strings
 cpdef EclObject ecl_eval(bytes s):
     """
@@ -1332,6 +1340,7 @@ cpdef EclObject ecl_eval(bytes s):
         <ECL: (1 1 2 3 5 8 13)>
 
     """
+    report_threading_status("ecl_eval")
     cdef cl_object o
     o=ecl_safe_read_string(s)
     o=ecl_safe_eval(o)
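
The debugging helper added in the diff above can also be tried outside Cython. The following is a pure-Python variant of report_threading_status; returning the collected values (in addition to printing them) is an addition made here so that the result can be inspected.

```python
import threading


def report_threading_status(label):
    """Print and return the current thread and the list of live threads."""
    current = threading.current_thread()
    active = threading.enumerate()
    print(" at ", label)
    print(" thread id ", current)
    print(" active threads ", active)
    return current, active


# Called directly: reports the calling thread.
main_report = report_threading_status("main")

# Called from a worker: reports the worker, which is also in the active list.
worker_report = []
t = threading.Thread(
    target=lambda: worker_report.append(report_threading_status("worker"))
)
t.start()
t.join()
```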

comment:10 Changed 4 years ago by dimpase

I think what we see here is ECL being initialised in a thread (number 32) that later is shut down, and then maxima_lib import breaks, as ECL isn't available to run.

It seems that indeed we must make sure that ECL is always started in the main thread, which does not disappear.
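
One way to enforce "always start in the main thread" is an explicit guard at initialisation time. The sketch below is hypothetical and is not how sage.libs.ecl is written; MainThreadOnly and boot are illustrative names.

```python
import threading


class MainThreadOnly:
    """Hypothetical wrapper that refuses to initialise outside the main thread."""

    def __init__(self):
        self.booted = False

    def boot(self):
        if threading.current_thread() is not threading.main_thread():
            raise RuntimeError("refusing to initialise outside the main thread")
        self.booted = True


guard = MainThreadOnly()
errors = []


def try_boot_in_worker():
    # Simulates initialisation being triggered from a short-lived thread.
    try:
        guard.boot()
    except RuntimeError as exc:
        errors.append(str(exc))


t = threading.Thread(target=try_boot_in_worker)
t.start()
t.join()
print(errors, guard.booted)
```

With such a guard the failure would show up as an ordinary Python exception at the point of the wrong call, rather than later as "Collecting from unknown thread".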

comment:11 Changed 4 years ago by dimpase

  • Milestone changed from sage-8.0 to sage-8.1
  • Priority changed from critical to blocker

A part of the relevant discussion is on #23956, which I'll close as duplicate.

comment:12 Changed 4 years ago by dimpase

  • Dependencies set to #23956

comment:13 Changed 4 years ago by dimpase

  • Description modified (diff)

comment:14 Changed 4 years ago by dimpase

  • Cc jdemeyer added

I wonder whether any other extension (apart from ECL/Maxima) is affected by this issue.

comment:15 follow-up: Changed 4 years ago by nbruin

Reiterating from #23956:

The effect of ecl_sig_on and ecl_sig_off is NOT thread-local. Thus during the clock time that ecl_sig_on is active (i.e., that ecl code is being executed), signals that are supposed to be handled by the sage signal handler will be handled in the wrong way.

That means it is NOT safe to execute sage code in a thread parallel to a thread that is executing ecl code (properly).

So, if we allow for multiple threads in sage, we'd strictly have to halt all the other threads upon executing ecl_sig_on, and start them again when the corresponding ecl_sig_off is entered. That, or cross your fingers no signals destined for python arrive during that time period.

In addition, we're running ECL with threading support on their end *disabled*. I would be surprised if, with that configuration, it is still possible to have multiple threads configured to be able to execute ECL (ECL cares a lot about knowing which threads might be executing ECL code, because they need to be stopped during critical GC events. I expect that all of that is turned off when threading support is turned off).

Given that IPython apparently runs tab completion in a separate thread, I think the most straightforward way of solving the immediate problem here is to avoid that ecl code will be run upon tab completion. That can be done by building the completion cache upon build time, rather than on-demand.
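
The proposal of building the completion cache ahead of time, so that completion never has to call into the backend, might look roughly like this. Everything here is illustrative: expensive_command_list stands in for querying Maxima, and the JSON file in a temporary directory stands in for the real cache (Sage stores an .sobj file under ~/.sage).

```python
import json
import os
import tempfile


def expensive_command_list():
    """Hypothetical stand-in for asking the backend for its command names."""
    return sorted(["integrate", "diff", "expand", "factor"])


def build_cache(path):
    """Run once, e.g. at build or install time, from the main thread."""
    with open(path, "w") as f:
        json.dump(expensive_command_list(), f)


def completions(path, prefix):
    """Completion-time lookup: reads the cache file only, never calls the backend."""
    with open(path) as f:
        commands = json.load(f)
    return [c for c in commands if c.startswith(prefix)]


cache_path = os.path.join(tempfile.mkdtemp(), "commandlist_cache.json")
build_cache(cache_path)
print(completions(cache_path, "ex"))
```

The lookup function only reads a file, so it stays safe to call from whichever thread the completer happens to use.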

comment:16 in reply to: ↑ 15 ; follow-up: Changed 4 years ago by dimpase

Replying to nbruin:

The effect of ecl_sig_on and ecl_sig_off is NOT thread-local. Thus during the clock time that ecl_sig_on is active (i.e., that ecl code is being executed), signals that are supposed to be handled by the sage signal handler will be handled in the wrong way.

That means it is NOT safe to execute sage code in a thread parallel to a thread that is executing ecl code (properly).

Right, I think I finally understand your point about signals---sorry for being thick.

It's even worse, I think - apart from signals, ecllib does non-thread-safe things to global variables... It's known that in such a case the GIL does not suffice; you also need a lock from Python threading:

import threading

lock = threading.Lock()
with lock:
    pass  # do unsafe (non-atomic) stuff here

That is, we might still get hit by multiple threads here, even if something seemingly innocent happens.

To me, disabling threads in tab completion looks like a more robust solution, and it would also make sure that other extensions are safe and sound in this respect, not only ECL/Maxima.

comment:17 in reply to: ↑ 16 Changed 4 years ago by nbruin

Replying to dimpase:

It's even worse, I think - apart from signals, ecllib does non-thread-safe things to global variables...

After initialization that should pretty much be limited to the modifications that are made to the ECL doubly linked list *SAGE-LIST-OF-OBJECTS*. The modifications run in ECL whenever an EclObject is made or deleted (so that should lock, probably). Otherwise I think the signal stuff is the main obstruction to thread-safety.

maximalib is a different issue: maxima is just not thread-safe in its design at all. So I don't think it's worth investing in making ecllib thread-safe (and the signals are a real obstruction), because our main application doesn't allow it anyway.
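
The remark that updates to the list of live objects "should lock, probably" can be illustrated with a lock-guarded registry. This is a hypothetical Python analogue: a set stands in for the Lisp-side doubly linked list, and ObjectRegistry is not a name from ecllib.

```python
import threading


class ObjectRegistry:
    """Hypothetical registry of live objects; every mutation takes the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = set()

    def register(self, obj_id):
        with self._lock:
            self._objects.add(obj_id)

    def unregister(self, obj_id):
        with self._lock:
            self._objects.discard(obj_id)

    def __len__(self):
        with self._lock:
            return len(self._objects)


registry = ObjectRegistry()


def churn(start):
    # Register 1000 ids, then unregister every second one (500 of them).
    for i in range(start, start + 1000):
        registry.register(i)
    for i in range(start, start + 1000, 2):
        registry.unregister(i)


threads = [threading.Thread(target=churn, args=(k * 1000,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

remaining = len(registry)
print(remaining)
```

As the comment says, this would only address the bookkeeping; it does nothing about the signal switching or about Maxima itself.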

comment:18 in reply to: ↑ 7 Changed 4 years ago by jdemeyer

Replying to nbruin:

It uses signals to synchronize threads for critical GC operations.

It seems that ecl_sig_on() changes SIGINT, SIGBUS and SIGSEGV. Does it really use one of those standard signals to deal with GC operations? Because if none of those 3 signals are involved, the issue can't be signal handlers.

It is true that signals and threads generally do not mix well. Signal handlers are set on the level of the process, not threads.

comment:19 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:20 Changed 4 years ago by jdemeyer

I don't think that signal handling has anything to do with this bug here. I don't see any signals being raised in a strace dump and also the error message from ECL says "Stack overflow".

comment:21 Changed 4 years ago by jdemeyer

Wait a minute... the "stack overflow" reminds me of a very similar issue that affected PARI/GP: #17773

comment:22 follow-up: Changed 4 years ago by dimpase

Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage. Thus if another thread does something to signals, then GC and thus ECL might go belly up.

"Stack overflow" might be due to GC being given data to work on from another thread it does not know about.

The more I think about it the more inclined I become towards disabling tab completion in a separate thread.

comment:23 in reply to: ↑ 22 ; follow-up: Changed 4 years ago by jdemeyer

Replying to dimpase:

Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage.

I understand what you are saying but I don't believe that this has anything to do with this ticket.

comment:24 in reply to: ↑ 23 ; follow-up: Changed 4 years ago by dimpase

Replying to jdemeyer:

Replying to dimpase:

Boehm GC (which is used by ECL) installs its own signal handlers in order to be able to scan for garbage.

I understand what you are saying but I don't believe that this has anything to do with this ticket.

As Nils explains, Maxima is not thread-safe, and thus invoking it from non-main thread (e.g. from the tab-completion one) is prone to errors. Thus invoking ECL from non-main thread does not need to be allowed. Assuming this, indeed, signals issue has nothing to do with this ticket, at least if limited to ECL/Maxima scope.

comment:25 in reply to: ↑ 24 Changed 4 years ago by jdemeyer

Right. There are two issues:

  1. The signal switching that Sage does for ECL is not thread-safe.
  2. The ECL check for "stack overflow" is broken if run in a different thread.

This ticket is about the second issue. It can be fixed independently.

comment:26 follow-up: Changed 4 years ago by jdemeyer

It turns out that both issues are actually relevant. After fixing the second issue (in plain IPython, not Sage):

In [1]: import sage.all; from sage.interfaces.maxima_lib import maxima_lib

In [2]: maxima_lib.
Building Maxima command completion list (this takes
a few seconds only the first time you do it).
To force rebuild later, delete /home/jdemeyer/.sage//maxima_commandlist_cache.sobj.
AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPower failure

The Power failure is clearly due to an ECL signal. When fixing also this, it still doesn't work:

Building Maxima command completion list (this takes
a few seconds only the first time you do it).
To force rebuild later, delete /home/jdemeyer/.sage//maxima_commandlist_cache.sobj.
AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoCollecting from unknown thread

So it seems that running ECL only in the main thread is the only solution.
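
A related pattern, not what this ticket settles on, is to confine a non-thread-safe library to one dedicated thread and route every call through it. The sketch below uses a single-worker ThreadPoolExecutor for that; fake_eval and call_in_owner_thread are hypothetical names, and the sketch says nothing about signal handling or about whether ECL would tolerate a non-main owner thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A pool with exactly one worker: every submitted call runs on the same thread.
_owner = ThreadPoolExecutor(max_workers=1)


def call_in_owner_thread(func, *args):
    """Run func(*args) on the owner thread and wait for the result."""
    return _owner.submit(func, *args).result()


def fake_eval(expr):
    """Hypothetical stand-in for a call into a non-thread-safe library."""
    return expr.upper(), threading.get_ident()


results = []


def caller(expr):
    results.append(call_in_owner_thread(fake_eval, expr))


threads = [threading.Thread(target=caller, args=(e,)) for e in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
results.append(call_in_owner_thread(fake_eval, "d"))  # from the current thread

owner_ids = {ident for _, ident in results}
_owner.shutdown()
print(sorted(value for value, _ in results), len(owner_ids))
```

One caveat of this design: a function running on the owner thread must not call call_in_owner_thread itself, since it would wait forever on the only worker.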

comment:27 in reply to: ↑ 26 ; follow-up: Changed 4 years ago by dimpase

Replying to jdemeyer:

So it seems that running ECL only in the main thread is the only solution.

ECL does have facilities for registering/de-registering threads, ecl_import_current_thread and ecl_release_current_thread, only available if you build it with --enable-threads, and with somewhat unclear usage rules. Not even sure if they are compatible with our gc version, or whether they would work at all in our setting - I tried with gc-7.6.0 from #23700, it didn't work - perhaps due to the signals trouble you mention?

IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.

comment:28 in reply to: ↑ 27 ; follow-up: Changed 4 years ago by nbruin

Replying to dimpase:

IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.

In that case: perhaps downgrade from blocker and solve later? The issue is a serious one, but the symptoms seem to be easily avoided (it's a rather specific tab completion).

Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage. That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)

comment:29 Changed 4 years ago by dimpase

I was just pointed to a resolved IPython issue that I had missed; it removes the reliance on prompt_toolkit and brings back the single-threaded behaviour of IPython: https://github.com/ipython/ipython/issues/10364/#issuecomment-300829008

There is another reason for avoiding prompt_toolkit - it does multi-threaded importing of Python modules, and given how fragile Sage is in its dependencies handling, this is something to avoid, unless we want more mysterious crashes to happen.

comment:30 follow-up: Changed 4 years ago by dimpase

By the way, FriCAS is another (soon to be optional (see #23847), currently experimental) Sage package dependent on ECL in a substantial way.

comment:31 in reply to: ↑ 30 Changed 4 years ago by nbruin

Replying to dimpase:

By the way, FriCAS is another (soon to be optional (see #23847), currently experimental) Sage package dependent on ECL in a substantial way.

That shouldn't interact with the issue here at all, as long as FriCAS runs via a proper expect interface. If people start running FriCAS in ecllib, I expect bigger trouble, because I don't expect that one can run maxima and fricas in the same lisp without special measures -- both are legacy applications that were originally designed to have the world (or at least their process) to themselves.

Getting rid of multi-threading in IPython sounds like a very good idea.

comment:32 Changed 4 years ago by dimpase

Here is one way to try the old (readline-based) IPython prompt; this gets rid of the crashes for me. This is only an IPython hack: I don't know how to force Sage's IPython to switch to this. One can do the following (I also removed ~/.sage/ for good measure, not sure if this is needed; also not sure if it really needs 5.5.0, as it also seems to work with the IPython 5.0 that we ship):

$ ./sage --pip install git+https://github.com/ipython/ipython/@5.5.0
$ ./sage --pip install rlipython
$ ./sage --ipython

once at IPython prompt, type

import rlipython; rlipython.install()

and quit. Then IPython will use readline for completion, as in good old days of version 4.x. To test that this fixes the bug: start IPython as ./sage --ipython, and run

from sage.all import *
from sage.interfaces.maxima_lib import maxima_lib
maxima_lib.

(notice the 1st import---needed to initialise Sage).


comment:33 in reply to: ↑ 28 ; follow-up: Changed 4 years ago by embray

Replying to nbruin:

Replying to dimpase:

IMHO making this work seems to be a tough call, and in particular in the upcoming ECL 16.2 this code (and the signals-handling code) is being changed, so what works for 16.1.2 might break in the next version.

In that case: perhaps downgrade from blocker and solve later? The issue is a serious one, but the symptoms seem to be easily avoided (it's a rather specific tab completion).

Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage. That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)

One thing I've been investigating--the relevance of which I'm not sure--is that we compile libgc with threading support but ECL without. The implications of this are complicated enough that I don't fully understand yet, but it makes me wonder if this can lead to bugs (this is possibly related to #23973).

comment:34 in reply to: ↑ 33 Changed 4 years ago by dimpase

Replying to embray:

Replying to nbruin:

Replying to dimpase:

...

Concerning threading: at least a while ago, enabling threading in ECL meant that ECL would start up a dedicated signal handling thread, and really start using (very strange!) signals to signal GC events to other threads. That setup looked very hard to make compatible with sage.

IMHO one takes care of this in ecl.pyx:

    ecl_set_option(ECL_OPT_SIGNAL_HANDLING_THREAD, 0)
    cl_boot(1, argv)

making sure that signals are not handled in a separate thread, no?

That's why I think we want to stick with ECL *without* threading (and hope they keep supporting that! They really should if they want to keep the "embeddable" a serious option, because in many embedding scenarios having the library take control of signal handling in such an invasive way will be very hard to work with.)

One thing I've been investigating--the relevance of which I'm not sure--is that we compile libgc with threading support but ECL without. The implications of this are complicated enough that I don't fully understand yet, but it makes me wonder if this can lead to bugs (this is possibly related to #23973).

Mind you, I came to this ticket via #23956 via #22679; on the latter I had a lot of trouble with threads (yes, in docbuilding with -jx, x>1 too), until I found out that the GC folks had not supplied a complete multithreading interface for FreeBSD (now fixed in https://github.com/ivmai/bdwgc/issues/180), and only then realised that some multithreading-related segfaults happen on Linux too :-) I am not sure what "multithreading for GC" really means; it can be any combination of two things:

1) GC using threads to speed itself up

2) GC properly handles the situation of being initialised/called from a multithreaded application.

IMHO 1) is disabled by --disable-parallel-mark (perhaps only in versions more recent than 7.2f...).

comment:35 Changed 4 years ago by jdemeyer

  • Priority changed from blocker to critical

I don't think that this should be a blocker issue. It's an annoying bug which crashes Sage, but it's not very likely to appear since the TAB completion list is cached.

comment:36 Changed 4 years ago by dimpase

I already mentioned in comment 9 above that even with the TAB cache present, one can get an annoying runtime error; to replicate,

sage: from sage.libs.ecl import <TAB>

and choose something from the list that pops up, then the following input leads to a runtime error.

sage: from sage.interfaces.maxima_lib import *
Collecting from unknown thread
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-5e6d4a068396> in <module>()
----> 1 from sage.interfaces.maxima_lib import *

/home/dima/Sage/sage-dev/local/lib/python2.7/site-packages/sage/interfaces/maxima_lib.py in <module>()
    102 ## i.e. loading it into ECL
    103 ecl_eval("(setf *load-verbose* NIL)")
--> 104 ecl_eval("(require 'maxima)")
    105 ecl_eval("(in-package :maxima)")
    106 ecl_eval("(setq $nolabels t))")

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10787)()
   1320 
   1321 #convenience routine to more easily evaluate strings
-> 1322 cpdef EclObject ecl_eval(bytes s):
   1323     """
   1324     Read and evaluate string in Lisp and return the result

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_eval (build/cythonized/sage/libs/ecl.c:10726)()
   1335     cdef cl_object o
   1336     o=ecl_safe_read_string(s)
-> 1337     o=ecl_safe_eval(o)
   1338     return ecl_wrap(o)
   1339 

/home/dima/Sage/sage-dev/src/sage/libs/ecl.pyx in sage.libs.ecl.ecl_safe_eval (build/cythonized/sage/libs/ecl.c:5716)()
    341     """
    342     cdef cl_object s
--> 343     ecl_sig_on()
    344     cl_funcall(2,safe_eval_clobj,form)
    345     ecl_sig_off()

RuntimeError: Aborted

On the positive side, ipython folks are apparently going to disable tab completion in a separate thread, once a version of prompt_toolkit with the relevant option is released.

comment:37 follow-up: Changed 4 years ago by dimpase

It appears that this has nuked the GAP pexpect interface, too: hitting Tab at sage: gap. leads to "Warning: this should never happen" being printed ABOVE the line

sage: gap.

and Sage becomes unresponsive and has to be killed. (This is with Sage 8.2.beta5.)


comment:38 in reply to: ↑ 37 Changed 4 years ago by jhpalmieri

Replying to dimpase:

It appears that this has nuked the GAP pexpect interface, too: hitting Tab at sage: gap. leads to "Warning: this should never happen" being printed ABOVE the line

sage: gap.

and Sage becomes unresponsive and has to be killed. (This is with Sage 8.2.beta5.)

This doesn't happen to me with Sage 8.2.beta6, or maybe I don't understand the necessary steps. If I run Sage and then immediately run "gap.<TAB>", it works fine (OS X 10.13.3). Am I missing some aspect of triggering this?

comment:39 follow-up: Changed 4 years ago by rws

Confirmed that gap. works on OpenSuSE with beta6. But maxima_lib. as in the ticket description still crashes.

comment:40 Changed 4 years ago by rws

Moreover if I rm -f ~/.sage/giac_commandlist_cache.sobj then giac.<TAB> does not crash. Probably the peculiar properties of ECL make maxima a special case.

comment:41 in reply to: ↑ 39 Changed 4 years ago by dimpase

Replying to rws:

Confirmed that gap. works on OpenSuSE with beta6. But maxima_lib. as in the ticket description still crashes.

Sorry, it appears that in case of gap. I have been barking up the wrong tree.

comment:42 Changed 10 months ago by dimpase

  • Milestone changed from sage-8.1 to sage-duplicate/invalid/wontfix
  • Status changed from new to needs_review

all is good in Sage 9.3.rc1

comment:43 Changed 10 months ago by dimpase

  • Reviewers set to Dima Pasechnik
  • Status changed from needs_review to positive_review

comment:44 Changed 7 months ago by mkoeppe

  • Resolution set to invalid
  • Status changed from positive_review to closed