Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#25092 closed defect (fixed)

sage --gdb does not start due to SIGFPE

Reported by: jdemeyer
Owned by:
Priority: blocker
Milestone: sage-8.2
Component: user interface
Keywords:
Cc: pbruin
Merged in:
Authors: Jeroen Demeyer
Reviewers: Peter Bruin
Report Upstream: N/A
Work issues:
Branch: 8b3a7c5
Commit:
Dependencies:
Stopgaps:

Description (last modified by jdemeyer)

Since the cysignals upgrade, running sage --gdb gives:

Program received signal SIGFPE, Arithmetic exception.

This is bad for several reasons:

  1. Even though that SIGFPE is part of the normal flow of the program, people don't expect it and consider it a crash.
  2. GDB halts at that point, so Sage doesn't actually start.
  3. Because of 2, it causes optional doctest failures when GDB is installed.
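
To illustrate the first two points, here is a minimal Python sketch of a program that deliberately sends itself SIGFPE and handles it. This is not the cysignals startup code; the names CHILD_CODE and sigfpe_handled_in_child are hypothetical, and a POSIX system is assumed. Run normally, the handler absorbs the signal and the program continues. Under GDB, whose default action for SIGFPE is to stop, the same program would halt when the signal arrives, before the handler gets a chance to run.

```python
import subprocess
import sys

# Hypothetical child program: it treats SIGFPE as part of its normal flow.
CHILD_CODE = """
import os, signal, time
seen = []
signal.signal(signal.SIGFPE, lambda signum, frame: seen.append(signum))
os.kill(os.getpid(), signal.SIGFPE)  # deliberate, expected signal
# Python runs signal handlers between bytecodes, so wait briefly if needed.
for _ in range(100):
    if seen:
        break
    time.sleep(0.01)
print("handled" if seen else "missed")
"""


def sigfpe_handled_in_child():
    """Run the child in a fresh interpreter and return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    print(sigfpe_handled_in_child())
```

Running the child in a separate interpreter keeps the signal handler out of the calling process, which is why the sketch uses subprocess rather than installing the handler directly.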

The newest release of cysignals avoids this problem by using a different mechanism to start up:

Tarball: https://pypi.python.org/packages/a5/c0/f07fbf4b4c5e6e77c8f1153a43a032f6cefb5c49d4180228fec051ee480b/cysignals-1.7.0.tar.gz

Change History (20)

comment:1 Changed 2 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Description modified (diff)

comment:2 Changed 2 years ago by jdemeyer

  • Branch set to u/jdemeyer/sage___gdb_does_not_start_due_to_sigfpe

comment:3 Changed 2 years ago by jdemeyer

  • Commit set to 586cc2f4b8903d539de0e95b6a36c74924c29044
  • Status changed from new to needs_review

New commits:

2ef6c3a Use sdh_configure in cysignals
586cc2f Upgrade cysignals to version 1.7.0

comment:4 Changed 2 years ago by pbruin

  • Reviewers set to Peter Bruin
  • Status changed from needs_review to positive_review

Looks good, tests pass and sage --gdb now starts again without problems. The segmentation fault caused by pari('f(x)=f(x)')(0) (see #25028) is caught by GDB and (after the continue command in GDB) handled by cysignals as it should be.

comment:5 Changed 2 years ago by embray

Ok, LGTM.

comment:6 Changed 2 years ago by vbraun

  • Status changed from positive_review to needs_work

Doesn't work on Debian 7 32-bit (with Sage's own gcc):

(sage-sh) buildbot@sagebd07_32s02:build$ gdb python
GNU gdb (GDB) 7.8
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...done.
(gdb) run
Starting program: /home/buildbot/slave/sage_git/build/local/bin/python 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Python 2.7.14 (default, Apr  1 2018, 08:09:42) 
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cysignals
[New Thread 0xf6e95b70 (LWP 6523)]
[Thread 0xf6e95b70 (LWP 6523) exited]

Program received signal SIGSEGV, Segmentation fault.
0xf6e95bfc in ?? ()
(gdb) bt
#0  0xf6e95bfc in ?? ()
#1  0xf7c8caae in clone () from /lib/i386-linux-gnu/libc.so.6

Reinstalling cysignals-1.6.9 fixes it...

comment:7 Changed 2 years ago by vbraun

Better traceback: build with

SAGE_DEBUG=yes CFLAGS="-O0 -g" ./sage -p cysignals

and run it under GDB again:

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0xf788d169 in _sig_on_trampoline (dummy=0xfffff000) at build/src/cysignals/implementation.c:266
#2  0xf7c89260 in msync () from /lib/i386-linux-gnu/libc.so.6
#3  0xfffff000 in ?? ()
#4  0xf7fdbc30 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Last edited 2 years ago by vbraun

comment:8 Changed 2 years ago by git

  • Commit changed from 586cc2f4b8903d539de0e95b6a36c74924c29044 to 8b3a7c56bec2f70f96b9eee5fa8fba029bc53587

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8b3a7c5 Upgrade cysignals to version 1.7.0

comment:9 Changed 2 years ago by jdemeyer

  • Status changed from needs_work to positive_review

I added a patch for cysignals. If this passes testing on the Sage buildbot, I will add it upstream too.

comment:10 Changed 2 years ago by vbraun

  • Branch changed from u/jdemeyer/sage___gdb_does_not_start_due_to_sigfpe to 8b3a7c56bec2f70f96b9eee5fa8fba029bc53587
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:11 follow-up: Changed 2 years ago by jhpalmieri

  • Commit 8b3a7c56bec2f70f96b9eee5fa8fba029bc53587 deleted

This ticket causes doctest failures on the latest OS X (10.13.4) + latest Xcode (9.3). I see the failures with 8.2.rc3 and also with 8.2.rc2 + this ticket, but not with plain 8.2.rc2. The failures:

sage -t --long src/sage/doctest/external.py  # Killed due to abort
sage -t --long src/sage/combinat/designs/ext_rep.py  # Killed due to abort

The first one (the second is similar):

sage -t --long src/sage/doctest/external.py
    Killed due to abort
**********************************************************************
Tests run before process (pid=46054) failed:
sage: from sage.doctest.external import has_internet ## line 38 ##
sage: has_internet() # random ## line 39 ##
objc[46054]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[46054]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
------------------------------------------------------------------------
0   signals.so                          0x00000001019a2548 print_backtrace + 40
1   ???                                 0x6c70704118105f6b 0x0 + 7813868778366721899
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.

comment:12 Changed 2 years ago by jhpalmieri

With 8.2.rc3 + #25118 (so clang is used instead of gcc), I still get failures:

sage -t --long src/sage/doctest/external.py
    Killed due to abort
**********************************************************************
Tests run before process (pid=23617) failed:
sage: from sage.doctest.external import has_internet ## line 38 ##
sage: has_internet() # random ## line 39 ##
objc[23617]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[23617]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
------------------------------------------------------------------------
0   signals.so                          0x000000010f97294a print_backtrace + 58
1   signals.so                          0x000000010f976a13 sigdie + 67
2   signals.so                          0x000000010f976928 cysigs_signal_handler + 312
3   libsystem_platform.dylib            0x00007fff7c75ff5a _sigtramp + 26
4   ???                                 0x0000000000000000 0x0 + 0
5   libsystem_kernel.dylib              0x00007fff7c59b276 abort_with_payload_wrapper_internal + 0
6   libobjc.A.dylib                     0x00007fff7b839962 _ZL12_objc_fatalvyyPKcP13__va_list_tag + 108
7   libobjc.A.dylib                     0x00007fff7b839814 __objc_error + 0
8   libobjc.A.dylib                     0x00007fff7b83a43b _ZL25lockAndFinishInitializingP10objc_classS0_ + 0
9   libobjc.A.dylib                     0x00007fff7b82ae8f lookUpImpOrForward + 228
10  libobjc.A.dylib                     0x00007fff7b82a914 _objc_msgSend_uncached + 68
11  CoreFoundation                      0x00007fff540f1c22 CFDateCreate + 34
12  CoreFoundation                      0x00007fff540ef2f8 __CFBinaryPlistCreateObjectFiltered + 1608
13  CoreFoundation                      0x00007fff540f0ad7 __CFBinaryPlistCreateObjectFiltered + 7719
14  CoreFoundation                      0x00007fff540d567b __CFTryParseBinaryPlist + 187
15  CoreFoundation                      0x00007fff540d4f3e _CFPropertyListCreateWithData + 190
16  CoreFoundation                      0x00007fff540d4dd0 CFPropertyListCreateWithData + 80
17  CoreFoundation                      0x00007fff540ee0fb -[CFPrefsPlistSource handleReply:toRequestNewDataMessage:onConnection:retryCount:error:] + 683
18  CoreFoundation                      0x00007fff540ede23 __93-[CFPrefsSearchListSource handleReply:toRequestNewDataMessage:onConnection:retryCount:error:]_block_invoke + 147
19  libxpc.dylib                        0x00007fff7c7a5b2d xpc_array_apply + 57
20  CoreFoundation                      0x00007fff540edd5c -[CFPrefsSearchListSource handleReply:toRequestNewDataMessage:onConnection:retryCount:error:] + 300
21  CoreFoundation                      0x00007fff5427302c __80-[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:]_block_invoke_3.139 + 76
22  CoreFoundation                      0x00007fff5429c1b4 -[_CFXPreferences withConnectionForRole:performBlock:] + 36
23  CoreFoundation                      0x00007fff54272fd5 __80-[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:]_block_invoke_2.138 + 117
24  libsystem_trace.dylib               0x00007fff7c7829f0 _os_activity_initiate_impl + 53
25  CoreFoundation                      0x00007fff54272f32 __80-[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:]_block_invoke.136 + 114
26  CoreFoundation                      0x00007fff54272877 CFPREFERENCES_IS_WAITING_FOR_USER_CFPREFSD + 39
27  CoreFoundation                      0x00007fff54272acc -[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:] + 348
28  CoreFoundation                      0x00007fff540ec446 -[CFPrefsSearchListSource alreadylocked_copyDictionary] + 326
29  CoreFoundation                      0x00007fff540ebfc3 -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 67
30  CoreFoundation                      0x00007fff542302c5 -[CFPrefsSource copyValueForKey:] + 53
31  CoreFoundation                      0x00007fff5429ac80 __76-[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:]_block_invoke + 32
32  CoreFoundation                      0x00007fff54274279 __108-[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:]_block_invoke + 297
33  CoreFoundation                      0x00007fff542740e5 -[_CFXPreferences(SearchListAdditions) withSearchListForIdentifier:container:cloudConfigurationURL:perform:] + 341
34  CoreFoundation                      0x00007fff5429ac23 -[_CFXPreferences copyAppValueForKey:identifier:container:configurationURL:] + 131
35  CoreFoundation                      0x00007fff540f3713 CFPreferencesCopyAppValue + 99
36  SystemConfiguration                 0x00007fff6066b5e1 SCDynamicStoreCopyProxiesWithOptions + 155
37  _scproxy.so                         0x0000000110ce0ace get_proxies + 14
38  libpython2.7.dylib                  0x000000010f1c95e5 PyEval_EvalFrameEx + 9141
39  libpython2.7.dylib                  0x000000010f1d1b91 fast_function + 337
40  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
41  libpython2.7.dylib                  0x000000010f1d1b91 fast_function + 337
42  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
43  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
44  libpython2.7.dylib                  0x000000010f145684 function_call + 340
45  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
46  libpython2.7.dylib                  0x000000010f12be82 instancemethod_call + 162
47  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
48  libpython2.7.dylib                  0x000000010f1d14c2 PyEval_CallObjectWithKeywords + 162
49  libpython2.7.dylib                  0x000000010f129e33 PyInstance_New + 131
50  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
51  libpython2.7.dylib                  0x000000010f1c95ba PyEval_EvalFrameEx + 9098
52  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
53  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
54  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
55  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
56  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
57  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
58  libpython2.7.dylib                  0x000000010f1d1b91 fast_function + 337
59  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
60  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
61  libpython2.7.dylib                  0x000000010f1c74c4 PyEval_EvalFrameEx + 660
62  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
63  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
64  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
65  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
66  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
67  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
68  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
69  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
70  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
71  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
72  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
73  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
74  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
75  libpython2.7.dylib                  0x000000010f145684 function_call + 340
76  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
77  libpython2.7.dylib                  0x000000010f12be82 instancemethod_call + 162
78  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
79  libpython2.7.dylib                  0x000000010f1806d8 slot_tp_call + 168
80  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
81  libpython2.7.dylib                  0x000000010f1c95ba PyEval_EvalFrameEx + 9098
82  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
83  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
84  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
85  libpython2.7.dylib                  0x000000010f1d1b91 fast_function + 337
86  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
87  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
88  libpython2.7.dylib                  0x000000010f145684 function_call + 340
89  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
90  libpython2.7.dylib                  0x000000010f12be82 instancemethod_call + 162
91  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
92  libpython2.7.dylib                  0x000000010f18170f slot_tp_init + 175
93  libpython2.7.dylib                  0x000000010f17d5f9 type_call + 313
94  libpython2.7.dylib                  0x000000010f11c311 PyObject_Call + 97
95  libpython2.7.dylib                  0x000000010f1c95ba PyEval_EvalFrameEx + 9098
96  libpython2.7.dylib                  0x000000010f1d1b91 fast_function + 337
97  libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
98  libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
99  libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
100 libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
101 libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
102 libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
103 libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
104 libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
105 libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
106 libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
107 libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
108 libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
109 libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
110 libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
111 libpython2.7.dylib                  0x000000010f1d1aad fast_function + 109
112 libpython2.7.dylib                  0x000000010f1c941c PyEval_EvalFrameEx + 8684
113 libpython2.7.dylib                  0x000000010f1c6fd4 PyEval_EvalCodeEx + 2212
114 libpython2.7.dylib                  0x000000010f1c6722 PyEval_EvalCode + 34
115 libpython2.7.dylib                  0x000000010f1f4a5d PyRun_FileExFlags + 157
116 libpython2.7.dylib                  0x000000010f1f4594 PyRun_SimpleFileExFlags + 740
117 libpython2.7.dylib                  0x000000010f20a33f Py_Main + 3279
118 libdyld.dylib                       0x00007fff7c451015 start + 1
------------------------------------------------------------------------
Unhandled SIGABRT: An abort() occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------

**********************************************************************
----------------------------------------------------------------------
sage -t --long src/sage/doctest/external.py  # Killed due to abort
----------------------------------------------------------------------
Total time for all tests: 0.3 seconds
    cpu time: 0.0 seconds
    cumulative wall time: 0.0 seconds
Last edited 2 years ago by jhpalmieri

comment:13 in reply to: ↑ 11 Changed 2 years ago by jdemeyer

I believe that we have an OS X buildbot, so I wonder why it passed there but not on your system.

comment:14 Changed 2 years ago by jdemeyer

Just in case, could you test with cysignals 1.7.1 (#25189)?

comment:15 Changed 2 years ago by jdemeyer

Could you test whether the error occurs only during doctesting or also in a "real" Sage session?

comment:16 Changed 2 years ago by jdemeyer

Indeed, on the buildbot system I cannot reproduce the problem:

osx:sage jdemeyer$ uname -a
Darwin osx 17.5.0 Darwin Kernel Version 17.5.0: Mon Mar  5 22:24:32 PST 2018; root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64
osx:sage jdemeyer$ ./sage -tp --long src/sage/doctest/external.py 
Running doctests with ID 2018-04-20-13-37-41-fc3b58ea.
Git branch: HEAD
Using --optional=mpir,openssl,python2,sage
Doctesting 1 file using 8 threads.
sage -t --long --warn-long 227.1 src/sage/doctest/external.py
    [38 tests, 2.01 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 2.1 seconds
    cpu time: 1.5 seconds
    cumulative wall time: 2.0 seconds

comment:17 Changed 2 years ago by jhpalmieri

I see it on two different systems, so at least it isn't just one flaky machine. I do not see the problem in a real Sage session, or at least

sage: from sage.doctest.external import has_internet
sage: has_internet()

does not kill Sage. It reports False, which is unexpected, but it does not produce an error.

comment:18 Changed 2 years ago by jhpalmieri

#25189 doesn't help.

comment:19 follow-up: Changed 2 years ago by jhpalmieri

On one machine, I also renamed /usr/local/bin, /usr/local/lib, /usr/local/include, in case something there was interfering. No difference. The only thing that helped was to do export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES before building Sage. I tried just putting this in cysignals/spkg-install (before the sdh_configure line) but that didn't help.

comment:20 in reply to: ↑ 19 Changed 2 years ago by jhpalmieri

Replying to jhpalmieri:

In fact it looks like setting export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES before doctesting is good enough (and annoyingly, case is important: "YES" works but "yes" does not).
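
The workaround above can be sketched in Python for anyone who launches the doctests from a script. This is only an illustration under stated assumptions: the helper names fork_safety_env and run_with_fork_safety_disabled are hypothetical, and the probe command merely echoes the variable back instead of invoking sage -t. The one detail taken from this ticket is that the value must be the exact uppercase string "YES".

```python
import os
import subprocess
import sys


def fork_safety_env(base=None):
    """Return a copy of the environment with the Objective-C fork check disabled."""
    env = dict(os.environ if base is None else base)
    # Case matters: "YES" works, "yes" does not (see above).
    env["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"
    return env


def run_with_fork_safety_disabled(cmd):
    """Run cmd (a list of arguments) in the modified environment."""
    return subprocess.run(
        cmd,
        env=fork_safety_env(),
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )


if __name__ == "__main__":
    # Stand-in for the real doctest command: print the variable from a child.
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ['OBJC_DISABLE_INITIALIZE_FORK_SAFETY'])",
    ]
    print(run_with_fork_safety_disabled(probe).stdout.strip())
```

Passing the variable through env= affects only the child process, so the calling shell and any other programs keep the default fork-safety behaviour.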

Last edited 2 years ago by jhpalmieri