Opened 8 years ago

Closed 8 years ago

#12157 closed defect (fixed)

Segfault in __Pyx_check_binary_version

Reported by: vbraun
Owned by: jason, was
Priority: blocker
Milestone: sage-4.8
Component: linear algebra
Keywords:
Cc: malb, was, ddrake, fbissey, strogdon
Merged in: sage-4.8.alpha5
Authors: François Bissey
Reviewers: Volker Braun
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description

Sage 4.8.alpha4 doesn't start up on Ubuntu oneiric (11.10) but dies with the segfault

#0  0x0000000000012336 in ?? ()
#1  0x00007fffdf226612 in __Pyx_check_binary_version () at sage/matrix/matrix_modn_dense_float.cpp:17227
#2  initmatrix_modn_dense_float () at sage/matrix/matrix_modn_dense_float.cpp:15808
#3  0x00007ffff7b1c625 in _PyImport_LoadDynamicModule (
    name=0x7ffffffc81c0 "sage.matrix.matrix_modn_dense_float",
    pathname=0x7ffffffc70f0
"/home/wstein/sage/sage-4.8.alpha5/local/lib/python2.6/site-packages/sage/matrix/matrix_modn_dense_float.so",
fp=<optimized out>) at ./Python/importdl.c:53
#4  0x00007ffff7b1a6bc in import_submodule (mod=0x17df750,
    subname=0x7ffffffc81cc "matrix_modn_dense_float",
    fullname=0x7ffffffc81c0 "sage.matrix.matrix_modn_dense_float") at Python/import.c:2589
#5  0x00007ffff7b1a93f in load_next (mod=0x17df750, altmod=0x17df750, p_name=<optimized out>,
    buf=0x7ffffffc81c0 "sage.matrix.matrix_modn_dense_float", p_buflen=0x7ffffffc81b0)
    at Python/import.c:2409

This problem has been identified on sage-on-gentoo (https://github.com/cschwan/sage-on-gentoo/issues/108) as an undefined givaro symbol when dynamically loading sage.matrix.matrix_modn_dense_float. It can be fixed by specifying the library in module_list.py.
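As a rough illustration of what "specifying the library" means, here is a minimal Python sketch. It is not the attached patch: the entry is written as a plain dictionary of the keyword arguments one would pass to an Extension(...) constructor, and the source file name, the language setting, the pre-fix library list, and the helper with_library are all assumptions made for the example. Only the module name and the givaro library come from this ticket.

```python
# Hypothetical stand-in for one entry of module_list.py, written as the
# keyword arguments one would pass to an Extension(...) constructor.
# Everything except the module name and 'givaro' is an assumption.
modn_dense_float = {
    "name": "sage.matrix.matrix_modn_dense_float",
    "sources": ["sage/matrix/matrix_modn_dense_float.pyx"],  # assumed
    "language": "c++",                                       # assumed
    "libraries": ["linbox", "gmp"],                          # assumed pre-fix list
}


def with_library(ext_kwargs, library):
    """Return a copy of ext_kwargs whose 'libraries' list contains library."""
    updated = dict(ext_kwargs)
    libraries = list(ext_kwargs.get("libraries", []))
    if library not in libraries:
        libraries.append(library)
    updated["libraries"] = libraries
    return updated


# Name givaro explicitly so the linker records it as a dependency.
fixed = with_library(modn_dense_float, "givaro")
print(fixed["libraries"])
```

The helper leaves the original entry untouched and is idempotent, so applying it twice does not list the library twice.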

Analysis:

  1. This bug depends on the architecture and on the order in which the various modules are dynamically loaded. This is why it was not observed initially.
  2. In case of a linking error, __Pyx_check_binary_version should give a useful error and not just segfault. I believe this is a bug in Sage's low-level error handling.
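One way to look for a linking problem without crashing the interpreter is to ask the dynamic loader to resolve every symbol up front and report what fails. The sketch below does this with ctypes on a Unix-like system under Python 3. The helper name load_error is made up for this example. It does not explain the segfault reported here; it only shows how an unresolved symbol can be turned into a readable message.

```python
import ctypes
import os


def load_error(path):
    """Try to load the shared object at path with eager symbol binding.

    Returns None if the loader accepted the file, otherwise the loader's
    error message (for example an "undefined symbol" complaint).
    """
    try:
        # RTLD_NOW asks the loader to resolve all symbols immediately
        # instead of deferring resolution to the first call.
        ctypes.CDLL(path, mode=os.RTLD_NOW)
    except OSError as exc:
        return str(exc)
    return None


# A path that does not exist yields an error message instead of a crash.
print(load_error("/nonexistent/libdoes_not_exist.so"))
```

Note that loading an extension module this way only maps it and resolves its symbols. It does not run the module's init function.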

Attachments (1)

trac_12157_modn_givaro.patch (1.1 KB) - added by vbraun 8 years ago.
Initial patch

Change History (11)

Changed 8 years ago by vbraun

Initial patch

comment:1 Changed 8 years ago by malb

  • Cc malb added

comment:2 in reply to: ↑ description Changed 8 years ago by jdemeyer

  • Authors set to Volker Braun
  • Priority changed from major to blocker
  • Reviewers set to Jeroen Demeyer

Replying to vbraun:

In case of a linking error, __Pyx_check_binary_version should give a useful error and not just segfault. I believe this is a bug in Sage's low-level error handling.

Why do you think this is a fault of Sage, and not Cython for example?

The patch looks very reasonable; I would give it a positive review if it needs review.

comment:3 Changed 8 years ago by vbraun

  • Cc was ddrake added
  • Status changed from new to needs_review

When I say "Sage's fault", that may very well be the fault of one of the contained components (e.g. Cython). I didn't mean to be more specific.

I'm just running tests and then will set it to positive review.

comment:4 Changed 8 years ago by jdemeyer

In devel/sage/sage/matrix/matrix_modn_dense_float.cpp (and in any other Cython-generated .c or .cpp file), there is a function:

static int __Pyx_check_binary_version(void) {
    char ctversion[4], rtversion[4];
    PyOS_snprintf(ctversion, 4, "%d.%d", PY_MAJOR_VERSION, PY_MINOR_VERSION);
    PyOS_snprintf(rtversion, 4, "%s", Py_GetVersion());
    if (ctversion[0] != rtversion[0] || ctversion[2] != rtversion[2]) {
        char message[200];
        PyOS_snprintf(message, sizeof(message),
                      "compiletime version %s of module '%.100s' "
                      "does not match runtime version %s",
                      ctversion, __Pyx_MODULE_NAME, rtversion);
        #if PY_VERSION_HEX < 0x02050000
        return PyErr_Warn(NULL, message);
        #else
        return PyErr_WarnEx(NULL, message, 1);
        #endif
    }
    return 0;
}

Maybe you could try adding some print statements to see what is going wrong...
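For readers who want to see what this check does without reading the C, here is a rough Python 3 rendering of the same comparison. It is an illustration only, not code from Cython. The function name and its parameters are invented for the example, and the module name is left out of the message.

```python
import sys


def check_binary_version(ct_major, ct_minor, runtime_version=None):
    """Mimic the comparison made by the C function quoted above.

    Returns None when the versions agree, otherwise the warning text.
    """
    if runtime_version is None:
        runtime_version = sys.version  # the string Py_GetVersion() returns

    # The C code formats into 4-byte buffers, so at most 3 characters
    # (plus the terminating NUL) of each version string survive.
    ctversion = ("%d.%d" % (ct_major, ct_minor))[:3]
    rtversion = runtime_version[:3]

    # Only the major digit (index 0) and the first minor digit (index 2)
    # are compared.
    if ctversion[0] != rtversion[0] or ctversion[2] != rtversion[2]:
        return ("compiletime version %s does not match runtime version %s"
                % (ctversion, rtversion))
    return None


print(check_binary_version(2, 6, "2.6.4"))
print(check_binary_version(2, 7, "2.6.4"))
```

The function itself only formats two short strings and compares them. It does not touch givaro at all, which is why a crash at this point is surprising.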

comment:5 Changed 8 years ago by fbissey

  • Cc fbissey added

comment:6 Changed 8 years ago by strogdon

  • Cc strogdon added

comment:7 Changed 8 years ago by vbraun

  • Authors changed from Volker Braun to François Bissey
  • Reviewers changed from Jeroen Demeyer to Volker Braun

(gdb) bt 5
#0  0x00019df6 in ?? ()
#1  0xb37d6484 in __Pyx_check_binary_version () at sage/matrix/matrix_modn_dense_float.cpp:17227
#2  0xb37d105f in initmatrix_modn_dense_float () at sage/matrix/matrix_modn_dense_float.cpp:15808
#3  0xb7f2e50c in _PyImport_LoadDynamicModule (name=0xbffca9f8 "sage.matrix.matrix_modn_dense_float", 
    pathname=0xbffc996b "/home/vbraun/tesla/sage-4.8.alpha4/local/lib/python2.6/site-packages/sage/matrix/matrix_modn_dense_float.so", fp=0x8cc69d8) at ./Python/importdl.c:53
#4  0xb7f2bf50 in load_module (name=0xbffca9f8 "sage.matrix.matrix_modn_dense_float", 
    fp=<optimized out>, 
    buf=0xbffc996b "/home/vbraun/tesla/sage-4.8.alpha4/local/lib/python2.6/site-packages/sage/matrix/matrix_modn_dense_float.so", type=3, loader=0x0) at Python/import.c:1828
(More stack frames follow...)
(gdb) frame 0
#0  0x00019df6 in ?? ()
(gdb) print *0x00019df6
Cannot access memory at address 0x19df6
(gdb) up
#1  0xb37d6484 in __Pyx_check_binary_version () at sage/matrix/matrix_modn_dense_float.cpp:17227
17227	    PyOS_snprintf(ctversion, 4, "%d.%d", PY_MAJOR_VERSION, PY_MINOR_VERSION);
(gdb) disassemble
Dump of assembler code for function __Pyx_check_binary_version():
   0xb37d6437 <+0>:	push   %ebp
   0xb37d6438 <+1>:	mov    %esp,%ebp
   0xb37d643a <+3>:	push   %ebx
   0xb37d643b <+4>:	sub    $0x104,%esp
   0xb37d6441 <+10>:	call   0xb37b4087 <__i686.get_pc_thunk.bx>
   0xb37d6446 <+15>:	add    $0x32bae,%ebx
   0xb37d644c <+21>:	mov    %gs:0x14,%eax
   0xb37d6452 <+27>:	mov    %eax,-0xc(%ebp)
   0xb37d6455 <+30>:	xor    %eax,%eax
   0xb37d6457 <+32>:	movl   $0x6,0x10(%esp)
   0xb37d645f <+40>:	movl   $0x2,0xc(%esp)
   0xb37d6467 <+48>:	lea    -0x97a9(%ebx),%eax
   0xb37d646d <+54>:	mov    %eax,0x8(%esp)
   0xb37d6471 <+58>:	movl   $0x4,0x4(%esp)
   0xb37d6479 <+66>:	lea    -0x14(%ebp),%eax
   0xb37d647c <+69>:	mov    %eax,(%esp)
   0xb37d647f <+72>:	call   0xb37b2df0 <PyOS_snprintf@plt>
=> 0xb37d6484 <+77>:	call   0xb37b2850 <Py_GetVersion@plt>
   0xb37d6489 <+82>:	mov    %eax,0xc(%esp)
   0xb37d648d <+86>:	lea    -0x97a3(%ebx),%eax

It dies when calling PyOS_snprintf. It seems that something went very wrong when trying to resolve the givaro symbols. I don't understand why that should make Python and/or glibc symbols undefined, but apparently it does. Of course, it works fine in the debugger:

(gdb) print PyOS_snprintf(ctversion, 4, "%d.%d", 2, 6)
$15 = 3
(gdb) print ctversion
$16 = "2.6"
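The same kind of sanity check can be made from Python itself rather than from gdb. The sketch below, written for Python 3 and not part of the original report, uses ctypes.pythonapi to call Py_GetVersion directly. In a healthy process this shows that the interpreter's own C symbols resolve as expected. The variable names are chosen for the example.

```python
import ctypes
import sys

# Look up Py_GetVersion in the running interpreter and declare its
# signature: no arguments, returns a C string.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []
get_version.restype = ctypes.c_char_p

runtime_version = get_version().decode("utf-8")

# The first three characters are what the generated check compares.
print(runtime_version[:3])
```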

In any case, I'm giving positive review to François' patch: we clearly should link with all necessary libraries. I still don't understand why it segfaults the way it does, though.

comment:8 Changed 8 years ago by fbissey

I am not sure why it dies that way either. On Gentoo we link everything with --as-needed, and if we don't add givaro to the list of libraries it is not linked at all with these two libraries. But I cannot add anything beyond the report on the sage-on-gentoo tracker.

I'll leave it to you to push the "positive review" button.

comment:9 Changed 8 years ago by vbraun

  • Status changed from needs_review to positive_review

comment:10 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.8.alpha5
  • Resolution set to fixed
  • Status changed from positive_review to closed