Opened 12 years ago

Closed 10 years ago

#10113 closed defect (fixed)

Calling seed() incorrectly causes Sage to crash with SIGSEGV (Segmentation Fault)

Reported by: drkirkby Owned by: jason
Priority: major Milestone: sage-5.1
Component: misc Keywords:
Cc: Merged in: sage-5.1.beta0
Authors: Simon King Reviewers: Volker Braun
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description

Using several versions of Sage on several operating systems:

  • 4.5.rc0 on OS X (bsd.math)
  • 4.5.2 on Linux (sage.math)
  • sage-4.6.alpha3 on OpenSolaris (my machine hawk)

it is possible to crash Sage quite easily with:

sage: seed(1,2)


------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component
of Sage has a bug in it (typically accessing invalid memory)
or is not properly wrapped with _sig_on, _sig_off.
You might want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate (sorry).
------------------------------------------------------------

Running under OpenSolaris with gdb, I see:

drkirkby@hawk:~/sage-4.6.alpha3$ ./sage -gdb
----------------------------------------------------------------------
| Sage Version 4.6.alpha3, Release Date: 2010-10-08                  |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
**********************************************************************
*                                                                    *
* Warning: this is a prerelease version, and it may be unstable.     *
*                                                                    *
**********************************************************************
/export/home/drkirkby/sage-4.6.alpha3/local/bin/sage-ipython
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-pc-solaris2.11"...
warning: Lowest section in /lib/libdl.so.1 is .dynamic at 00000074
Python 2.6.4 (r264:75706, Oct  9 2010, 22:06:12) 
[GCC 4.5.0] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
warning: Lowest section in /lib/libintl.so.1 is .dynamic at 00000074
warning: Lowest section in /lib/libpthread.so.1 is .dynamic at 00000074
sage: seed(1,2)

Program received signal SIGSEGV, Segmentation fault.
0xfdbdaf1d in __gmp_randclear ()
   from /export/home/drkirkby/sage-4.6.alpha3/local/lib//libgmp.so.3
(gdb) bt
#0  0xfdbdaf1d in __gmp_randclear ()
   from /export/home/drkirkby/sage-4.6.alpha3/local/lib//libgmp.so.3
#1  0x081889b8 in ?? ()
#2  0x081c1ffc in ?? ()
#3  0x00000002 in ?? ()
#4  0x00000034 in ?? ()
#5  0x00000000 in ?? ()

so it looks like the random number generator thats being seeded (or not as the case may be) is part of gmp, and not gsl or anywhere else in Sage that has random number generators.

Plenty of other arguments to seed() can cause Sage to crash too.

Dave

Attachments (1)

trac_10113_randstate_cinit.patch (1.5 KB) - added by SimonKing 10 years ago.

Download all attachments as: .zip

Change History (16)

comment:1 Changed 10 years ago by SimonKing

The backtrace from the ticket description is not very clear (all those question marks). Here is more information:

sage: seed(1,2)

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff218ae64 in __gmp_randclear ()
   from /mnt/local/king/SAGE/experiment/notebook/sage-5.1.notebook/local/lib/libgmp.so.8
(gdb) bt
#0  0x00007ffff218ae64 in __gmp_randclear ()
   from /mnt/local/king/SAGE/experiment/notebook/sage-5.1.notebook/local/lib/libgmp.so.8
#1  0x00007ffff260fa79 in __pyx_pf_4sage_4misc_9randstate_9randstate_12__dealloc__ (o=0x4891050)
    at sage/misc/randstate.c:3177
#2  __pyx_tp_dealloc_4sage_4misc_9randstate_randstate (o=0x4891050) at sage/misc/randstate.c:3814
#3  0x00007ffff7ab240c in type_call (type=<value optimized out>, args=0x4880878, kwds=0x0)
    at Objects/typeobject.c:738
#4  0x00007ffff7a4f303 in PyObject_Call (func=0x7ffff28224c0, arg=0x7fffffffc370, kw=0x7fffffffc368)
    at Objects/abstract.c:2529
#5  0x00007ffff7aff2af in do_call (f=0x4978ac0, throwflag=<value optimized out>) at Python/ceval.c:4231
#6  call_function (f=0x4978ac0, throwflag=<value optimized out>) at Python/ceval.c:4036
#7  PyEval_EvalFrameEx (f=0x4978ac0, throwflag=<value optimized out>) at Python/ceval.c:2666
#8  0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0x4820d30, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x0, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#9  0x00007ffff7b01a42 in PyEval_EvalCode (co=0x4891068, globals=0x7fffffffc370, locals=0x7fffffffc368)
    at Python/ceval.c:667
#10 0x00007ffff7b0065a in exec_statement (f=0x4367090, throwflag=<value optimized out>) at Python/ceval.c:4710
#11 PyEval_EvalFrameEx (f=0x4367090, throwflag=<value optimized out>) at Python/ceval.c:1880
#12 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0xabaeb0, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x43669e0, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#13 0x00007ffff7affdf8 in fast_function (f=0x4366840, throwflag=<value optimized out>) at Python/ceval.c:4109
#14 call_function (f=0x4366840, throwflag=<value optimized out>) at Python/ceval.c:4034
#15 PyEval_EvalFrameEx (f=0x4366840, throwflag=<value optimized out>) at Python/ceval.c:2666
#16 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0xabae30, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x4, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0xb1d260, defcount=2, closure=0x0) at Python/ceval.c:3253
#17 0x00007ffff7affdf8 in fast_function (f=0x48d1ba0, throwflag=<value optimized out>) at Python/ceval.c:4109
#18 call_function (f=0x48d1ba0, throwflag=<value optimized out>) at Python/ceval.c:4034
#19 PyEval_EvalFrameEx (f=0x48d1ba0, throwflag=<value optimized out>) at Python/ceval.c:2666
#20 0x00007ffff7b0103a in fast_function (f=0xb8e2d0, throwflag=<value optimized out>) at Python/ceval.c:4099
#21 call_function (f=0xb8e2d0, throwflag=<value optimized out>) at Python/ceval.c:4034
#22 PyEval_EvalFrameEx (f=0xb8e2d0, throwflag=<value optimized out>) at Python/ceval.c:2666
#23 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0xaba9b0, globals=<value optimized out>, 
    locals=<value optimized out>, args=0xb85508, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0xb1e3e8, defcount=1, closure=0x0) at Python/ceval.c:3253
#24 0x00007ffff7affdf8 in fast_function (f=0xb85380, throwflag=<value optimized out>) at Python/ceval.c:4109
#25 call_function (f=0xb85380, throwflag=<value optimized out>) at Python/ceval.c:4034
#26 PyEval_EvalFrameEx (f=0xb85380, throwflag=<value optimized out>) at Python/ceval.c:2666
#27 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0xaba530, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x680a60, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0xb1e3a8, defcount=1, closure=0x0) at Python/ceval.c:3253
#28 0x00007ffff7affdf8 in fast_function (f=0x6808d0, throwflag=<value optimized out>) at Python/ceval.c:4109
#29 call_function (f=0x6808d0, throwflag=<value optimized out>) at Python/ceval.c:4034
#30 PyEval_EvalFrameEx (f=0x6808d0, throwflag=<value optimized out>) at Python/ceval.c:2666
#31 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0x7ffff7e5b0b0, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x3, argcount=77043152, kws=<value optimized out>, kwcount=2, 
    defs=0xa0a800, defcount=2, closure=0x0) at Python/ceval.c:3253
#32 0x00007ffff7affdf8 in fast_function (f=0x6ec2f0, throwflag=<value optimized out>) at Python/ceval.c:4109
#33 call_function (f=0x6ec2f0, throwflag=<value optimized out>) at Python/ceval.c:4034
#34 PyEval_EvalFrameEx (f=0x6ec2f0, throwflag=<value optimized out>) at Python/ceval.c:2666
#35 0x00007ffff7b01928 in PyEval_EvalCodeEx (co=0x7ffff7f11ab0, globals=<value optimized out>, 
    locals=<value optimized out>, args=0x0, argcount=77043152, kws=<value optimized out>, kwcount=0, 
    defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3253
#36 0x00007ffff7b01a42 in PyEval_EvalCode (co=0x4891068, globals=0x7fffffffc370, locals=0x7fffffffc368)
    at Python/ceval.c:667
#37 0x00007ffff7b21950 in run_mod (fp=0x6c7e60, filename=<value optimized out>, start=<value optimized out>, 
    globals=<value optimized out>, locals=0x640260, closeit=0, flags=0x7fffffffd560)
    at Python/pythonrun.c:1346
#38 PyRun_FileExFlags (fp=0x6c7e60, filename=<value optimized out>, start=<value optimized out>, 
    globals=<value optimized out>, locals=0x640260, closeit=0, flags=0x7fffffffd560)
    at Python/pythonrun.c:1332
#39 0x00007ffff7b21b1c in PyRun_SimpleFileExFlags (fp=<value optimized out>, 
    filename=0x7fffffffe987 "/mnt/local/king/SAGE/experiment/notebook/sage-5.1.notebook/local/bin/sage-ipython", closeit=0, flags=0x7fffffffd560) at Python/pythonrun.c:936
#40 0x00007ffff7b3711d in RunStartupFile (argc=<value optimized out>, argv=<value optimized out>)
    at Modules/main.c:143
#41 Py_Main (argc=<value optimized out>, argv=<value optimized out>) at Modules/main.c:553
#42 0x00007ffff6e1dc8d in __libc_start_main (main=<value optimized out>, argc=<value optimized out>, 
    ubp_av=<value optimized out>, init=<value optimized out>, fini=<value optimized out>, 
    rtld_fini=<value optimized out>, stack_end=0x7fffffffd678) at libc-start.c:228
#43 0x0000000000400699 in _start ()

So, the latest sage-specific thing is sage.misc.randstate.randstate.__dealloc__, which does nothing but

         gmp_randclear(self.gmp_state)

self.gmp_state is initialised in __init__. I am surprised: Aren't those things supposed to be done in __cinit__, so that initialisation of the c-data is granted even if Python initialisation fails?

Anyway, renaming __init__ into __cinit__ does not solve the problem.

comment:2 Changed 10 years ago by SimonKing

By the way, I tried with other input values: seed(a,b) seems to result in a segfault, independent of the values a and b.

comment:3 Changed 10 years ago by SimonKing

Now that's strange. I inserted print statements into the new __cinit__ and the existing __init__ and __dealloc__ methods of sage.misc.randstate.randstate, namely printing id(self). It turns out that, when doing seed(a,b), __dealloc__ is called on an object without __cinit__ called previously.

How can that happen?

comment:4 Changed 10 years ago by SimonKing

More precisely: When I have

     def __cinit__(self, seed=None):
         print "id for alloc",id(self)
         gmp_randinit_default(self.gmp_state)
     def __init__(self, seed=None):
         print "seed for init",seed
         print "id for init",id(self)
         cdef mpz_t mpz_seed

         if seed is None:
             if use_urandom:
                 seed = long(binascii.hexlify(os.urandom(16)), 16)
             else:
                 seed = long(time.time() * 256)
         else:
             seed = long(seed)

         # If seed==0, leave it at the default seed used by
         # gmp_randinit_default()
         if seed:
             mpz_init(mpz_seed)
             mpz_set_pylong(mpz_seed, seed)
             gmp_randseed(self.gmp_state, mpz_seed)
             mpz_clear(mpz_seed)

         self._seed = seed
     def __dealloc__(self):
         print "id for dealloc",id(self)
         gmp_randclear(self.gmp_state)

I get:

----------------------------------------------------------------------
| Sage Version 5.1.notebook, Release Date: 2012-04-15                |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
**********************************************************************
*                                                                    *
* Warning: this is a prerelease version, and it may be unstable.     *
*                                                                    *
**********************************************************************
seed in cinit None
id for cinit 12399568
seed for init None
id for init 12399568
seed in cinit None
id for cinit 75637328
seed for init None
id for init 75637328
id for dealloc 12399568
sage: seed(1)
seed in cinit 1
id for cinit 75640528
seed for init 1
id for init 75640528
<sage.misc.randstate.randstate object at 0x4822ed0>
sage: seed(2)  
seed in cinit 2
id for cinit 76091984
seed for init 2
id for init 76091984
<sage.misc.randstate.randstate object at 0x4891250>
sage: seed(1,2)
id for dealloc 76092624
/mnt/local/king/SAGE/experiment/notebook/sage-5.1.notebook/local/lib/libcsage.so(print_backtrace+0x31)[0x7fd1916192d0]

You see: The last line has a dealloc, without a previous alloc.

comment:5 Changed 10 years ago by SimonKing

Yes, I got it!

The problem is as follows:

  • randstate.__init__ takes exactly one argument seed (beside self).
  • You give it two arguments. Thus, __init__ fails before being executed.
  • After that, __dealloc__ is called, but the c-data are not initialised, because __init__ has not beeen executed.

I think the following is a clean solution:

  • Add a __cinit__ method that accepts any arguments (so that no error is raised before it is executed).
  • Initialise c-data in __cinit__
  • If calling __init__ fails, then __cinit__ has already been executed. Hence, __dealloc__ works, and thus one just sees the due error, but no segfault.

Preparing a patch now!

Changed 10 years ago by SimonKing

comment:6 Changed 10 years ago by SimonKing

  • Authors set to Simon King
  • Status changed from new to needs_review

comment:7 follow-up: Changed 10 years ago by cremona

The patch applies to 5.0.beta14 and does the trick for me (I tried a variety of bogus inputs).

I'm not sufficiently qualified to verify that Simon's dignosis is correct, though it looks pretty convincing. I would be slightly happier if the error message did not refer to "init" since the user will not have typed that (probably), but that's a minor point.

So, +1 towards a positive review.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 10 years ago by SimonKing

Replying to cremona:

I would be slightly happier if the error message did not refer to "__init__" since the user will not have typed that (probably), but that's a minor point.

I don't think that the error message can be changed without patching Python. Theoretically, we could test in __cinit__ that the number of arguments for __init__ is correct - but I think that would be a misuse of __cinit__.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 10 years ago by cremona

Replying to SimonKing:

Replying to cremona:

I would be slightly happier if the error message did not refer to "__init__" since the user will not have typed that (probably), but that's a minor point.

I don't think that the error message can be changed without patching Python. Theoretically, we could test in __cinit__ that the number of arguments for __init__ is correct - but I think that would be a misuse of __cinit__.

I didn't mean anything so complicated. But I compared for example

sage: EllipticCurve([1])
...
ValueError: sequence of coefficients must have length 2 or 5

or

sage: ZZ(x)
...
TypeError: unable to convert x (=x) to an integer

which are similar in so far as they are reporting invalid input to a constructor. It's not serious (and I certainly do not want to detract from the impressive work you did in tracking this down!).

comment:10 in reply to: ↑ 9 Changed 10 years ago by SimonKing

Replying to cremona:

which are similar in so far as they are reporting invalid input to a constructor.

These examples are totally different, actually. In your example, the functions or methods are called, and then inside the function/method the arguments are checked and an error is raised. In my example, Python tests whether the number of arguments is correct, which is not the case, and thus the error is raised before the method is called.

More differences:

When you do EllipticCurve([1]) you are calling a function

sage: type(EllipticCurve)
<type 'function'>

and the function accepts anything between 1 and 3 arguments. Compare:

sage: EllipticCurve(1,2,3,4)
Traceback (most recent call last)
...
TypeError: EllipticCurve() takes at most 3 arguments (4 given)

When you call ZZ(x), then you are calling the __call__ method of ZZ, which accepts any number of arguments:

sage: from sage.misc.sageinspect import sage_getargspec
sage: sage_getargspec(ZZ)
ArgSpec(args=['self', 'x', '*args', '**kwds'], varargs=None, keywords=None, defaults=(0,))

However, in my example, a class is called. So, that would be something like

sage: ZZ.__class__(x)
Traceback (most recent call last)
...
TypeError: __init__() takes exactly 0 positional arguments (1 given)

comment:11 follow-up: Changed 10 years ago by SimonKing

PS: Since ZZ(...) forwards its arguments to the init method of Integer, you obtain:

sage: ZZ(1,2,3,4,5,6,7,8,9)
Traceback (most recent call last):
...
TypeError: __init__() takes at most 2 positional arguments (9 given)
Last edited 10 years ago by SimonKing (previous) (diff)

comment:12 in reply to: ↑ 11 Changed 10 years ago by cremona

Replying to SimonKing:

PS: Since ZZ(...) forwards its arguments to the init method of Integer, you obtain:

sage: ZZ(1,2,3,4,5,6,7,8,9)
Traceback (most recent call last):
...
TypeError: __init__() takes at most 2 positional arguments (9 given)

I submit!

comment:13 Changed 10 years ago by vbraun

  • Reviewers set to Volker Braun
  • Status changed from needs_review to positive_review

Agree. In Cython, you must initialize C variables in __cinit__ to a sane state, and destroy them in __dealloc__. There is no guarantee that __init__ runs all the way or even runs at all.

comment:14 Changed 10 years ago by jdemeyer

  • Milestone changed from sage-5.0 to sage-5.1

comment:15 Changed 10 years ago by jdemeyer

  • Merged in set to sage-5.1.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.