#12480 closed defect (fixed)
NTL segfault on OS X 10.7
Reported by: | jdemeyer | Owned by: | was |
---|---|---|---|
Priority: | critical | Milestone: | sage-5.0 |
Component: | interfaces | Keywords: | |
Cc: | Merged in: | sage-5.0.beta8 | |
Authors: | David Roe | Reviewers: | William Stein, Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
On OS X 10.7 (compiled with gcc-4.6.2, #12369):
The following tests failed: sage -t --long -force_lib devel/sage/sage/rings/padics/padic_ZZ_pX_CR_element.pyx # Killed/crashed
This is because executing the following code (from the file devel/sage/sage/rings/padics/padic_ZZ_pX_CR_element.pyx
):
R = Zp(5,5) S.<x> = R[] f = x^5 + 75*x^3 - 15*x^2 +125*x - 5 W.<w> = R.ext(f) pol = ntl.ZZ_pX([5^40,5^42,3*5^41], 5^44) W(pol, relprec = 0)
crashes Sage:
Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: KERN_INVALID_ADDRESS at address: 0x0000000300000002 0x0000000101791ae1 in CopyPointer (dst=@0x10190b700, src=0x300000002) at ZZ_p.c:157 157 if (src->ref_count == NTL_MAX_LONG) Current language: auto; currently c++ (gdb) bt #0 0x0000000101791ae1 in CopyPointer (dst=@0x10190b700, src=0x300000002) at ZZ_p.c:157 #1 0x0000000101791d46 in NTL::ZZ_pContext::restore (this=0x100191148) at ZZ_p.c:205 #2 0x000000010164e1aa in ZZ_pX_conv_modulus () #3 0x00000001063a1ff7 in __pyx_f_4sage_5rings_6padics_16pow_computer_ext_ZZ_pX_eis_shift_p (__pyx_v_self=0x10db203f0, __pyx_v_x=0x10e057f18, __pyx_v_a=0x10e057f18, __pyx_v_n=200, __pyx_v_finalprec=0) at pow_computer_ext.cpp:6173 #4 0x0000000106392871 in __pyx_f_4sage_5rings_6padics_16pow_computer_ext_27PowComputer_ZZ_pX_small_Eis_eis_shift (__pyx_v_self=0x10db203f0, __pyx_v_x=0x10e057f18, __pyx_v_a=0x10e057f18, __pyx_v_n=200, __pyx_v_finalprec=0) at pow_computer_ext.cpp:12917 #5 0x000000010638aced in __pyx_f_4sage_5rings_6padics_16pow_computer_ext_17PowComputer_ZZ_pX_eis_shift_capdiv (__pyx_v_self=0x10db203f0, __pyx_v_x=0x10e057f18, __pyx_v_a=0x10e057f18, __pyx_v_n=200, __pyx_v_finalprec=0) at pow_computer_ext.cpp:9448 #6 0x00000001064e0aba in __pyx_f_4sage_5rings_6padics_22padic_ZZ_pX_CR_element_18pAdicZZpXCRElement__internal_lshift (__pyx_v_self=0x10e057ef0, __pyx_v_shift=-200) at padic_ZZ_pX_CR_element.cpp:9197 #7 0x00000001064dfd8b in __pyx_f_4sage_5rings_6padics_22padic_ZZ_pX_CR_element_18pAdicZZpXCRElement__set_from_ZZ_pX_part2 (__pyx_v_self=0x10e057ef0, __pyx_v_poly=0x10e075f98) at padic_ZZ_pX_CR_element.cpp:8445 #8 0x00000001064de4c7 in __pyx_f_4sage_5rings_6padics_22padic_ZZ_pX_CR_element_18pAdicZZpXCRElement__set_from_ZZ_pX_rel (__pyx_v_self=0x10e057ef0, __pyx_v_poly=0x10e075f98, __pyx_v_ctx=0x10e078730, __pyx_v_relprec=0) at padic_ZZ_pX_CR_element.cpp:8146 #9 0x00000001064fe4ee in __pyx_pf_4sage_5rings_6padics_22padic_ZZ_pX_CR_element_18pAdicZZpXCRElement___init__ (__pyx_v_self=0x10e057ef0, __pyx_args=0x10db23d40, __pyx_kwds=0x10da8f290) at padic_ZZ_pX_CR_element.cpp:5163 #10 0x000000010007a1a7 in type_call () Previous frame inner to this frame (gdb could not unwind past this frame)
This is a huge can of worms. The relevant code is in sage/rings/padics/pow_computer_ext.pyx
:
cdef ntl_ZZ_pContext_class get_context(self, long n): """ Returns the context for p^n. Note that this function will raise an Index error if n > self.cache_limit. Also, it will return None on input 0 [...] """ if n < 0: n = -n try: return self.c[n] except IndexError: return PowComputer_ZZ_pX.get_context(self, n)
On input 0, we have a Py_None
typecast to a ntl_ZZ_pContext_class
and we pretend like that's okay. How does this make sense? The fact that it crashes on OS X 10.7 is expected, the surprising thing is that this doesn't crash on other systems.
The None
in the self.c[]
array comes from line 1671 in pow_computer_ext.pyx
:
self.c.append(None) for i from 1 <= i <= cache_limit: self.c.append(PowComputer_ZZ_pX.get_context(self,i))
Attachments (2)
Change History (21)
comment:1 Changed 10 years ago by
- Description modified (diff)
comment:2 Changed 10 years ago by
- Description modified (diff)
comment:3 Changed 10 years ago by
- Description modified (diff)
comment:4 Changed 10 years ago by
- Description modified (diff)
Changed 10 years ago by
comment:5 Changed 10 years ago by
- Status changed from new to needs_review
comment:6 Changed 10 years ago by
- Status changed from needs_review to positive_review
comment:7 Changed 10 years ago by
- Reviewers set to William Stein, Jeroen Demeyer
- Status changed from positive_review to needs_work
Reviewer patch fixes some documentation strings and also removes exceptions which will be ignored anyway (a cdef function can only raise exceptions if it is declared with an "except" value or if it returns a Python object).
comment:8 Changed 10 years ago by
- Status changed from needs_work to needs_review
comment:9 Changed 10 years ago by
William or David, could any of you please review my reviewer patch?
comment:10 follow-up: ↓ 11 Changed 10 years ago by
- Status changed from needs_review to needs_work
David and I discussed ValueError? versus RuntimeError? when he wrote this. I initially suggested ValueError? (as you changed it to), but he argued that this error should *never* occur, ever -- there should be no way to call this code from Python to produce this error. I'm convinced by David.
comment:11 in reply to: ↑ 10 Changed 10 years ago by
Replying to was:
he argued that this error should *never* occur, ever
I don't see why this is an argument for RuntimeError
instead of ValueError
.
How would you like SystemError
instead? Of all standard Python exceptions, it seems to be the closest equivalent to what you want: http://docs.python.org/library/exceptions.html#exceptions.SystemError
comment:12 Changed 10 years ago by
* ping *
comment:13 Changed 10 years ago by
* ahem *
comment:14 follow-up: ↓ 19 Changed 10 years ago by
Sorry: I wasn't getting trac notifications on this ticket for some reason.
The changes look good, except the removal of the three checks for n < 0 in pow_mpz_t_tmp
and pow_ZZ_tmp
where a ValueError? is raised. While I understand that Cython will ignore those errors, I think the right option is to open another ticket and modify what is done, rather than just removing the checks. With these changes it's easier for these functions to segfault on incorrect input.
comment:15 Changed 10 years ago by
- Status changed from needs_work to needs_review
I kept all raise
statements but annotated them by
# Exception will be ignored by Cython
comment:16 follow-up: ↓ 17 Changed 10 years ago by
- Status changed from needs_review to positive_review
Looks good. Have you created a ticket to fix these ignored exceptions, or shall I?
comment:17 in reply to: ↑ 16 Changed 10 years ago by
Replying to roed:
Looks good. Have you created a ticket to fix these ignored exceptions, or shall I?
I don't plan to deal with this code, so go ahead.
comment:18 Changed 10 years ago by
- Merged in set to sage-5.0.beta8
- Resolution set to fixed
- Status changed from positive_review to closed
comment:19 in reply to: ↑ 14 Changed 8 years ago by
Replying to roed:
The changes look good, except the removal of the three checks for n < 0 in
pow_mpz_t_tmp
andpow_ZZ_tmp
where a ValueError? is raised. While I understand that Cython will ignore those errors, I think the right option is to open another ticket and modify what is done, rather than just removing the checks. With these changes it's easier for these functions to segfault on incorrect input.
Does this refer to how Cython used to work? Does it make sense to add an except NULL
to the function definition now? I stumbled upon this while working on #13591.
Positive review to the patch posted right when I write this.