Opened 10 years ago
Closed 9 years ago
#11342 closed enhancement (fixed)
Make getattr faster on parents and elements
Reported by: | SimonKing | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-4.7.2 |
Component: | performance | Keywords: | getattr parent element |
Cc: | jdemeyer | Merged in: | sage-4.7.2.alpha3 |
Authors: | Simon King, Volker Braun | Reviewers: | Jeroen Demeyer, Volker Braun, Simon King |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #9944 | Stopgaps: |
Description (last modified by )
If an attribute of a parent or element can not be found by inspecting the method resolution order, __getattr__
is invoked. It tries to obtain the attribute from the parent class or the element class of the category of the parent. If it can not be found, the attribute error is raised using a Cython function sage.structure.parent.raise_attribute_error
.
I see several ways to make both __getattr__
and sage.structure.parent.raise_attribute_error
a lot faster. Details will be given in the comments.
Because of one doctest in sage.categories.primer
, it is needed that the category of polynomial rings is properly initialised. Therefore:
Depends on #9944
Apply
Attachments (8)
Change History (69)
Changed 10 years ago by
comment:1 Changed 10 years ago by
- Dependencies set to #9944
- Status changed from new to needs_review
Here are my ideas to make attribute access faster.
1. Make raising an attribute error faster
It is very common that one tries to get an attribute, but it does not exist - attribute error. Since it occurs so frequently, it is essential that the error be raised as quickly as possible. Raising the error is therefore the job of a Cython function raise_attribute_error
in sage.structure.parent
. However, that function can be slightly improved.
1.a) The second argument to that function always is a string: The attribute name. Thus, why not explicitly state it in the Cython code?
1.b) It internally operates with the type of the first argument. So, why not explicitly state it in the Cython code?
1.c) The error message is formed from the name of the class and the name of the attribute. There is one complication: If one has an extension type then its name shall be prepended by its module name. There is a Cython function is_extension_type
for telling whether it is an extension type or not. But the function is small, and a tiny bit of time can be saved by doing the test inside raise_attribute_error
.
1.d) How shall one create the error message? In unpatched Sage, it is done by a formatted string that is formed by the class name (perhaps together with the name of the module of the class) and the attribute name. However, it turned out in some timings that I did that composing the error message by addition of strings is faster than the formatted string.
1.e) If one has an extension type then it turns out to be faster to get module name plus class name from the string representation of that type, rather than from addition of strings.
Hence, the new raise_attribute_error
looks like this:
cdef inline raise_attribute_error(self, str name): cdef type cls = type(self) cdef int dictoff try: dictoff = cls.__dictoffset__ except AttributeError: raise AttributeError, "'"+cls.__name__+"' object has no attribute '"+name+"'" if dictoff: raise AttributeError, "'"+cls.__name__+"' object has no attribute '"+name+"'" raise AttributeError, repr(cls)[6:-1] + " object has no attribute '"+name+"'"
Timings for raise_attribute_error
I compare the timings of unpatched and patched sage-4.7.alpha5, stating the patched version first. We have to distinguish extension types and others:
sage: cython(""" ....: from sage.structure.parent cimport raise_attribute_error ....: def test(self, name, long m): ....: cdef long i ....: for i from 0<=i<m: ....: try: ....: raise_attribute_error(self,name) ....: except AttributeError: ....: pass ....: """) sage: P.<a,b,c> = PolynomialRing(QQ) sage: P.__class__.__dictoffset__ 0 sage: QQ.__class__.__dictoffset__ 288 sage: %time test(P,'bla',10^7) CPU times: user 19.99 s, sys: 0.00 s, total: 19.99 s Wall time: 19.99 s # unpatched: # CPU times: user 21.28 s, sys: 0.00 s, total: 21.28 s # Wall time: 21.28 s sage: %time test(QQ,'bla',10^7) CPU times: user 17.66 s, sys: 0.01 s, total: 17.67 s Wall time: 17.67 s # unpatched: # CPU times: user 19.92 s, sys: 0.01 s, total: 19.93 s # Wall time: 19.93 s
2. Raise attribute errors earlier
An attribute of a parent can be obtained from the category only if its category is properly initialised. Otherwise an attribute error is raised. In order to test whether it is properly initialised, the method _is_category_initialized
is called. However, that method merely tests whether the cdef attribute _category
of the parent is None.
Hence, first suggestion: Directly test whether self._category
is None, if self
is a parent.
Up to now, an attribute of an element can be obtained from the category of its parent, even if the category of the parent is not properly initialised. I doubt that this is a good idea. In fact, there is only a single doctest error if one requests the parent's category to be initialised, and this error vanishes if #9944 is applied.
Hence, second suggestion: If self is an element, then try to get an attribute from the category only if self._parent._category is not None. These changes have a nice effect on the performance. Again, I state the timings for the patched version first.
Recall that by #10467, if the __getattr__
method of a parent (or element) is asked for a private attribute (one that starts with double underscore but does not end with an underscore), it immediately raises an error, since by convention private attributes can not be inherited from the category (or they wouldn't be private).
Hence, we have to consider the following cases:
- non-existing private attributes
sage: a = 1 sage: timeit('try: QQ.__bla\nexcept: pass') 625 loops, best of 3: 13.8 µs per loop # unpatched: 625 loops, best of 3: 15.3 µs per loop sage: timeit('try: a.__bla\nexcept: pass') 625 loops, best of 3: 13.3 µs per loop # unpatched: 625 loops, best of 3: 14 µs per loop
- non-existing non-private attributes
sage: timeit('try: QQ.__bla_\nexcept: pass') 625 loops, best of 3: 17.2 µs per loop # unpatched: 625 loops, best of 3: 23.4 µs per loop sage: timeit('try: a.__bla_\nexcept: pass') 625 loops, best of 3: 21.6 µs per loop # unpatched: 625 loops, best of 3: 28.2 µs per loop sage: timeit('try: QQ.bla\nexcept: pass') 625 loops, best of 3: 16.7 µs per loop # unpatched: 625 loops, best of 3: 22.9 µs per loop sage: timeit('try: a.bla\nexcept: pass') 625 loops, best of 3: 20.8 µs per loop # unpatched: 625 loops, best of 3: 27.2 µs per loop
- existing attributes inherited from the category
sage: timeit('try: QQ.sum\nexcept: pass',number=10^5) 100000 loops, best of 3: 740 ns per loop # unpatched: 625 loops, best of 3: 744 ns per loop sage: timeit('try: a.cartesian_product\nexcept: pass',number=10^5) 100000 loops, best of 3: 1.67 µs per loop # unpatched: 100000 loops, best of 3: 3.89 µs per loop
Long doctests pass for me - #9944 is required for one test in sage.category.primer
. So: Needs review!
Depends on #9944
Changed 10 years ago by
Make attribute access faster on elements and parents: Create the error message only when needed
comment:2 Changed 10 years ago by
- Description modified (diff)
Using an additional idea, I was able to improve the timings even further.
I have already indicated that the process of raising the attribute error actually is a bottle neck. To be precise: Most time is spent for creating the nicely formatted error message.
The point is: The typical fate of an AttributeError
is being caught. Hence, nobody will ever see the error message, except under very special circumstances.
Therefore I suggest that the error message will only be created if someone wants to see it. This is possible by introducing a new class sage.structure.parent.AttributeErrorMessage
. Its instances simply store a class cls
and an attribute name, and its string representation is the well-known error message stating that cls
has no attribute of the given name. This string representation is not created during initialisation but only when needed.
Hence, rather than calling raise_attribute_error(self,name)
, one would do raise AttributeError, AttributeErrorMessage(self,name)
. That is a lot faster:
sage: cython(""" ....: from sage.structure.parent import AttributeErrorMessage ....: def test(self,name,long m): ....: cdef long i ....: for i from 0<=i<m: ....: try: ....: raise AttributeError, AttributeErrorMessage(self,name) ....: except AttributeError: ....: pass ....: """) sage: P.<a,b,c> = PolynomialRing(QQ) sage: %time test(P,'bla',10^7) CPU times: user 3.10 s, sys: 0.00 s, total: 3.10 s Wall time: 3.10 s sage: %time test(QQ,'bla',10^7) CPU times: user 3.06 s, sys: 0.00 s, total: 3.06 s Wall time: 3.07 s
Compare with the timings for raise_attribute_error
in my previous comments!
Other than that, I added the specification that the attribute name is supposed to be a string:
def __getattr__(self, str name):
instead of
def __getattr__(self, name):
Updated timings
I am repeating the tests from my previous comments.
- non-existing private attributes
There is a clear improvement, even compared with the previous patch version.
sage: a = 1 sage: timeit('try: QQ.__bla\nexcept: pass') 625 loops, best of 3: 10.5 µs per loop sage: timeit('try: a.__bla\nexcept: pass') 625 loops, best of 3: 8.78 µs per loop sage: timeit('try: QQ.__bla_\nexcept: pass')
- non-existing non-private attributes
There is a clear improvement, even compared with the previous patch version.
sage: timeit('try: QQ.__bla_\nexcept: pass') 625 loops, best of 3: 14.1 µs per loop sage: timeit('try: a.__bla_\nexcept: pass') 625 loops, best of 3: 16.1 µs per loop sage: timeit('try: QQ.bla\nexcept: pass') 625 loops, best of 3: 13.8 µs per loop sage: timeit('try: a.bla\nexcept: pass') 625 loops, best of 3: 16.3 µs per loop
- existing attributes inherited from the category
sage: timeit('try: QQ.sum\nexcept: pass',number=10^5) 100000 loops, best of 3: 741 ns per loop sage: timeit('try: a.cartesian_product\nexcept: pass',number=10^5) 100000 loops, best of 3: 1.67 µs per loop
Note
Usually, errors are raised with an error message that is given by a string. It seems that raising it with an instance of AttributeErrorMessage
is fine as well - at least it does not seem to break existing tests. However, perhaps other people have objections.
Therefore I'm informing sage-devel, and I did not override the old patch but created a new one.
For the patchbot:
Depends on #9944
Apply [attachent:trac11342-AttributeErrorMessage.patch]
comment:3 follow-up: ↓ 4 Changed 10 years ago by
Hi Simon,
I didn't realize that it was part of your problem, that's why I only react now. It seems that your class AttributeErrorMessage
somehow reproduce the feature of LazyFormat? (see #11342). We probably should merge the two features. Note that mine is pure Python so that probably a little Cythonizing is needed if it fits you needs.
Florent
comment:4 in reply to: ↑ 3 Changed 10 years ago by
Replying to hivert:
I didn't realize that it was part of your problem, that's why I only react now. It seems that your class
AttributeErrorMessage
somehow reproduce the feature of LazyFormat? (see #11342).
You mean #8742.
Thank you for the pointer to lazy format strings!
comment:5 follow-up: ↓ 6 Changed 10 years ago by
I did the following test with lazy format strings:
sage: cython(""" ....: from sage.misc.lazy_format import LazyFormat ....: def test(self,name,long m): ....: s = LazyFormat("%s has no attribute %s") ....: cdef long i ....: for i from 0<=i<m: ....: try: ....: raise AttributeError, s%(self.__class__,name) ....: except AttributeError: ....: pass ....: """)
That string is a good approximation to what we really want, but the differences to the original error messages could only be resolved by a case distinction:
sage: s = LazyFormat("%s has no attribute %s") sage: raise AttributeError, s%(QQ.__class__,'bla') --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /home/king/<ipython console> in <module>() AttributeError: <class 'sage.rings.rational_field.RationalField_with_category'> has no attribute bla sage: QQ.bla ... AttributeError: 'RationalField_with_category' object has no attribute 'bla'
However, even if we do not insist on reproducing the exact same error message, lazy format strings are simply too slow.
sage: %time test(QQ,'bla',10^6) CPU times: user 16.12 s, sys: 0.01 s, total: 16.13 s Wall time: 16.12 s
So, it is nearly eight times slower than using raise_attribute_error
in unpatched sage, and about fifty times slower than using AttributeErrorMessage
, which was only 3 seconds for 10^7
(not 10^6
) iterations, and which does preserve the old error messages.
comment:6 in reply to: ↑ 5 ; follow-up: ↓ 8 Changed 10 years ago by
So, it is nearly eight times slower than using
raise_attribute_error
in unpatched sage, and about fifty times slower than usingAttributeErrorMessage
, which was only 3 seconds for10^7
(not10^6
) iterations, and which does preserve the old error messages.
As I already said: [It] is pure Python so that probably a little Cythonizing is needed if it fits you needs.
Plus compared to your code I have have to pay the extra cost of calling the operator __mod__
for %
when binding the string to its extra args. If you want speed from cython, it would be better to have this extra binding in a non standard def method so that you can bypass the call to python interpreter. Do you accept a little slow down (calling %) as a trade-off to avoid code and feature duplication ?
comment:7 follow-up: ↓ 9 Changed 10 years ago by
- Owner changed from tbd to (none)
Hi Simon
Some more info: According to prun most of the time is spend during the copy in
def __mod__(self, args): """ """ if hasattr(self, "_args"): # self is already bound... self = copy(self) self._args = args return self
For example
1 2.828 2.828 19.882 19.882 {_home_florent__sage_temp_popcorn_rouba_net_19739_tmp_0_spyx_0.test} 999999 2.429 0.000 15.759 0.000 copy.py:65(copy) 1000000 1.088 0.000 17.054 0.000 lazy_format.py:82(__mod__) 1999999 1.067 0.000 1.067 0.000 {hasattr}
I'm pretty sure that cythonizing properly this copy should give a large
speedup. So the question is: should we try to optimize LazyFormat
or do
you rather have a hand tuned err-message to ensure proper backward
compatibility ?
comment:8 in reply to: ↑ 6 Changed 10 years ago by
Replying to hivert:
As I already said: [It] is pure Python so that probably a little Cythonizing is needed if it fits you needs.
Plus compared to your code I have have to pay the extra cost of calling the operator
__mod__
for%
when binding the string to its extra args. If you want speed from cython, it would be better to have this extra binding in a non standard def method so that you can bypass the call to python interpreter. Do you accept a little slow down (calling %) as a trade-off to avoid code and feature duplication ?
I don't think that there is any code or feature duplication.
Certainly there is no code duplication.
What you can do with AttributeErrorMessage
can not directly be done with a format string (lazy or not), because you need to distinguish two cases. And what you can do with lazy format strings can not be done with an AttributeErrorMessage
, since the only thing that the new class can do is to print itself as an error message. So, there is no feature duplication either.
Moreover, this ticket explicitly is about speed. A slow down by an extra call to mod would thus not be acceptable, IMO. In particular if that call makes the attribute access slower than with unpatched sage.
comment:9 in reply to: ↑ 7 ; follow-up: ↓ 10 Changed 10 years ago by
Replying to hivert:
Some more info: According to prun most of the time is spend during the copy in [mod].
...
I'm pretty sure that cythonizing properly this copy should give a large speedup. So the question is: should we try to optimize
LazyFormat
or do you rather have a hand tuned err-message to ensure proper backward compatibility ?
Backward compatibility is one thing.
In addition I don't think that you can come up to speed, even with cython, if you need to copy strings etc. The point of AttributeErrorMessage
is to assign two attributes (without copying, so, it just sets two pointers to existing objects) and nothing else.
Of course, you may try to tune the lazy format strings so that they are fast enough for the application here. But for the moment, I really don't see why one should not introduce a very slim new class (it only has __init__
and __repr__
), if it is faster and offers backward compatibility for free.
comment:10 in reply to: ↑ 9 Changed 10 years ago by
In addition I don't think that you can come up to speed, even with cython, if you need to copy strings etc. The point of
AttributeErrorMessage
is to assign two attributes (without copying, so, it just sets two pointers to existing objects) and nothing else.
Please have a look at the code... I'm just copying the object itself if it is bound with two different objects. Now using copy.copy from python is always awfully slow. So I'm basically creating a new object as you have to do.
Of course, you may try to tune the lazy format strings so that they are fast enough for the application here. But for the moment, I really don't see why one should not introduce a very slim new class (it only has
__init__
and__repr__
), if it is faster and offers backward compatibility for free.
If you tell me you won't use it, I'm less motivated to tune it finely ;-)
comment:11 Changed 10 years ago by
- Description modified (diff)
comment:12 Changed 9 years ago by
Ping!
For the patchbot (since it doesn't read the ticket description):
Apply trac11342-AttributeErrorMessage?.patch
comment:13 Changed 9 years ago by
On June 27, the patchbot applied both patches, even though on June 26 I pointed out that only one patch has to be applied.
Let's ty again...
Apply trac11342-AttributeErrorMessage?.patch
comment:14 Changed 9 years ago by
apply only trac11342-AttributeErrorMessage?.patch
comment:15 follow-up: ↓ 16 Changed 9 years ago by
Stupid auto-wikification apply only trac11342-AttributeErrorMessage.patch
comment:16 in reply to: ↑ 15 Changed 9 years ago by
Replying to robertwb:
Stupid auto-wikification
Thank you for your effort and for renaming the patch!
Now, the patchbot has still trouble, namely when applying #9944. That ticket was merged in sage-4.7.1.alpha2, but apparently it can not be applied to sage-4.7.
Hélas.
For the record: It should work on top of sage-4.7.1.rc2.
Changed 9 years ago by
Only apply this patch. Make attribute access faster on elements and parents: Create the error message only when needed
comment:17 Changed 9 years ago by
- Description modified (diff)
I had to rebase the patch, since it only applied with fuzz 1 against sage-4.7.1.rc1.
Still needs review.
Apply trac11342-attribute_error_message.rebased.patch
comment:18 follow-up: ↓ 19 Changed 9 years ago by
Sage-4.7.2.alpha1 segfaults with the patch applied:
[vbraun@volker-laptop-two sage]$ sage -gdb ---------------------------------------------------------------------- | Sage Version 4.7.2.alpha1, Release Date: 2011-08-17 | | Type notebook() for the GUI, and license() for information. | ---------------------------------------------------------------------- ********************************************************************** * * * Warning: this is a prerelease version, and it may be unstable. * * * ********************************************************************** /home/vbraun/Sage/sage/local/bin/sage-ipython GNU gdb (GDB) Fedora (7.3-41.fc15) Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /home/vbraun/opt/sage-4.7.2.alpha1/local/bin/python...done. [Thread debugging using libthread_db enabled] Python 2.6.4 (r264:75706, Aug 20 2011, 21:29:24) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)] on linux2 Type "help", "copyright", "credits" or "license" for more information. Traceback (most recent call last): File "/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.16-gdb.py", line 59, in <module> from libstdcxx.v6.printers import register_libstdcxx_printers File "/usr/lib64/../share/gcc-4.6.0/python/libstdcxx/v6/printers.py", line 19, in <module> import itertools ImportError: No module named itertools Program received signal SIGSEGV, Segmentation fault. PyObject_GetAttr (v=0x0, name=0x1038378) at Objects/object.c:1174 1174 Objects/object.c: No such file or directory. in Objects/object.c Missing separate debuginfos, use: debuginfo-install atlas-3.8.3-18.fc14.x86_64 expat-2.0.1-11.fc15.x86_64 fontconfig-2.8.0-3.fc15.x86_64 glibc-2.14-5.x86_64 keyutils-libs-1.2-7.fc15.x86_64 krb5-libs-1.9.1-5.fc15.x86_64 libcom_err-1.41.14-2.fc15.x86_64 libgcc-4.6.0-10.fc15.x86_64 libselinux-2.0.99-4.fc15.x86_64 libstdc++-4.6.0-10.fc15.x86_64 openssl-1.0.0d-1.fc15.x86_64 (gdb) bt #0 PyObject_GetAttr (v=0x0, name=0x1038378) at Objects/object.c:1174 #1 0x00007fffea27c2e1 in __pyx_pf_4sage_9structure_7element_7Element_3__getattr__ ( __pyx_v_name=0x12b1530, __pyx_v_self=0x10eb418) at sage/structure/element.c:2795 #2 __pyx_tp_getattro_4sage_9structure_7element_Element (n=0x12b1530, o=0x10eb418) at sage/structure/element.c:23398 #3 __pyx_tp_getattro_4sage_9structure_7element_Element (o=0x10eb418, n=0x12b1530) at sage/structure/element.c:23394 #4 0x00007ffff7d197ba in builtin_hasattr (self=<optimized out>, args=<optimized out>) at Python/bltinmodule.c:880 #5 0x00007ffff7d225b5 in call_function (oparg=<optimized out>, pp_stack=0x7ffffffeccc0) at Python/ceval.c:3706 #6 PyEval_EvalFrameEx (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2389 #7 0x00007ffff7d242c9 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2968 #8 0x00007ffff7cb0596 in function_call (func=0x1c57230, arg=0x2299488, kw=0x0) at Objects/funcobject.c:524 #9 0x00007ffff7c8a763 in PyObject_Call (func=0x1c57230, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492 #10 0x00007ffff7c973bf in instancemethod_call (func=0x1c57230, arg=0x2299488, kw=0x0) at Objects/classobject.c:2579 #11 0x00007ffff7c8a763 in PyObject_Call (func=0x1e725a0, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492 #12 0x00007fffeab5dab6 in __pyx_f_4sage_9structure_10parent_old_6Parent__coerce_c ( __pyx_v_self=0x1ca5650, __pyx_v_x=0x10eb418, __pyx_skip_dispatch=<optimized out>) at sage/structure/parent_old.c:3711 #13 0x00007fffeab5660e in __pyx_f_4sage_9structure_10parent_old_6Parent_has_coerce_map_from_c_impl ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240) at sage/structure/parent_old.c:4914 #14 0x00007fffeab55964 in __pyx_pf_4sage_9structure_10parent_old_6Parent_11has_coerce_map_from_impl ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240) at sage/structure/parent_old.c:4802 #15 0x00007ffff7c8a763 in PyObject_Call (func=0x1f95d88, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492 #16 0x00007fffeab58e41 in __pyx_f_4sage_9structure_10parent_old_6Parent_has_coerce_map_from_c ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240, __pyx_skip_dispatch=<optimized out>) at sage/structure/parent_old.c:4659 #17 0x00007fffeab54737 in __pyx_f_4sage_9structure_10parent_old_6Parent_coerce_map_from_c_impl ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240) at sage/structure/parent_old.c:2638 #18 0x00007fffeab55f64 in __pyx_pf_4sage_9structure_10parent_old_6Parent_2coerce_map_from_impl ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240) at sage/structure/parent_old.c:2283 #19 0x00007ffff7c8a763 in PyObject_Call (func=0x1f83e18, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492 #20 0x00007fffeab5a2a2 in __pyx_f_4sage_9structure_10parent_old_6Parent_coerce_map_from_c ( __pyx_v_self=0x1ca5650, __pyx_v_S=0x1089240, __pyx_skip_dispatch=<optimized out>) at sage/structure/parent_old.c:1887
comment:19 in reply to: ↑ 18 Changed 9 years ago by
- Status changed from needs_review to needs_work
Replying to vbraun:
Sage-4.7.2.alpha1 segfaults with the patch applied:
What? Then why does it work (at least for me) with sage-4.7.1.rc1?
So, homework for me: Build sage-4.7.2.alpha1.
I just noticed that this patch comes after #9138 in my private patch queue. #9138 makes all rings use the category framework, and the __getattr__
method we are dealing with here is related with the category framework. Could you test whether the segfault vanishes if you first have #9138?
comment:20 Changed 9 years ago by
Looking at the debug information more closely: parent_old is mentioned. To me, that makes it rather likely that #9138 is to blame (or to praise) for the fact that "at home" it works for me.
comment:21 Changed 9 years ago by
I get the same segfault with #9138, too.
comment:22 follow-up: ↓ 23 Changed 9 years ago by
The segfault is in the last line; is _category
Null?
/* "sage/structure/element.pyx":328 * if (name.startswith('__') and not name.endswith('_')) or self._parent._category is None: * raise AttributeError, AttributeErrorMessage(self, name) * return getattr_from_other_class(self, self._parent._category.element_class, name) # <<<<<<<<<<<<<< * * def __dir__(self): */ __Pyx_XDECREF(__pyx_r); __pyx_t_2 = __Pyx_GetName(__pyx_m, __pyx_n_s_8); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 328; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_GOTREF(__pyx_t_2); __pyx_t_1 = PyObject_GetAttr(((struct __pyx_obj_4sage_9structure_7element_Element *)__pyx_v_self)->_parent->__pyx_base._category, __pyx_n_s__element_class); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 328; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
comment:23 in reply to: ↑ 22 Changed 9 years ago by
Replying to vbraun:
The segfault is in the last line; is
_category
Null?...
- if (name.startswith('') and not name.endswith('_')) or self._parent._category is None:
The preceding line is supposed to test whether the category of self._parent is initialised. If a cdef attribute is not initialised then it is None. At least, so it is on all computers that I worked with.
- raise AttributeError?, AttributeErrorMessage?(self, name)
- return getattr_from_other_class(self, self._parent._category.element_class, name) # <<<<<<<<<<<<<<
And IF self._parent._category is not None then it has an element_class.
Very strange.
I just verified that on my machine and with sage-4.7.1.rc2 I do not get the segfault, even without #9138. So, I should definitely build sage-4.7.2.alpha1.
Cheers, Simon
comment:24 follow-up: ↓ 25 Changed 9 years ago by
I also get the segfault with sage-4.7.1.rc2...
comment:25 in reply to: ↑ 24 Changed 9 years ago by
Replying to vbraun:
I also get the segfault with sage-4.7.1.rc2...
What computer is it?
Could you insert a print statement, printing self._parent._category (or, at least printing the type of self._parent._category)? I that way, one could see what happens right in front of the segfault.
comment:26 Changed 9 years ago by
And I just noticed that I was working with sage-4.7.1.rc1, not rc2.
Anyway, I am building sage-4.7.2.alpha2 now. So, tomorrow I will probably know more...
comment:27 follow-up: ↓ 29 Changed 9 years ago by
I added some debug output and it dies here:
self = x^2 + 1 self._parent = Univariate Polynomial Ring in x over Rational Field self._parent._category = Category of euclidean domains self = x^2 + 1 self._parent = Univariate Polynomial Ring in x over Rational Field self._parent._category = Category of euclidean domains self = x^2 + 1 self._parent = Univariate Polynomial Ring in x over Rational Field self._parent._category = Category of euclidean domains self = x^2 + 1 self._parent = Univariate Polynomial Ring in x over Rational Field self._parent._category = Category of euclidean domains self = x^2 + 1 self._parent = Univariate Polynomial Ring in x over Rational Field self._parent._category = Category of euclidean domains self = 0 self._parent = None /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(print_backtrace+0x31)[0x7fb2f72cb242] /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(sigdie+0x14)[0x7fb2f72cb274] /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(sage_signal_handler+0x20c)[0x7fb2f72caec2] /lib64/libpthread.so.0[0x389c20eef0]
comment:28 Changed 9 years ago by
PS: Its a Thinkpad W520 running Fedora 15
comment:29 in reply to: ↑ 27 Changed 9 years ago by
Replying to vbraun:
self = 0 self._parent = None
Aha! So, it can not do self._parent._category, since the parent is not assigned!
self is zero. But what zero? Integer? Polynomial? Can you print its type?
But in either case, one could insert a test "if self._parent is not None". Unfortunately, any additional test decreases performance...
comment:30 Changed 9 years ago by
PS: The likely problem is that some attribute is requested before doing Element.__init__(self,...)
, whatever self is.
comment:31 follow-up: ↓ 32 Changed 9 years ago by
Looks like pari is to blame:
self = x^2 + 1 type = <type 'sage.rings.polynomial.polynomial_rational_flint.Polynomial_rational_flint'> self._parent = Univariate Polynomial Ring in x over Rational Field type = <class 'sage.rings.polynomial.polynomial_ring.PolynomialRing_field_with_category'> self._parent._category = Category of euclidean domains self = x^2 + 1 type = <type 'sage.rings.polynomial.polynomial_rational_flint.Polynomial_rational_flint'> self._parent = Univariate Polynomial Ring in x over Rational Field type = <class 'sage.rings.polynomial.polynomial_ring.PolynomialRing_field_with_category'> self._parent._category = Category of euclidean domains self = 0 type = <type 'sage.libs.pari.gen.gen'> self._parent = None type = <type 'NoneType'> /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(print_backtrace+0x31)[0x7fe0b9a3c242] /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(sigdie+0x14)[0x7fe0b9a3c274] /home/vbraun/opt/sage-4.7.1.rc2/local/lib/libcsage.so(sage_signal_handler+0x20c)[0x7fe0b9a3bec2] /lib64/libpthread.so.0[0x389c20eef0]
comment:32 in reply to: ↑ 31 Changed 9 years ago by
Replying to vbraun:
Looks like pari is to blame:
Great! Concerning "blame": I have never user "hg blame". Do you know how it works? Perhaps it could help to find out what patches between sage-4.7.1.rc1 and sage-4.7.1.rc2 have altered the init method of pari_gen. Or perhaps it is easier to look up the list of patches merged into rc2.
comment:33 Changed 9 years ago by
I checked the list of merged patches, but it did not give me a clue. But perhaps hg blame knows better.
comment:34 Changed 9 years ago by
In my Sage version, the init method of an object such as pari(x)
looks as follows:
def __init__(self): self.b = 0 self._parent = P self._refers_to = {}
That is sage/libs/pari/gen.pyx at line 391. How does it look in your Sage version?
P should be defined as
cdef PariInstance pari_instance, P pari_instance = PariInstance(16000000, 500000) P = pari_instance # shorthand notation
at lines 176 ff.
Could you test whether P is None when it is used in the __init__
method (lines 391 ff)?
comment:35 Changed 9 years ago by
I have the same pari __init__
. There were no nontrivial changes to pari between rc1 and rc2, and I get the same segfault on rc1 as well.
This looks like some initialization race / module initialization ordering issue.
comment:36 follow-ups: ↓ 37 ↓ 39 Changed 9 years ago by
- Cc jdemeyer added
Notice how Pari's PariInstance.__init__
generates elements like PariInstance.PARI_ZERO
. But during the construction of pari_instance
, it can't set the parent to anything non-null. Cython actually starts with None
for all uninitialized variables, so this is why the first few pari gen
s have None
as parent.
comment:37 in reply to: ↑ 36 Changed 9 years ago by
Replying to vbraun:
Notice how Pari's
PariInstance.__init__
generates elements likePariInstance.PARI_ZERO
. But during the construction ofpari_instance
, it can't set the parent to anything non-null. Cython actually starts withNone
for all uninitialized variables, so this is why the first few parigen
s haveNone
as parent.
Maybe a solution could be to decouple the true initialization of PARI and the generation of elements like PARI_ZERO
. Initializing PARI_ZERO
should be done after PariInstance
's __init__()
. Or we could probably completely remove PARI_ZERO
, ..., PARI_TWO
. They are almost never used anyway.
comment:38 Changed 9 years ago by
- Description modified (diff)
With the attached patch the segfault does not occur any more. I also tested that this was the only element whose parent was None.
comment:39 in reply to: ↑ 36 Changed 9 years ago by
Replying to vbraun:
Notice how Pari's
PariInstance.__init__
generates elements likePariInstance.PARI_ZERO
. But during the construction ofpari_instance
, it can't set the parent to anything non-null.
Yes it can! If I understand correctly, you say that the pari instance creates some elements while its own initialisation is not complete yet. Apparently it is supposed that there exists exactly one pari instance. Hence, if that instance is initialised, and if some elements are created during initialisation, then one can certainly assign self
to the ._parent
attribute of these elements.
But, frankly, the __init__
in sage/libs/pari/gen.pyx at line 391 is very improper. I mean, there is that pari instance P -- all elements are supposed to have an init method accepting a an argument parent
.
Anyway. I guess the fix for that problem will take place in the initialisation of the pari instance.
By the way, building sage-4.7.2.alpha2 was shorter than I thought. Problem: I do not get the segfault! Is the segfault supposed to happen when Sage starts? Or is it in some test?
I see that Volker already has a patch. It is a shame that I can still not reproduce the error, but I'll have a look at the new patch.
comment:40 Changed 9 years ago by
- Status changed from needs_work to needs_review
Yep, your patch looks nice. Probably I would have done
self.PARI_ZERO = self.new_gen_noclear(gen_0) self.PARI_ZERO._parent = self
but I think that'll work as well.
Jeroen, are you able to reproduce the segfault when only my patch is applied? Then perhaps you could review Volker's patch. To me, it seems reasonable that it fixes the problem, but I can not reproduce the problem.
comment:41 Changed 9 years ago by
Fixing up the element's parent in the parent ctor works here, but in general will lead you to elements with incomplete parents until the parent ctor finishes. This might lead to weird errors if you do anything non-trivial with the elements. I think that, as a rule of thumb, parent ctors should never create elements.
Its an implementation detail whether you get the segfault or not. There is a C struct with an uninitialized pointer, and dereferencing it may or may not segfault.
comment:42 Changed 9 years ago by
- Reviewers set to Jeroen Demeyer
Volker, could you wrap the initialization of PARI_ZERO
,... with sig_on()
/sig_off()
, just like it was before in __init__()
?
comment:43 Changed 9 years ago by
I've added the sig_on/sig_off in the patch.
comment:44 Changed 9 years ago by
I'm still getting segfaults. Does make ptest
complete for you? I get the following doctest errors
sage -t -force_lib .7.2.alpha1/devel/sage/sage/rings/pari_ring.py # Killed/crashed sage -t -force_lib .7.2.alpha1/devel/sage/sage/crypto/classical_cipher.py # Killed/crashed sage -t -force_lib .7.2.alpha1/devel/sage/sage/algebras/quatalg/quaternion_algebra.py # Killed/crashed sage -t -force_lib .7.2.alpha1/devel/sage/sage/modular/abvar/abvar.py # Killed/crashed sage -t -force_lib .7.2.alpha1/devel/sage/sage/modular/quatalg/brandt.py # Killed/crashed sage -t -force_lib .7.2.alpha1/devel/sage/sage/schemes/elliptic_curves/heegner.py # Killed/crashed
The first one is this:
sage: R = PariRing() sage: f = R('x^3 + 1/2') sage: f.dumps() Program received signal SIGSEGV, Segmentation fault. PyObject_GetAttr (v=0x0, name=0x10393e8) at Objects/object.c:1174 1174 Objects/object.c: No such file or directory. in Objects/object.c Missing separate debuginfos, use: debuginfo-install atlas-3.8.3-18.fc14.x86_64 expat-2.0.1-11.fc15.x86_64 fontconfig-2.8.0-3.fc15.x86_64 glibc-2.14-5.x86_64 keyutils-libs-1.2-7.fc15.x86_64 krb5-libs-1.9.1-5.fc15.x86_64 libcom_err-1.41.14-2.fc15.x86_64 libgcc-4.6.0-10.fc15.x86_64 libgfortran-4.6.0-10.fc15.x86_64 libquadmath-4.6.0-10.fc15.x86_64 libselinux-2.0.99-4.fc15.x86_64 libstdc++-4.6.0-10.fc15.x86_64 nss-softokn-freebl-3.12.10-2.fc15.x86_64 openssl-1.0.0d-1.fc15.x86_64 (gdb) bt #0 PyObject_GetAttr (v=0x0, name=0x10393e8) at Objects/object.c:1174 #1 0x00007fffea27c2e1 in __pyx_pf_4sage_9structure_7element_7Element_3__getattr__ (__pyx_v_name=0x7ffff7bdfab0, __pyx_v_self=0x4aebeb0) at sage/structure/element.c:2795 #2 __pyx_tp_getattro_4sage_9structure_7element_Element (n=0x7ffff7bdfab0, o=0x4aebeb0) at sage/structure/element.c:23398 #3 __pyx_tp_getattro_4sage_9structure_7element_Element (o=0x4aebeb0, n=0x7ffff7bdfab0) at sage/structure/element.c:23394 #4 0x00007ffff7cc4a0d in PyObject_GetAttrString (name=<optimized out>, v=0x4aebeb0) at Objects/object.c:1138 #5 PyObject_GetAttrString (v=0x4aebeb0, name=<optimized out>) at Objects/object.c:1129 #6 0x00007ffff7cdd733 in reduce_2 (obj=0x4aebeb0) at Objects/typeobject.c:3183 #7 0x00007ffff7cde048 in _common_reduce (proto=2, self=0x4aebeb0) at Objects/typeobject.c:3327 #8 object_reduce_ex (self=0x4aebeb0, args=<optimized out>) at Objects/typeobject.c:3389 #9 0x00007ffff7c8a763 in PyObject_Call (func=0x4a991b8, arg=<optimized out>, kw=<optimized out>) at Objects/abstract.c:2492 #10 0x00007fffee745d16 in save (self=0x79d788, args=0x4aebeb0, pers_save=0) at /home/vbraun/opt/sage-4.7.2.alpha1/spkg/build/python-2.6.4.p11/src/Modules/cPickle.c:2648 #11 0x00007fffee7475b8 in dump (self=0x79d788, args=0x4aebeb0) at /home/vbraun/opt/sage-4.7.2.alpha1/spkg/build/python-2.6.4.p11/src/Modules/cPickle.c:2714
comment:45 follow-up: ↓ 46 Changed 9 years ago by
Ok I looked at sage/rings/pari_ring.py
and it makes my eyes bleed... The elements don't call RingElement.__init__
so their parent is not set. Also, should use UniqueRepresentation
. But is anybody using that module?
comment:46 in reply to: ↑ 45 Changed 9 years ago by
Replying to vbraun:
Ok I looked at
sage/rings/pari_ring.py
and it makes my eyes bleed... The elements don't callRingElement.__init__
so their parent is not set. Also, should useUniqueRepresentation
. But is anybody using that module?
I guess, implicitly we are using it all: As we have seen, some pari objects are generated when sage starts. And you are right, it is evident that sage/rings/pari_ring.py has been written long time ago...
Do you think we should attempt to bring it up to date here? Or do it on a different ticket and just use the work-around provided by trac_11342_fix_pari_initialization.patch?
If we fix it here: Did you already start with it? Otherwise, I'd do it myself.
comment:47 Changed 9 years ago by
Volker, I am not getting the segfault you reported.
With the current two patches, I obtain:
sage: R = PariRing() sage: loads(R.dumps()) == R True sage: loads(R.dumps()) is R # I want to fix that! False sage: f = R('x^3 + 1/2') sage: f.dumps() 'x\x9ck`J.NLO\xd5+\xca\xccK/\xd6+H,\xca\x8c\x071\xb9\x02\x80,\xaeBF\xcd\xc6B&\xbf\xdaB\xe6P\x8ex\x90H||E!\x0bDCNf\x12D\xbd^zj^|A%W\x01X\x07k(gE\x9c\xb1\x82\xb6\x82\xa1\xbeQkP![q[\x92\x1e\x00\xbc9 \xf7'
So, the question is whether we should fix it here, or open a new ticket and make it a dependency.
comment:48 Changed 9 years ago by
- Description modified (diff)
I have already produced a patch on the pari ring problem. Volker, does it make your segfault vanish? At least, I think that my patch improves the code quality of pari_ring.py.
comment:49 follow-up: ↓ 50 Changed 9 years ago by
Your patch fixes the pari_ring
segfaults and looks great!
I've fixed the crash in sage/crypto/classical_cipher.py
in the patch that I just posted.
The next crash, in sage/algebras/quatalg/quaternion_algebra.py
, is more tricky. Ideals in quaternion algebras don't have their parent set. The QuaternionFractionalIdeal_rational
class derives from sage.ring.ideal
which assumes it is an ideal in a commutative ring, see sage.ring.ideal_monoid
.
comment:50 in reply to: ↑ 49 Changed 9 years ago by
Replying to vbraun:
Your patch fixes the
pari_ring
segfaults and looks great!
Thank you!
I've fixed the crash in
sage/crypto/classical_cipher.py
in the patch that I just posted.
OK, I'll have a look (but the problem is that I can't verify that it fixes a segfault, because I don't get a segfault.
The next crash, in
sage/algebras/quatalg/quaternion_algebra.py
, is more tricky. Ideals in quaternion algebras don't have their parent set. TheQuaternionFractionalIdeal_rational
class derives fromsage.ring.ideal
which assumes it is an ideal in a commutative ring, seesage.ring.ideal_monoid
.
Too bad. There is #11068, which introduces proper support for one- and twosided ideals in non-commutative rings. But that comes much later in my patch chain...
Perhaps we can use a temporary workaround: __getattr__
could not only test whether self._parent._category
is None, but whether self._parent
is None.
Of course, any additional test will affect the performance. But perhaps it is still acceptable? I think that would be the easiest solution, so, let's do some tests with it.
comment:51 Changed 9 years ago by
The crypto fix looks nice. I didn't run the doc tests yet, but "what could possibly go wrong"?
I think it should be used even if the segfaults could be used in a different way. That's to say, I will add if self._parent is not None
to __getattr__
and run all tests with both your new and my to-be-created patch.
comment:52 Changed 9 years ago by
- Description modified (diff)
I have added a new patch. Since any element has a parent, I do actually not test whether self._parent is None
. Instead, I do cdef Parent P = self._parent or self.parent()
.
That works, because self.parent()
is supposed to return something (even in cases of improper elements as we have met them in Pari). Moreover, it is fast, since the slow call self.parent()
will not be used if self._parent
is not None.
Since the parent needs to be retrieved anyway, my new patch essentially preserves the performance. So, the additional test is essentially for free.
I suppose that my patch would suffice to fix the problem. However, Volker's crypto patch improves the code quality and thus should be included as well.
For me, the tests pass when adding Volker's and my patch. But, after all, I did not suffer from the segfault. So, please see if it's fixed!
Apply trac11342-attribute_error_message.rebased.patch, trac_11342_fix_pari_initialization.patch, trac_11342_fix_pari_ring.patch, trac_11342_crypto_fixes.patch, trac11342_test_parent_on_getattr.patch
comment:53 follow-up: ↓ 54 Changed 9 years ago by
Thats not good enough because e.g. the ideals in the quaternion algebra return None
as
parent()
, too. Without this patch:
sage: R = QuaternionAlgebra(-11,-1).maximal_order() sage: R.right_ideal(R.basis()) Fractional ideal (1/2 + 1/2*j, 1/2*i + 1/2*k, j, k) sage: R.unit_ideal().parent() is None True
and with this ticket it still segfaults.
comment:54 in reply to: ↑ 53 Changed 9 years ago by
Replying to vbraun:
Thats not good enough because e.g. the ideals in the quaternion algebra return
None
as
parent()
, too.
Too bad. Then I'll try the more radical solution and test whether P is None.
comment:55 Changed 9 years ago by
And, by the way, even my patch on ideals in non-commutative rings wouldn't solve the problems with Quaternion algebra, since #11068 is about ideals, not fractional ideals.
comment:56 Changed 9 years ago by
I just updated my last patch. It should be absolutely airtight, since it is now explicitly tested whether the parent is None.
The timings, which is the main point of this ticket, are still fine. They did almost not change compared with the old patches, but they are much better than with unpatched sage-4.7.2.alpha2:
- non-existing private attributes
sage: a = 1 sage: timeit('try: QQ.__bla\nexcept: pass') 625 loops, best of 3: 10.8 µs per loop #unpatched: 15 µs per loop sage: timeit('try: a.__bla\nexcept: pass') 625 loops, best of 3: 9.07 µs per loop #unpatched: 13.9 µs per loop sage: timeit('try: QQ.__bla_\nexcept: pass') 625 loops, best of 3: 14.3 µs per loop #unpatched: 22.8 µs per loop
- non-existing non-private attributes
sage: timeit('try: QQ.__bla_\nexcept: pass') 625 loops, best of 3: 14.2 µs per loop #unpatched: 23 µs per loop sage: timeit('try: a.__bla_\nexcept: pass') 625 loops, best of 3: 17.3 µs per loop #unpatched: 23.2 µs per loop sage: timeit('try: QQ.bla\nexcept: pass') 625 loops, best of 3: 13.9 µs per loop #unpatched: 22.6 µs per loop sage: timeit('try: a.bla\nexcept: pass') 625 loops, best of 3: 16.6 µs per loop #unpatched: 25.9 µs per loop
- existing attributes inherited from the category
sage: timeit('try: QQ.sum\nexcept: pass',number=10^5) 100000 loops, best of 3: 744 ns per loop #unpatched: 754 ns per loop sage: timeit('try: a.cartesian_product\nexcept: pass',number=10^5) 100000 loops, best of 3: 1.68 µs per loop #unpatched: 3.97 µs per loop
I hope that the segfaults are finally gone!!
Apply trac11342-attribute_error_message.rebased.patch, trac_11342_fix_pari_initialization.patch, trac_11342_fix_pari_ring.patch, trac_11342_crypto_fixes.patch, trac11342_test_parent_on_getattr.patch
comment:57 Changed 9 years ago by
comment:58 Changed 9 years ago by
- Reviewers changed from Jeroen Demeyer to Jeroen Demeyer, Volker Braun
- Status changed from needs_review to positive_review
We should probably merge this ticket first since it is more low-level. I'm happy with it, so positive review for everything except trac_11342_crypto_fixes.patch
and trac_11342_fix_pari_initialization.patch
(which I wrote). Simon and Jeroen already reviewed those somewhat trivial patches, so everything is good to go.
comment:59 Changed 9 years ago by
If it is needed (one could argue that your patches are small enough to be reviewer patches), I give your patches a positive review as well.
comment:60 Changed 9 years ago by
- Reviewers changed from Jeroen Demeyer, Volker Braun to Jeroen Demeyer, Volker Braun, Simon King
comment:61 Changed 9 years ago by
- Merged in set to sage-4.7.2.alpha3
- Resolution set to fixed
- Status changed from positive_review to closed
Make attribute access faster on elements and parents