Opened 9 years ago

Last modified 5 years ago

#11115 closed enhancement

Rewrite cached_method in Cython — at Version 41

Reported by: SimonKing Owned by: jason
Priority: major Milestone: sage-5.0
Component: misc Keywords: category cython cache
Cc: nthiery Merged in:
Authors: Simon King Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #9976 Stopgaps:

Description (last modified by SimonKing)

Broadening the original description of the ticket:

The aim is to provide a Cython version of the cached_method decorator. There are three benefits:

1) Speed (see timings in the comments)

2) Using cached methods in Cython code.

3) Parent and element methods of categories

Let me elaborate on the last point:

In order to make full use of the parent and element methods of a category, it should be possible to define a cached method in the category, so that any object of that category inherits it with caching.

Currently, it fails as follows:

sage: class MyCategory(Category):
....:     def super_categories(self):
....:         return [Objects()]
....:     class ParentMethods:
....:         @cached_method
....:         def f(self, x):
....:             return x^2
....:
sage: cython("""from sage.structure.parent cimport Parent
....: cdef class MyClass(Parent): pass
....: """)
sage: O = MyClass(category=MyCategory())
sage: O.f(4)
16
sage: O.f(x) is O.f(x)
False

So, the method is inherited, but not cached.

Depends on #9976

Apply trac11115-cached_cython.patch

Change History (42)

comment:1 follow-ups: Changed 9 years ago by SimonKing

The reason is as follows (compare #8611):

The cached method f has a getter method, that returns a CachedMethodCaller and tries to assign it to O as an attribute. That assignment fails. Hence, whenever f is called again, a new instance of the CachedMethodCaller is created. Note that before #8611, that was always the case (not very efficient).

Moreover, the cached method would try to cache its return values in a dictionary assigned to O - again, it fails. So, caching can not work.

My suggestion is as follows.

We want, in particular, that caching works for the base classes Element and Parent. I propose to provide them with a public cdef method __cached_methods (a dictionary), that is used by the getter method of the cached method to store the resulting CachedMethodCaller, if storing it as an attribute fails.

Concerning the cache for storing the return values of f: It can not be stored in a dict as an attribute of O. But by #8611, there is a dict O.f.cache that is used for storing anyway.

So, in a nutshell, it simply works (and I already have a patch that only remains to be documented).

I tried to extend that approach to SageObject (from which both Parent and Element inherit), but providing SageObject with cdef attributes will not work, since in some places there is a double inheritance, e.g., from SageObject and list.

comment:2 in reply to: ↑ 1 Changed 9 years ago by SimonKing

Replying to SimonKing:

So, in a nutshell, it simply works (and I already have a patch that only remains to be documented).

... and documentation is hard, in particular for building the reference manual. See #9976.

The idea is:

  • Modify doc/common/builder.py so that it uses the utilities from sage.misc.sageinspect rather than the utilities from the inspect module, and modify the utilities from sage.misc.sageinspect so that they work on classes and methods (currently, they only work on class instances). That's #9976.
  • Use Cython for cached functions and methods -- it relies on #9976, since otherwise the doc won't build correctly.

I guess I will be able to submit patches to both tickets tomorrow.

comment:3 Changed 9 years ago by SimonKing

  • Authors set to Simon King
  • Description modified (diff)
  • Status changed from new to needs_review

It took a bit longer than I originally expected, but I think it was worth it.

New Features

  • An instance of a class deriving from Parent or Element can inherit a cached_method from its category, without breaking the cache, even if it does not allow attribute assignment.
  • The cached_method decorator can, to some extent, be used in Cython code.
  • Cached_method is a lot faster now. In fact, using the cached_method decorator on a Python class is faster than a hand-written cache for a Python method, provided that the arguments are given by name and not by position.

Examples

Python

The following code defines a category, and a Parent class that has a method with a hand-written cache and a corresponding cached method inherited from the category:

from sage.all import cached_method, Category, Objects, Parent

class MyCategory(Category):
    def super_categories(self):
        return [Objects()]
    class ParentMethods:
        @cached_method
        def cached_by_category(self, x=100, y=-1):
            return x*y

class MyPythonClass(Parent):
    def __init__(self):
        self._cache = {}
        Parent.__init__(self, category=MyCategory())
    def cached_by_python(self, x=100, y=-1):
        try:
            return self._cache[x,y]
        except KeyError:
            out = self._cache[x,y] = x*y
        return out

We do some sanity tests, and then show that the cached method is faster than the hand-written cache, unless we provide arguments by name:

sage: O = MyPythonClass()
sage: O.cached_by_category() is O.cached_by_category(100) is O.cached_by_category(x=100) is O.cached_by_category(100,-1)
True
sage: O.cached_by_python(y=1) == O.cached_by_category(y=1)
True
sage: O.cached_by_python() == O.cached_by_category()
True
sage: O.cached_by_python() is O.cached_by_python(100) is O.cached_by_python(x=100) is O.cached_by_python(100,-1)
True

Here are timings for the hand-knitted cache:

sage: timeit("O.cached_by_python()", number=10^6)
1000000 loops, best of 3: 630 ns per loop
sage: timeit("O.cached_by_python(100)", number=10^6)
1000000 loops, best of 3: 970 ns per loop
sage: timeit("O.cached_by_python(y=-1)", number=10^6)
1000000 loops, best of 3: 1.1 µs per loop
sage: timeit("O.cached_by_python(100,-1)", number=10^6)
1000000 loops, best of 3: 1.31 µs per loop

and here are the corresponding timings for the cached method inherited from the category:

sage: timeit("O.cached_by_category()", number=10^6)
1000000 loops, best of 3: 314 ns per loop
sage: timeit("O.cached_by_category(100)", number=10^6)
1000000 loops, best of 3: 954 ns per loop
sage: timeit("O.cached_by_category(y=-1)", number=10^6)
1000000 loops, best of 3: 1.93 µs per loop
sage: timeit("O.cached_by_category(100,-1)", number=10^6)
1000000 loops, best of 3: 1.06 µs per loop

Cython

You can not use arbitrary decorators in Cython. But it is now possible to wrap a Cython function by the cached_method and assign it as a method - one needs to explicitly provide its name, though. In addition, we provide a hand-written cache programmed in Cython.

from sage.structure.parent cimport Parent
from sage.all import cached_method

cpdef test_func(self,x=100, y=-1):
    return x*y

cdef class MyCythonClass(Parent):
    cdef dict _cache
    def __init__(self, category):
        self._cache={}
        Parent.__init__(self,category=category)
    cached_by_decorator = cached_method(test_func, name="cached_by_decorator")
    cpdef cached_by_cython(self,x=100,y=-1):
        try:
            return self._cache[x,y]
        except KeyError:
            out = self._cache[x,y] = x*y
        return out

It is a Parent class, and thus it can inherit parent methods from a category. Without the patch, the cache of an inherited cached_method would break, but now it is fine:

sage: C = MyCythonClass(MyCategory())
sage: C.cached_by_category(y=-1) is C.cached_by_category(100,-1)
True
sage: C.cached_by_decorator(y=-1) is C.cached_by_decorator(100,-1)
True
sage: C.cached_by_decorator(y=-1) == C.cached_by_category(100,-1)
True

The trick is that I introduced an attribute __cached_methods for Parent and Element, in which a cached method can be stored. The cache is (since #8611) stored as an attribute of the bound cached method.

While it is nice that cached_method works at all in Cython, the performance is not as good as I wish. Here are the times for the hand-knitted cache written in Cython:

sage: timeit("C.cached_by_cython()", number=10^6)
1000000 loops, best of 3: 242 ns per loop
sage: timeit("C.cached_by_cython(100)", number=10^6)
1000000 loops, best of 3: 538 ns per loop
sage: timeit("C.cached_by_cython(y=-1)", number=10^6)
1000000 loops, best of 3: 750 ns per loop
sage: timeit("C.cached_by_cython(100,-1)", number=10^6)
1000000 loops, best of 3: 882 ns per loop

Here for the cached_method inherited from the category:

sage: timeit("C.cached_by_category()", number=10^6)
1000000 loops, best of 3: 754 ns per loop
sage: timeit("C.cached_by_category(100)", number=10^6)
1000000 loops, best of 3: 1.62 µs per loop
sage: timeit("C.cached_by_category(y=-1)", number=10^6)
1000000 loops, best of 3: 2.77 µs per loop
sage: timeit("C.cached_by_category(100,-1)", number=10^6)
1000000 loops, best of 3: 1.76 µs per loop

And here using the decorator in Cython code:

sage: timeit("C.cached_by_decorator()", number=10^6)
1000000 loops, best of 3: 421 ns per loop
sage: timeit("C.cached_by_decorator(100)", number=10^6)
1000000 loops, best of 3: 1.02 µs per loop
sage: timeit("C.cached_by_decorator(y=-1)", number=10^6)
1000000 loops, best of 3: 1.96 µs per loop
sage: timeit("C.cached_by_decorator(100,-1)", number=10^6)
1000000 loops, best of 3: 1.07 µs per loop

Conclusion

The patch provides a considerable speed-up in a case where cached_method used before, so that cached_method is not only convenient but efficient. Also, it enables to use cached_method in a broader context; here, it is not particularly efficient, but it is convenient.

We want that decorated (in particular, cached) methods appear nicely in introspection and in the reference manual. Therefore, I like to have:

Depends on #9976

comment:4 Changed 9 years ago by SimonKing

  • Description modified (diff)
  • Summary changed from Allow for extension classes to inherit cached methods from their category to Rewrite cached_method in Cython

comment:5 in reply to: ↑ 1 ; follow-up: Changed 9 years ago by novoselt

Replying to SimonKing:

I tried to extend that approach to SageObject (from which both Parent and Element inherit), but providing SageObject with cdef attributes will not work, since in some places there is a double inheritance, e.g., from SageObject and list.

What exactly is the problem with double inheritance? It would be nice if all SageObjects had a fast uniform cache...

comment:6 in reply to: ↑ 5 Changed 9 years ago by SimonKing

Replying to novoselt:

... What exactly is the problem with double inheritance? It would be nice if all SageObjects had a fast uniform cache...

The problem is that you can not have double inheritance from extension classes if Python does not know whether the underlying data structures are compatible. So, if you have

cdef class SageObject:
    pass

(which is currently the case) then it is fine, such as here:

sage: cython('cdef class MySimpleObject: pass')
sage: class C(list,MySimpleObject): pass
....: 

But you get an error if you want to put more structure in it

sage: cython('''cdef class MyObject:
....:     cdef dict __cached_methods
....: ''')
sage: class C(list,MyObject): pass
....: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/mnt/local/king/SAGE/broken/devel/sage-main/<ipython console> in <module>()

TypeError: Error when calling the metaclass bases
    multiple bases have instance lay-out conflict

comment:7 Changed 9 years ago by SimonKing

Here is another data point:

sage: class MyClass:
....:     def __init__(self, v):
....:         self.__v = v
....:     def v(self):
....:         return self.__v
....:     @cached_method
....:     def cached_v(self):
....:         return self.__v
....:     def derived_from_underscore(self):
....:         x = self.__v^2
....:     def derived_from_cache(self):
....:         x = self.cached_v()^2
....:     
sage: O = MyClass(100)
sage: O.v()
100
sage: O.cached_v()
100
sage: timeit('O.v()')
625 loops, best of 3: 599 ns per loop
sage: timeit('O.cached_v()')
625 loops, best of 3: 425 ns per loop

There are methods that are frequently called and just return a double-underscore attribute. It would be faster to achieve the same with a cached method! I guess this is because of the name mangling that takes place for double-underscore attributes.

However, that only holds when calling a method. Inside of a method, the double-underscore seems to be quicker:

sage: timeit('O.derived_from_underscore()')
625 loops, best of 3: 1.65 µs per loop
sage: timeit('O.derived_from_cache()')
625 loops, best of 3: 1.98 µs per loop

I think it would be worth trying to use this trick: Little methods returning a double-underscore attribute appear all over the place. See #9138 for one example.

comment:8 Changed 9 years ago by SimonKing

  • Status changed from needs_review to needs_work

There is one problem: The startup time increases.

I suspect this is because I extended the __getattr__ methods of parents and elements, so that it became slower -- slightly, I thought, but apparently noticeable. The original reason for doing so was an attempt to make a cached method available by attribute access, rather than by means of the __get__ methods of the CachedMethod class. I have to think of something better.

comment:9 Changed 9 years ago by SimonKing

I did some tests. I reverted the __getattr__ methods, so that the only change was that CachedMethod was not a Python but a Cython class. But I still saw an increased startup time.

That is strange. Is it the case that loading a Cython class takes longer than loading a Python class.

comment:10 Changed 9 years ago by SimonKing

One more idea: Currently, a cached method will modify its own __doc__ attribute, and some care is taken that it gets the correct one. Is it really needed to store the documentation? After all, it can easily be retrieved from the wrapped function or method!

Certainly it does not happen very frequently that the documentation of a method (cached or not) is requested: It is when the documentation is built, and it is when the user wants to learn something.

So, computing and storing the documentation during initialisation of a cached method may be a waste of time and also a waste of memory. Instead, I suggest that the documentation is only computed in the method self._sage_doc_(). I am now testing if that helps, startuptimewise.

comment:11 Changed 9 years ago by SimonKing

  • Status changed from needs_work to needs_review

I think the problem with the startup time is solved. Indeed, it turned out that the time was wasted by creating doc strings for cached methods.

Next, I'll do some more tests. If it can be confirmed that a Python method returning a double-underscore attribute is slower than a cached_method that does the same, then I will try to speed Sage up in general, by using cached methods extensively. I wonder what will come out...

Still:

Depends on #9976

comment:12 Changed 9 years ago by SimonKing

In order to be on the safe side, I made some timings for the patches from #9138 (on top of #9944), and I'd like to repeat these timings here, with this patch applied additionally.

$ ./sage -startuptime
...
== Slowest (including children) ==
1.244 sage.all (None)
0.280 sage.schemes.all (sage.all)
0.243 sage.misc.all (sage.all)
0.162 elliptic_curves.all (sage.schemes.all)
...

So, there is no slow-down, compared with the timings from an unpatched Sage on the same machine, as reported on #9138.

Here is an arithmetical example:

sage: B.<t> = PolynomialRing(Integers(125))
sage: R = monsky_washnitzer.SpecialCubicQuotientRing(t^3 - t + B(1/4))
sage: x, T = R.gens()
sage: timeit('x*T')
625 loops, best of 3: 893 µs per loop

That is about the same as with the patches from #9944 and #9138, but slower than with an unpatched Sage.

I found that one can use the improved cached_methods to considerably speed up the arithmetics. Namely, in the multiplication, the method modulus() from sage.rings.polynomial.polynomial_ring is called repeatedly. The modulus() method returns ´self.basering().characteristic()`. When I put that method under a cached_method decorator, the timing looks much better:

sage: B.<t> = PolynomialRing(Integers(125))
sage: R = monsky_washnitzer.SpecialCubicQuotientRing(t^3 - t + B(1/4))
sage: x, T = R.gens()
sage: timeit('x*T')
625 loops, best of 3: 253 µs per loop

That is much faster than without patches, which is 501 µs!!.

Currently, I try to find some places where using a cached_method might be beneficial. As I have demonstrated in previous posts, there are chances to speed methods up that do nothing more but "return self.__some_argument".

But independent of that possible second patch, I think the first patch can be reviewed.

comment:13 follow-up: Changed 9 years ago by SimonKing

I improved the performance of accessing the cache even further.

It is quite common to have methods that do not take arguments but whose return value should be cached. Typical example is the set of generators of a multivariate polynomial ring. Since sage-4.7.alpha5, it is a cached method.

Obviously, if a method takes no arguments, then caching the single value is much easier than storing the results in a dictionary for different arguments. I implemented this special case in a class CachedMethodCallerNoArgs. It is automatically used if @cached_method is a applied to a method without arguments. In particular, one does not need to do changes in the code.

Here is the performance. Bot I.groebner_basis and I.gens are cached methods (you need to take sage-4.7.alpha5 for the example):

sage: P.<a,b,c,d> = QQ[]
sage: I = P*[a,b]
sage: I.gens()
[a, b]
sage: I.groebner_basis()
[a, b]
sage: type(I.gens)
<type 'sage.misc.cachefunc.CachedMethodCallerNoArgs'>
sage: type(I.groebner_basis)
<type 'sage.misc.cachefunc.CachedMethodCaller'>
sage: timeit('I.gens()',number=10^6)
1000000 loops, best of 3: 170 ns per loop
sage: timeit('I.groebner_basis()',number=10^6)
1000000 loops, best of 3: 250 ns per loop

That is much faster than with an unpatched sage-4.7.alpha5:

sage: P.<a,b,c,d> = QQ[]
sage: I = P*[a,b]
sage: I.groebner_basis()
[a, b]
sage: I.gens()
[a, b]
sage: timeit('I.gens()',number=10^6)
1000000 loops, best of 3: 699 ns per loop
sage: timeit('I.groebner_basis()',number=10^6)
1000000 loops, best of 3: 746 ns per loop

To give you an idea of how fast 170 ns or 250 ns are:

sage: class MyClass:
....:     def __init__(self):
....:         self._v = 10
....:         self.__v = 10
....:     def m0(self):
....:         return 10
....:     def m1(self):
....:         return self._v
....:     def m2(self):
....:         return self.__v
....:     
sage: O = MyClass()
sage: timeit('O.m0()')
625 loops, best of 3: 1.01 µs per loop
sage: timeit('O.m1()')
625 loops, best of 3: 622 ns per loop
sage: timeit('O.m2()')
625 loops, best of 3: 621 ns per loop

Note that the cache is pickled -- see examples in the doc strings.

There is one further extension. There is a class sage.misc.cachefunc.ClearCacheOnPickle, whose purpose is quite nice: If you have a class that inherits from clearCacheOnPickle then the cached values will not be pickled.

That is to say: Let X be an instance of ClearCacheOnPickle? and assume that its pickling is implemented using the __get_newargs__ and __get_state__ mantra. Then, the cache of any cached method of loads(dumps(X)) will be empty (but of course the cache of X is preserved).

The idea existed, but it did not work for me. So, I re-implemented it essentially from scratch, using the original ideas. Also, I provided doc tests for ClearCacheOnPickle; there had been none before.

On top of sage-4.7.alpha5, I have the patches from #10296, #9944, #9138 and #9976, and then all doc tests pass with the patch from here. But there should only be one dependency, namely of #9976, which provides introspection to interactive Cython code.

Depends on #9976

comment:14 Changed 9 years ago by SimonKing

I just updated the patch, because I noticed the following: If you have an old pickle of an object with a cached method then of course the pickle will contain the cached value. If you unpickle then it may be that an old CachedMethodCaller becomes a new CachedMethodCallerNoArgs.

It would thus be nice if CachedMethodCallerNoArgs could retrieve the cache value of an old pickled CachedMethodCaller. This is implemented in the new patch.

What I did to test it:

  1. Without the patch applied, do
    sage: P.<a,b,c,d> = QQ[]
    sage: I = P*[a,b]
    sage: I.gens()  # this is in order to fill the cache - it is a CachedMethodCaller
    sage: save(I,'pickletest')
    
  1. With the patch applied, do
    sage: I = load('pickletest')
    sage: I.gens.cache   # this is now a CachedMethodCallerNoArgs
    [a, b]
    

So, the cache is already filled after unpickling.

comment:15 Changed 9 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Wait for #9138

While trying to fix doc tests, I noticed that there are too many non-initialised rings left that are not covered by the patch from #9138. I think it is more clean to fix it there, not here. Since #9138 needs work, this needs work as well.

comment:16 in reply to: ↑ 13 ; follow-ups: Changed 9 years ago by nthiery

Replying to SimonKing:

Obviously, if a method takes no arguments, then caching the single value is much easier than storing the results in a dictionary for different arguments. I implemented this special case in a class CachedMethodCallerNoArgs. It is automatically used if @cached_method is a applied to a method without arguments. In particular, one does not need to do changes in the code.

Yeah! I have been dreaming of this performance improvement since 2008!

Altogether, my only reluctance on this patch is about the addition of an attribute to all Sage elements. This sounds like a non trivial increase of the memory footprint for small objects like finite field elements. I don't have an opinion about this myself, but this definitely requires a discussion on sage-devel.

Parents are big objects, so this is not an issue for those.

Simon: you are doing way too much good stuff for me to follow! Honestly, I feel a bit lost about where to start. Do you mind making a prioritized list of patches that need review, say on the category roadmap [1]?

Thanks!

Cheers,

Nicolas

[1] http://trac.sagemath.org/sage_trac/wiki/CategoriesRoadMap

comment:17 in reply to: ↑ 16 ; follow-up: Changed 9 years ago by SimonKing

Replying to nthiery:

Replying to SimonKing: Altogether, my only reluctance on this patch is about the addition of an attribute to all Sage elements.

I did that for symmetry: Parents and Elements both accept cached methods. Cached methods only work for an object O if it either accepts attribute assignment, or provides a special location that is inspected by __getattr__. Otherwise, you need to reconstruct the CachedMethodCaller over and over again, which is a waste of time and also means that, e.g., "R.zero_ideal" would always be a different instance. Thus, "R.zero_ideal.cache" (which then also can not be stored as an attribute of R) is always different as well: The cache would break.

But of course, we could argue that cached methods are not of central importance for elements. So, we may decide to restrict support of cached methods to those elements that allow for attribute assignment. Then, we could remove the new __cached_methods attribute from sage.structure.element.Element. I could live with that restricted use of cached methods for elements.

This sounds like a non trivial increase of the memory footprint for small objects like finite field elements.

If no cached methods are used (more precisely: if they are not called), then __cached_methods should just be None (i.e., nothing more but a pointer, if I understand correctly). So, that's one word per object. If you do call cached methods then of course you need to store them somewhere.

Simon: you are doing way too much good stuff for me to follow! Honestly, I feel a bit lost about where to start. Do you mind making a prioritized list of patches that need review, say on the category roadmap [1]?

My main aim is the letterplace wrapper #7797.

It needs support for ideals in non-commutative rings. That is #11068.

Since there are rings that do not inherit from sage.rings.ring.Ring, we need to move (or duplicate?) methods from there in sage.categories.rings.Rings.ParentMethods?. That includes cached methods -- but there are rings that do not allow attribute assignment. This is why I originally opened this patch, #11115 (in the beginning, it was not so much about performance).

In order to have cached methods (in particular the argspec) appear nicely in the documentation, we need #9976, which already has a positive review.

Moving stuff from sage.rings.ring.Ring to sage.categories.rings.Rings.ParentMethods? instead of duplicating can only work if all rings use the category framework. That's the purpose of #9138, with a small part already done in #9944.

So, my personal roadmap is simply:

  #9944            #9976
  #9138           #11115
         #11068
         #7797

After that, I can finally start with my main project: Compute Ext algebras for finite dimensional path algebra quotients.

comment:18 in reply to: ↑ 16 ; follow-up: Changed 9 years ago by SimonKing

Replying to nthiery:

[1] http://trac.sagemath.org/sage_trac/wiki/CategoriesRoadMap

I inserted my favourite tickets into the section "Later patches related with or using categories:" (renaming that section, since not all of them use categories, but all of them relate with the category framework).

The tickets depend on each other, thus, the priorities are implicit.

comment:19 in reply to: ↑ 17 Changed 9 years ago by nthiery

Replying to SimonKing:

Replying to nthiery:

Replying to SimonKing: Altogether, my only reluctance on this patch is about the addition of an attribute to all Sage elements.

I did that for symmetry: ...

Yes, I definitely see the motivations and benefits! I am just saying that adding one word to all elements is not a light decision that we can take unilaterally.

Cheers,

Nicolas

comment:20 in reply to: ↑ 18 ; follow-up: Changed 9 years ago by nthiery

Replying to SimonKing:

Replying to nthiery:

[1] http://trac.sagemath.org/sage_trac/wiki/CategoriesRoadMap

I inserted my favourite tickets into the section "Later patches related with or using categories:" (renaming that section, since not all of them use categories, but all of them relate with the category framework).

The tickets depend on each other, thus, the priorities are implicit.

Thanks! By the way, did you include there all the patches about morphisms?

After that, I can finally start with my main project: Compute Ext algebras for finite dimensional path algebra quotients.

:-)

Cheers,

Nicolas

comment:21 in reply to: ↑ 20 Changed 9 years ago by SimonKing

Replying to nthiery:

Thanks! By the way, did you include there all the patches about morphisms?

I completely forgot about them, and probably I would not be able to work of them in the near future (1 month or so).

comment:22 Changed 9 years ago by SimonKing

  • Status changed from needs_work to needs_info
  • Work issues Wait for #9138 deleted

I am about to ask sage-devel about introducing an additional attribute to elements. #9138 should be fine now, so, I change the status from "needs work" to "needs info" (from sage-devel).

Depends on #9976

comment:23 Changed 9 years ago by SimonKing

Unfortunately, there was no reaction from sage-devel, and my question has already disappeared from the first screen.

I see an alternative to a new attribute __cached_methods for all elements: Since that attribute will only be used if the element does not allow attribute assignment, and since only few element classes use cached methods, it might be a good compromise to introduce the __cached_methods attribute only in those derived classes that really need it.

Hence, you would write

cdef class MyElementClass(Element):
    cdef public dict __cached_methods
    cdef ...
    def __init__(...)

if MyElementClass does not accept attribute assignment but is supposed to inherit cached methods from the category framework.

I do suggest to keep the additional attribute for sage.structure.parent.Parent, though. As you said, parents are big and rare enough that an additional attribute is no real burden.

comment:24 Changed 9 years ago by SimonKing

It occured to me that I did not yet provide an example to show cached methods for elements that do not allow attribute assignment:

sage: cython("""from sage.structure.element cimport Element
cdef class MyElement(Element):
    cdef object x
    def __init__(self,P,x):
        self.x=x
        Element.__init__(self,P)
    def __neg__(self):
        return MyElement(self.parent(),-self.x)
    def _repr_(self):
        return '<%s>'%self.x
    def raw_test(self):
        return -self
""")
sage: class MyCategory(Category):
    @cached_method
    def super_categories(self):
        return [Objects()]
    class ElementMethods:
        @cached_method
        def test_cache(self):
            return -self
sage: class MyParent(Parent): pass
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: e = MyElement(P,5)
sage: e
<5>
sage: -e
<-5>
# This is inherited from the category, ...
sage: e.test_cache()
<-5>
# ... and the cache works, ...
sage: e.test_cache() is e.test_cache()
True
# ... even though you can't set attributes:
sage: e.bla = 1
Traceback (most recent call last):
...
AttributeError: '_mnt_local_king__sage_temp_mpc622_25602_tmp_1_spyx_0.MyElement' object has no attribute 'bla'
# By lack of "setattr", the cache is not as fast as it could be, ...
sage: timeit('e.test_cache()')
625 loops, best of 3: 1.62 µs per loop
# ... but it is faster than to create a new element
sage: timeit('e.raw_test()')
625 loops, best of 3: 1.78 µs per loop

At sage-devel, Martin Raum suggested to time element creation (if I understood his suggestion correctly).

I found without patch:

sage: K = GF(101)
sage: get_memory_usage()
839.96484375
sage: %time for i in xrange(10^7): a = K(i)
CPU times: user 25.19 s, sys: 0.01 s, total: 25.21 s
Wall time: 25.20 s
sage: get_memory_usage()
839.96484375 

and with patch

sage: K = GF(101)
sage: get_memory_usage()
841.96875
# show that the new attribute is there, but it is non-initialized
sage: a=K(4)
sage: print a.__cached_methods
None
sage: %time for i in xrange(10^7): a = K(i)
CPU times: user 24.46 s, sys: 0.00 s, total: 24.46 s
Wall time: 24.46 s
sage: get_memory_usage()
841.96875 

So the time actually improved with the patch, and the memory usage at starting sage increased by as much as 0.24% - I hope that can be afforded.

comment:25 Changed 9 years ago by SimonKing

  • Status changed from needs_info to needs_review

There was rather little feedback on sage-devel.

The example above shows that the increase of memory consumption - at least directly after starting sage - is certainly tolerable. The performance gain is obvious.

There remains the question what we shall do with elements versus element class of a category. Options:

1) Do what the patch suggest. Then, every instance of Element can inherit a cached method from the element class of a category, regardless whether it accepts attribute assignment or not. The cached method will be more efficient if you can assign attributes, but at least the cache won't break.

2) Do not support the inheritance of cached methods from the category to elements that don't support attribute assignment.

3) A combination of 1) and 2): If the user defines an attribute __cached_methods for its element, then it could inherit cached methods from the element class of a category.

I already stated what I'd prefer.

By the way, I just noticed that elements can currently (with or without the patch) not inherit a cached_in_parent_method. Reason:

sage: class MyCategory(Category):
....:     def super_categories(self):
....:         return [Objects()]
....:     class ElementMethods:
....:         @cached_in_parent_method
....:         def test_cache(self):
....:             return -self
....:         
sage: C = MyCategory()
sage: C.element_class.test_cache
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'parent'

I think I could make it work. Should I do that here, or better on a different ticket (which depends on this)?

comment:26 Changed 9 years ago by SimonKing

  • Status changed from needs_review to needs_work

It turned out that the change is very small. So, I'll add a doc test and then submit a new patch here.

comment:27 Changed 9 years ago by SimonKing

  • Status changed from needs_work to needs_review

The new patch is posted. Apart from fixing the categorical trouble with cached_in_parent methods, it turns several doc strings into raw strings (they contain backslashes). Please look at the documentation (I'll build it as well)!

Here is the new doc test, demonstrating that cached_in_parent works with categories:

sage: cython_code = ["from sage.structure.parent cimport Parent",
... "from sage.structure.element cimport Element",
... "from sage.all import Category, cached_in_parent_method",
... "cdef class MyElement(Element):",
... "    cdef object x",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def __cmp__(self, other):",
... "        return cmp(self.x, (<MyElement>other).x)",
... "cdef class MyParent(Parent): pass",
... "class MyCategory(Category):",
... "    def super_categories(self):",
... "        return [Objects()]",
... "    class ElementMethods:",
... "        @cached_in_parent_method",
... "        def test_cache(self):",
... "            return -self"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: e1 = MyElement(P,5)
sage: e2 = MyElement(P,5)
sage: e3 = MyElement(P,6)

We verify that attribute assignment does not work:

sage: e1.bla = 1
Traceback (most recent call last):
...
AttributeError: '...MyElement' object has no attribute 'bla'
sage: P.bla = 1
Traceback (most recent call last):
...
AttributeError: '...MyParent' object has no attribute 'bla'

Nevertheless, the cached method works, and it returns identical output for equal elements, as expected:

sage: e1.test_cache()
<-5>
sage: e1 is e2
False
sage: e1 == e2
True
sage: e2.test_cache() is e1.test_cache()
True
sage: e1 == e3
False
sage: e2.test_cache() == e3.test_cache()
False

Back to the discussion about attributes: It is due to the __cached_methods attribute of P that the cache does not break in the example. It is not needed that the elements have that attribute.

Depends on #9976

Changed 9 years ago by SimonKing

Allow the use of cached_in_parent methods in the category framework

comment:28 follow-up: Changed 9 years ago by novoselt

I think that the current behaviour of the patch regarding cache storage is just fine. Getting more memory tends to be much easier and cheaper than getting a faster CPU.

I was avoiding using @cached_method before due to its issues with speed and documentation, but have started using it recently and it is so convenient! The more general, convenient, and well-behaved these decorators are, the more developers will adapt them.

While in principle it is not too much of a hassle to insert an extra attribute into your new class, it is quite annoying to always remember that if you decide to put a decorator on some method you have also to check if there is that extra attribute already defined or not. And as I understand, things don't break if you forget to do it, they just work slower than they should.

comment:29 in reply to: ↑ 28 ; follow-up: Changed 9 years ago by SimonKing

Hi Andrey,

Replying to novoselt:

And as I understand, things don't break if you forget to do it, they just work slower than they should.

What scenario does your statement refer to? (1) The status quo, (2) the patch as it is, or (3) the patch without sage.structure.element.Element.__cached_methods?

Here is my view on these three:

(1) Cached methods have been improved in #8611, but they are still not fast enough, for my taste. Time and memory is wasted by assigning a __doc__ string to each instance of a cached method (that is noticeable in the startup time). It can only be used in Python code. If a parent or an element does not accept attribute assignment then it can inherit a cached_method from the category -- but the cache breaks and there is a huge overhead. cached_in_parent_method can not be inherited from the category at all.

(2) cached_method can compete with a hand-written cache in Python, speed-wise, provided that attribute assignment is possible. __doc__ is now replaced by _sage_doc_, which is used by sage's introspection anyway and computes the docstring only when requested. It can, to some extent, be used in Cython code. If a parent or an element does not accept attribute assignment then it can inherit a cached_method from the category: The return values are cached ("things don't break), but there is an overhead (though much less than in (1)). The price to pay is 0.24% more memory consumption at startup. cached_in_parent_method can be inherited from the category.

(3) (Hypothetical) Cached methods are supported for elements only if either the element allows attribute assignment, or the user did not forget to provide it with a public attribute __cached_methods. If neither of these conditions hold then things behave as in (1), for elements. In particular, for an element that does not satisfy the conditions, the cache of cached_method will break and will have a huge overhead. However, cached_in_parent_method is fully supported (that is since the cache is in the parent, and parents have __cached_methods in scenario (3)).

comment:30 Changed 9 years ago by SimonKing

PS: Here is another test.

With sage-4.7.alpha5 and no patches applied:

sage: get_memory_usage()
839.96484375
sage: K = GF(next_prime(10^6))
sage: %time L = [K(i) for i in xrange(10^6)]
CPU times: user 4.00 s, sys: 0.06 s, total: 4.05 s
Wall time: 4.06 s
sage: get_memory_usage()
929.83203125
sage: 929.83203125 - 839.96484375
89.8671875000000

With sage-4.7.alpha5 and the patches applied:

sage: get_memory_usage()
841.9921875
sage: K = GF(next_prime(10^6))
sage: %time L = [K(i) for i in xrange(10^6)]
CPU times: user 4.25 s, sys: 0.03 s, total: 4.28 s
Wall time: 4.28 s
sage: get_memory_usage()
938.6953125
sage: 938.6953125 - 841.9921875
96.7031250000000

I don't know why it became slower - as I mentioned above, the element creation in GF(101) became faster. However, when many elements are created then the memory consumption increases by something like 7.5%:

sage: (96.7031250000000-89.8671875000000)/89.8671875000000
0.0760671129270625

That may be too much.

comment:31 follow-up: Changed 9 years ago by SimonKing

Apparently the different tendency of the test comes from choosing a different field size. If I return to GF(101), I get

sage-4.7.alpha5 without patches:

sage: get_memory_usage()
839.96484375
sage: K = GF(101)
sage: %time L = [K(i) for i in xrange(10^6)]
CPU times: user 2.75 s, sys: 0.00 s, total: 2.76 s
Wall time: 2.76 s
sage: get_memory_usage()
849.00390625
sage: 849.00390625-839.96484375
9.03906250000000

With the patches:

sage: get_memory_usage()
841.9921875
sage: K = GF(101)
sage: %time L = [K(i) for i in xrange(10^6)]
CPU times: user 2.66 s, sys: 0.00 s, total: 2.66 s
Wall time: 2.66 s
sage: get_memory_usage()
851.03125
sage: 851.03125-841.9921875
9.03906250000000

So, the memory consumption in that scenario does not increase at all, and there is a little speedup.

I wonder why that is different with GF(next_prime(10^6))?

comment:32 in reply to: ↑ 29 Changed 9 years ago by novoselt

Replying to SimonKing:

Hi Andrey,

Replying to novoselt:

And as I understand, things don't break if you forget to do it, they just work slower than they should.

What scenario does your statement refer to? (1) The status quo, (2) the patch as it is, or (3) the patch without sage.structure.element.Element.__cached_methods?

Hi Simon, I was referring to (3) here, although I probably phrased it ambiguously. By "don't break" I meant that, as I understand, Sage will work and produce correct results if a user forgets to add a special caching attribute. But the cache will break and the speed will be lower. So I think that (2) is a better option.

comment:33 in reply to: ↑ 31 Changed 9 years ago by novoselt

Replying to SimonKing:

I wonder why that is different with GF(next_prime(10^6))?

If I am not mistaken, creation of elements for "small" and "big" fields is somewhat different, something like all elements are created and stored somewhere for small fields, but for big ones they are created one-by-one.

Personally, I think that even 10% is fine in memory increase if it leads to an efficient and uniform caching...

comment:34 Changed 9 years ago by nbruin

Responding to http://groups.google.com/group/sage-devel/msg/0aa5aca883603f64

Most machines have a limited resource (CPU cache) which means that in general, one expects that applications with a smaller memory footprint should have a tendency to run a bit faster. Given that I would have a slight preference towards option (3), i.e., keep elements small.

These things are notoriously difficult to quantify and are very processor dependent - good implementations of linear algebra libraries have to be tuned to be cache-friendly on each architecture.

The general principle is the following: General memory access is relatively slow, so CPUs have a cache (nowadays several levels) of fast memory so that repeated access to the same memory locations doesn't have to go all the way to main memory and can be done faster.

This works because memory usage tends to be temporally and spatially local. (temporal: the same memory locations tend to be accessed frequently before moving on; spatial: programs tend to access memory locations that are close together).

Making elements bigger will reduce both: If a CPU has enough cache to store N words and an element occupies s words, then a program will run quickly if its inner loop refers to at most N/s elements. As soon as it refers to more, the cache starts thrashing and you'll notice a poorer performance. As you see, small s is better. Your proposal would change s to s+1 ...

The spatial component comes in because caching rarely happens with a resolution of single words (because the overhead of storing which memory locations are cached). That's why it's good to have related data stored in consecutive memory. Increasing s also reduces that ... (apart from the fact that Python does not give us control over location of allocation anyway, but we can hope Python does this sanely).

Again, this is notoriously hard to quantify, but if you fix the problem and the architecture one can usually recognise drops in performance as data size increases, due to the different levels of cache starting to thrash. So, in general, small elements are better, both for memory use and for performance.

comment:35 follow-up: Changed 9 years ago by nbruin

Just to make sure that the CPU cache effect is real, I tried a little example

%cython
cpdef test(int nelts,int spacing,int repetitions):
    cdef int i,j,r
    cdef int *L
    cdef int k
    L=<int*>malloc(sizeof(int)*nelts*spacing)
    for r in xrange(repetitions):
        k=0
        for j in xrange(nelts):
            L[k]=L[k]+1
            k += spacing
    free(L)

The following program times incrementing memory cells 2^30 times, while varying the number of cells (nelts) and their spacing in memory (spacing).

for j in xrange(0,9):
    for i in xrange(1,29-j):
        tstart=os.times()[0]
        test(2**i,2**j,2**(30-i))
        tstop=os.times()[0]
        print [i,j,30-i,tstop-tstart]

The results are striking. Execution times vary by more than a factor 20 and one can see very definite thresholds where the execution time jumps. An excerpt from the output:

[1, 0, 29, 1.8100000000000001]
[2, 0, 28, 1.5999999999999996]
[3, 0, 27, 1.4800000000000004]
[4, 0, 26, 1.7299999999999995]
[5, 0, 25, 1.5400000000000009]
[6, 0, 24, 1.4599999999999991]
[7, 0, 23, 1.4100000000000001]
[8, 0, 22, 1.3800000000000008]
[9, 0, 21, 1.3699999999999992]
...
[16, 0, 14, 1.3399999999999999]
[17, 0, 13, 1.3399999999999999]
[18, 0, 12, 2.1700000000000017]
[19, 0, 11, 2.6799999999999997]
[20, 0, 10, 2.7299999999999969]
[21, 0, 9, 2.740000000000002]
...
[28, 0, 2, 2.8500000000000014]

[9, 1, 21, 1.3599999999999994]
...
[16, 1, 14, 1.3499999999999943]
[17, 1, 13, 2.9500000000000028]
[18, 1, 12, 5.1700000000000017]
[19, 1, 11, 5.2199999999999989]

[12, 2, 18, 1.3599999999999852]
[13, 2, 17, 1.539999999999992]
[14, 2, 16, 1.5500000000000114]
[15, 2, 15, 1.539999999999992]
[16, 2, 14, 4.8200000000000216]
[17, 2, 13, 9.2099999999999795]

[11, 3, 19, 1.3500000000000227]
[12, 3, 18, 3.2199999999999704]
[13, 3, 17, 3.2200000000000273]
[14, 3, 16, 3.2199999999999704]
[15, 3, 15, 8.7700000000000387]
[16, 3, 14, 17.019999999999982]

[10, 4, 20, 1.3700000000000045]
[11, 4, 19, 6.1399999999999864]
[12, 4, 18, 6.1400000000000432]
[13, 4, 17, 6.1499999999999773]
[14, 4, 16, 15.779999999999973]
[15, 4, 15, 32.220000000000027]
[16, 4, 14, 33.379999999999995]
[17, 4, 13, 33.879999999999995]
[18, 4, 12, 34.210000000000036]
[19, 4, 11, 34.439999999999941]

[9, 5, 21, 1.3799999999999955]
[10, 5, 20, 5.1399999999999864]
[11, 5, 19, 5.1599999999999682]
[12, 5, 18, 5.0099999999999909]
[13, 5, 17, 14.540000000000077]
[14, 5, 16, 31.549999999999955]
[15, 5, 15, 32.299999999999955]

[7, 6, 23, 1.3999999999998636]
[8, 6, 22, 1.4100000000000819]
[9, 6, 21, 5.3800000000001091]
[10, 6, 20, 5.3899999999998727]
[11, 6, 19, 5.3900000000001]
[12, 6, 18, 14.480000000000018]
[13, 6, 17, 31.399999999999864]

[6, 7, 24, 1.4399999999998272]
[7, 7, 23, 1.4800000000000182]
[8, 7, 22, 5.4300000000000637]
[9, 7, 21, 5.4100000000000819]
[10, 7, 20, 5.3999999999998636]
[11, 7, 19, 14.960000000000036]
[12, 7, 18, 31.470000000000027]
[13, 7, 17, 33.029999999999973]
[14, 7, 16, 33.600000000000136]

[5, 8, 25, 1.5399999999999636]
[6, 8, 24, 1.6100000000001273]
[7, 8, 23, 6.0899999999999181]
[8, 8, 22, 5.8699999999998909]
[9, 8, 21, 5.7300000000000182]
[10, 8, 20, 22.1700000000003]
[11, 8, 19, 41.649999999999636]
[12, 8, 18, 64.5300000000002]
[13, 8, 17, 69.190000000000055]
[14, 8, 16, 73.199999999999818]
[15, 8, 15, 74.820000000000164]
[16, 8, 14, 69.440000000000055]
[17, 8, 13, 69.579999999999927]
[18, 8, 12, 70.349999999999909]
[19, 8, 11, 69.730000000000018]
[20, 8, 10, 77.869999999999891]

comment:36 in reply to: ↑ 35 Changed 9 years ago by SimonKing

Hi Nils,

Replying to nbruin:

Just to make sure that the CPU cache effect is real, I tried a little example

Thank you for your clear explanation! I am currently testing some variations of my patch: Both for parents and for elements, one may or may not have __cached_methods - if one has, then it can always inherit cached methods from the category framework. And if one has the attribute, then one may or may not search it in the __getattr__ method - if one does then the speed penalty for missing attribute assignment is reduced.

I can already confirm that the timings are affected by the presence of the attribute. But I first need to finish my tests.

comment:37 Changed 9 years ago by SimonKing

  • Status changed from needs_review to needs_work

My tests seem to indicate that (at least on my CPU) __cached_methods for parents does not hurt, but __cached_methods for elements can yield a slow-down. Changes in the __getattr__ methods of parents or elements did not yield a significant slow-down.

So, I think it is a good idea to have __cached_methods for parents. It makes me wonder whether such attribute could serve an additional purpose: If the method resolution order does not find an attribute of a parent then it is searched in the category, by getattr_from_other_class. That takes a lot of time. So, shouldn't there be a shortpath so that looking up the same attribute becomes faster the second time?

That's to say: If a parent P inherits any attribute foo from the category then I suggest that P.foo should be stored in P.__cached_methods["foo"]. __getattr__ needs to search P.__cached_methods anyway, and a lookup in a dictionary should be faster than calling getattr_from_other_class.

Concerning elements, it meanwhile seems better to not have __cached_methods by default. But it should be possible that it is provided by the user, so that cached methods inherited from the category can be stored there. The __getattr__ of elements will test whether there is __cached_methods and search there.

So far, my tests indicate that the above approach is fine, speed-wise and memory-wise. My test is:

a = get_memory_usage()
K = GF(2)
%time L = [K(i) for i in xrange(10^7)]
get_memory_usage() - a
a = get_memory_usage()
K = GF(101)
%time L = [K(i) for i in xrange(10^7)]
get_memory_usage() - a
a = get_memory_usage()
K = GF(next_prime(10^6))
%time L = [K(i) for i in xrange(10^7)]
get_memory_usage() - a

So, it is formed by three scenarios. I first give the user CPU time and the memory consumption for sage-4.7.alpha5 plus the patches from #9976:

25.28 s, 78.49609375
25.11 s, 0.0 # so, GF(101) needs the same amount of memory than for GF(2)
88.20 s, 794.9375

Here are the same data for the current patches (i.e., with __cached_methods for both parents and elements):

25.73 s, 78.49609375
25.11 s, 0.0
92.89 s, 864.21484375

And here are the same data for a (not yet posted) patch that implements the ideas sketched above:

25.33 s, 78.49609375
24.49 s, 0.0
87.13 s, 794.93359375

I'll first catch some sleep, then I will add documentation about the use of __cached_methods, and of course I also need to run doctests.

comment:38 Changed 9 years ago by SimonKing

I just noticed that the additional attribute for parents could actually serve a third purpose: Providing lazy attributes for parent extension classes!

Namely, lazy attributes rely on the possibility to assign attributes. At least in the case of parent classes, if attribute assignment is impossible, they could do the next best thing and store stuff in __cached_methods!

comment:39 Changed 9 years ago by SimonKing

At the moment, it seems to me that the length of doc strings of cached_in_parent_method has a considerable influence (namely almost 4%) on the computation time, in some tests that actually do not involve a cached_in_parent_method. Frustrating. Can that really be? Should that be a reason to avoid documentation???

comment:40 Changed 9 years ago by SimonKing

Before I can submit a new patch, I must remove redundant tests from the documentation. It seems that if I make the documentation a little bit too long then the element creation tests become slower by 4%.

comment:41 Changed 9 years ago by SimonKing

  • Dependencies set to #9976
  • Description modified (diff)
  • Status changed from needs_work to needs_review

I have replaced the previous patches by a single patch. I am sorry in advance for the long post below.

It was indicated in previous posts that there was a certain danger of slowing down object creation. Also, one may think of startuptime. It seems to me that it is now solved.

I announced that I would like to use the new __cached_methods attribute of parents to store attributes that are obtained by slow attribute lookup in the category framework, and to make lazy attributes work on all parents.

Here are examples. I compare sage-4.7.alpha5 plus #9976 (referred to by "without patch") with sage-4.7.alpha5 plus #9976 plus the patch from here (referred to by "with patch").

1. Startuptime With patch:

...
== Slowest (including children) ==
1.155 sage.all (None)
0.275 sage.schemes.all (sage.all)
0.232 sage.misc.all (sage.all)
0.158 elliptic_curves.all (sage.schemes.all)
0.155 ell_rational_field (elliptic_curves.all)
...

Without patch:

...
== Slowest (including children) ==
1.229 sage.all (None)
0.278 sage.schemes.all (sage.all)
0.241 sage.misc.all (sage.all)
0.161 elliptic_curves.all (sage.schemes.all)
0.159 ell_rational_field (elliptic_curves.all)
...

=> no slow down.

2. Element creation

This is with patch. I indicate the corresponding timings without patch in the comments:

sage: a = get_memory_usage()
sage: K = GF(2)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.23 s, sys: 0.02 s, total: 25.25 s
Wall time: 25.26 s  # without patch: 25.52 s
sage: get_memory_usage() - a
78.49609375   # same as without patch
sage: a = get_memory_usage()
sage: K = GF(101)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.13 s, sys: 0.03 s, total: 25.16 s
Wall time: 25.16 s  # without patch: 26.20 s
sage: get_memory_usage() - a
0.0
sage: a = get_memory_usage()
sage: K = GF(next_prime(10^6))
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 87.40 s, sys: 0.24 s, total: 87.63 s
Wall time: 87.64 s   # without patch: 88.87 s
sage: get_memory_usage() - a
794.94140625   # without patch: 794.94921875

=> no slow down.

3. Cached methods in Python code

Here, I consider existing code, namely gens() and groebner_basis() for multivariate polynomial ideals. Again, the timings are with patch, the old timings are in comments

sage: P.<x,y> = QQ[]
sage: I = P*[x,y]
sage: timeit('I.gens()',number=10^7)
10000000 loops, best of 3: 142 ns per loop
# Without patch:
# 10000000 loops, best of 3: 703 ns per loop
sage: timeit('I.groebner_basis()',number=10^7)
10000000 loops, best of 3: 249 ns per loop
# Without patch:
# 10000000 loops, best of 3: 746 ns per loop

4. Accessing parent attributes that are inherited from the category, if the element class of the category does not appear in the method resolution order

It may happen that a parent is not an instance of the parent class of its category. That can be for two reasons: Either the category is not properly initialised (it is partially taken care of in #9944 and #9138), or it is a Cython class (if I remember correctly, they don't play well with dynamic classes). Then, attribute lookup must go beyond the usual method resolution order, and that's very slow. So, I suggest to use a shortpath in that case:

sage: P.<x,y> = QQ[]
sage: P._is_category_initialized()
False
sage: P._init_category_(Algebras(QQ))
sage: P.sum
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in x, y over Rational Field>
sage: P.sum.__module__
'sage.categories.commutative_additive_monoids'
sage: timeit('P.sum', number=10^6)
1000000 loops, best of 3: 755 ns per loop
# Without patch
# 1000000 loops, best of 3: 3.76 µs per loop
sage: P.__cached_methods['sum']
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in x, y over Rational Field>

5. Cython: Lazy attributes and cached methods inherit from the category

The following would not work at all without the patch.

We define a category (written in Cython) with some cached methods.

sage: cython_code = ["from sage.all import cached_method, cached_in_parent_method, Category",
... "class MyCategory(Category):",
... "    @cached_method",
... "    def super_categories(self):",
... "        return [Objects()]",
... "    class ElementMethods:",
... "        @cached_method",
... "        def element_cache_test(self):",
... "            return -self",
... "        @cached_in_parent_method",
... "        def element_via_parent_test(self):",
... "            return -self",
... "    class ParentMethods:",
... "        @cached_method",
... "        def one(self):",
... "            return self.element_class(self,1)",
... "        @cached_method",
... "        def invert(self, x):",
... "            return -x"]
sage: cython('\n'.join(cython_code))

Case 1

Define elements and parents as Python classes (but in Cython code), so that attribute assignment is possible. That's the easy case.

sage: cython_code = ["from sage.structure.element cimport Element",
... "class MyElement(Element):",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "from sage.structure.parent cimport Parent",
... "class MyParent(Parent):",
... "    Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: e = MyElement(P,5)

Cached method for the parent:

# Cache without arguments
sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 210 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 815 ns per loop

Cached methods for elements (one with cache in the element, the other with cache in the parent):

# Cache without arguments
sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 173 ns per loop

# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 574 ns per loop

# Comparison with element creation:
sage: e.raw_test()
<-5>
sage: e.raw_test() is e.raw_test()
False
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 1.57 µs per loop

Case 2

We use Cython classes for which attribute assignment is impossible. This is no problem at all, for the parent class. For the element class, we need to provide an additional public attribute __cached_methods, or things partially break:

sage: cython_code = ["from sage.structure.element cimport Element",
... "cdef class MyBrokenElement(Element):",
... "    cdef object x",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "cdef class MyElement(Element):",
... "    cdef public dict __cached_methods",
... "    cdef object x",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "from sage.structure.parent cimport Parent",
... "cdef class MyParent(Parent):",
... "    Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: ebroken = MyBrokenElement(P,5)
sage: e = MyElement(P,5)

Every parent should have a lazy attribute element_class -- but so far, it used to fail on extension classes. That problem is solved by the patch:

sage: P.element_class
<type '_mnt_local_king__sage_temp_mpc622_29604_tmp_2_spyx_0.MyElement'>

If attribute assignment is impossible then the cache (for parents) still works, but gets a bit slower. Without the patch, the cache would break, and even a working cache would be slower.

sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 685 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 1.06 µs per loop

We now consider the broken cache for elements. It turns out that the cached method for the "broken" element class can still be called, but is not cached. However, the cached-in-parent method is cached, but is terribly slow:

sage: ebroken.element_cache_test()
<-5>
# This is broken:
sage: ebroken.element_cache_test() is ebroken.element_cache_test()
False
sage: ebroken.element_via_parent_test()
<-5>
# This is not broken:
sage: ebroken.element_via_parent_test() is ebroken.element_via_parent_test()
True
# But it is very very slow:
sage: timeit('ebroken.element_via_parent_test()')
625 loops, best of 3: 31.2 µs per loop

However, the simple addition of a public dict __cached_methods make the cache work nicely (even though attribute assignment is still faster):

sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 921 ns per loop
# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 1.13 µs per loop
# Comparison with element creation
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 696 ns per loop

6. Clear cache on pickle

There was a special class ClearCacheOnPickle in the cachfunc module. However, it was not sufficiently documented (and not provided with tests), and in fact it was broken. I made it work, and this is one is taken from the doc strings:

    In the following example, we create a Python class that inherits
    from multivariate polynomial ideals, but does not pickle cached
    values.  We provide the definition in Cython, however, since
    interactive Cython definitions provide introspection by trac
    ticket #9976, whereas Python definitions don't.
    ::

        sage: P.<a,b,c,d> = QQ[]
        sage: I = P*[a,b]
        sage: classdef = ['from sage.misc.cachefunc import ClearCacheOnPickle',
        ...    'from sage.all import QQ',
        ...    'P = QQ["a","b","c","d"]; I = P*[P.gen(0),P.gen(1)]',
        ...    'class MyClass(ClearCacheOnPickle,I.__class__):',
        ...    '    def __init__(self,ring,gens):',
        ...    '        I.__class__.__init__(self,ring,gens)',
        ...    '    def __getnewargs__(self):',
        ...    '        return (self._Ideal_generic__ring,self._Ideal_generic__gens)']
        sage: cython('\n'.join(classdef))

    We destroy the cache of two methods of ``I`` on purpose
    (demonstrating that the two different implementations of cached
    methods are correctly dealt with).  Pickling ``I`` preserves the
    cache::

        sage: I.gens.set_cache('bar')
        sage: I.groebner_basis.set_cache('foo',algorithm='singular')
        sage: J = loads(dumps(I))
        sage: J.gens()
        'bar'
        sage: J.groebner_basis(algorithm='singular')
        'foo'

    However, if we have an ideal that additionally descends from
    :class:`ClearCacheOnPickle`, the carefully corrupted cache is not
    pickled::

        sage: A = MyClass(P,[a,b])
        sage: A
        Ideal (a, b) of Multivariate Polynomial Ring in a, b, c, d over Rational Field
        sage: A.gens.set_cache('foo')
        sage: A.groebner_basis.set_cache('bar',algorithm='singular')
        sage: A.gens()
        'foo'
        sage: A.groebner_basis(algorithm='singular')
        'bar'
        sage: B = loads(dumps(A))
        sage: B.gens()
        [a, b]
        sage: B.groebner_basis(algorithm='singular')
        [a, b]
        sage: A.gens()
        'foo'

Doctests pass for me. So, I guess it is in need of review, again!

For the patchbot:

Depends on #9976

Apply trac11115-cached_cython.patch

Note: See TracTickets for help on using tickets.