Changes between Version 4 and Version 41 of Ticket #11115


Ignore:
Timestamp:
05/03/11 00:29:18 (3 years ago)
Author:
SimonKing
Comment:

I have replaced the previous patches by a single patch. I am sorry in advance for the long post below.

It was indicated in previous posts that there was a certain danger of slowing down object creation. Also, one may think of startuptime. It seems to me that it is now solved.

I announced that I would like to use the new __cached_methods attribute of parents to store attributes that are obtained by slow attribute lookup in the category framework, and to make lazy attributes work on all parents.

Here are examples. I compare sage-4.7.alpha5 plus #9976 (referred to by "without patch") with sage-4.7.alpha5 plus #9976 plus the patch from here (referred to by "with patch").

1. Startuptime With patch:

...
== Slowest (including children) ==
1.155 sage.all (None)
0.275 sage.schemes.all (sage.all)
0.232 sage.misc.all (sage.all)
0.158 elliptic_curves.all (sage.schemes.all)
0.155 ell_rational_field (elliptic_curves.all)
...

Without patch:

...
== Slowest (including children) ==
1.229 sage.all (None)
0.278 sage.schemes.all (sage.all)
0.241 sage.misc.all (sage.all)
0.161 elliptic_curves.all (sage.schemes.all)
0.159 ell_rational_field (elliptic_curves.all)
...

=> no slow down.

2. Element creation

This is with patch. I indicate the corresponding timings without patch in the comments:

sage: a = get_memory_usage()
sage: K = GF(2)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.23 s, sys: 0.02 s, total: 25.25 s
Wall time: 25.26 s  # without patch: 25.52 s
sage: get_memory_usage() - a
78.49609375   # same as without patch
sage: a = get_memory_usage()
sage: K = GF(101)
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 25.13 s, sys: 0.03 s, total: 25.16 s
Wall time: 25.16 s  # without patch: 26.20 s
sage: get_memory_usage() - a
0.0
sage: a = get_memory_usage()
sage: K = GF(next_prime(10^6))
sage: %time L = [K(i) for i in xrange(10^7)]
CPU times: user 87.40 s, sys: 0.24 s, total: 87.63 s
Wall time: 87.64 s   # without patch: 88.87 s
sage: get_memory_usage() - a
794.94140625   # without patch: 794.94921875

=> no slow down.

3. Cached methods in Python code

Here, I consider existing code, namely gens() and groebner_basis() for multivariate polynomial ideals. Again, the timings are with patch, the old timings are in comments

sage: P.<x,y> = QQ[]
sage: I = P*[x,y]
sage: timeit('I.gens()',number=10^7)
10000000 loops, best of 3: 142 ns per loop
# Without patch:
# 10000000 loops, best of 3: 703 ns per loop
sage: timeit('I.groebner_basis()',number=10^7)
10000000 loops, best of 3: 249 ns per loop
# Without patch:
# 10000000 loops, best of 3: 746 ns per loop

4. Accessing parent attributes that are inherited from the category, if the element class of the category does not appear in the method resolution order

It may happen that a parent is not an instance of the parent class of its category. That can be for two reasons: Either the category is not properly initialised (it is partially taken care of in #9944 and #9138), or it is a Cython class (if I remember correctly, they don't play well with dynamic classes). Then, attribute lookup must go beyond the usual method resolution order, and that's very slow. So, I suggest to use a shortpath in that case:

sage: P.<x,y> = QQ[]
sage: P._is_category_initialized()
False
sage: P._init_category_(Algebras(QQ))
sage: P.sum
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in x, y over Rational Field>
sage: P.sum.__module__
'sage.categories.commutative_additive_monoids'
sage: timeit('P.sum', number=10^6)
1000000 loops, best of 3: 755 ns per loop
# Without patch
# 1000000 loops, best of 3: 3.76 µs per loop
sage: P.__cached_methods['sum']
<bound method Algebras.parent_class.sum of Multivariate Polynomial Ring in x, y over Rational Field>

5. Cython: Lazy attributes and cached methods inherit from the category

The following would not work at all without the patch.

We define a category (written in Cython) with some cached methods.

sage: cython_code = ["from sage.all import cached_method, cached_in_parent_method, Category",
... "class MyCategory(Category):",
... "    @cached_method",
... "    def super_categories(self):",
... "        return [Objects()]",
... "    class ElementMethods:",
... "        @cached_method",
... "        def element_cache_test(self):",
... "            return -self",
... "        @cached_in_parent_method",
... "        def element_via_parent_test(self):",
... "            return -self",
... "    class ParentMethods:",
... "        @cached_method",
... "        def one(self):",
... "            return self.element_class(self,1)",
... "        @cached_method",
... "        def invert(self, x):",
... "            return -x"]
sage: cython('\n'.join(cython_code))

Case 1

Define elements and parents as Python classes (but in Cython code), so that attribute assignment is possible. That's the easy case.

sage: cython_code = ["from sage.structure.element cimport Element",
... "class MyElement(Element):",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "from sage.structure.parent cimport Parent",
... "class MyParent(Parent):",
... "    Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: e = MyElement(P,5)

Cached method for the parent:

# Cache without arguments
sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 210 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 815 ns per loop

Cached methods for elements (one with cache in the element, the other with cache in the parent):

# Cache without arguments
sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 173 ns per loop

# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 574 ns per loop

# Comparison with element creation:
sage: e.raw_test()
<-5>
sage: e.raw_test() is e.raw_test()
False
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 1.57 µs per loop

Case 2

We use Cython classes for which attribute assignment is impossible. This is no problem at all, for the parent class. For the element class, we need to provide an additional public attribute __cached_methods, or things partially break:

sage: cython_code = ["from sage.structure.element cimport Element",
... "cdef class MyBrokenElement(Element):",
... "    cdef object x",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "cdef class MyElement(Element):",
... "    cdef public dict __cached_methods",
... "    cdef object x",
... "    def __init__(self,P,x):",
... "        self.x=x",
... "        Element.__init__(self,P)",
... "    def __neg__(self):",
... "        return MyElement(self.parent(),-self.x)",
... "    def _repr_(self):",
... "        return '<%s>'%self.x",
... "    def __hash__(self):",
... "        return hash(self.x)",
... "    def raw_test(self):",
... "        return -self",
... "from sage.structure.parent cimport Parent",
... "cdef class MyParent(Parent):",
... "    Element = MyElement"]
sage: cython('\n'.join(cython_code))
sage: C = MyCategory()
sage: P = MyParent(category=C)
sage: ebroken = MyBrokenElement(P,5)
sage: e = MyElement(P,5)

Every parent should have a lazy attribute element_class -- but so far, it used to fail on extension classes. That problem is solved by the patch:

sage: P.element_class
<type '_mnt_local_king__sage_temp_mpc622_29604_tmp_2_spyx_0.MyElement'>

If attribute assignment is impossible then the cache (for parents) still works, but gets a bit slower. Without the patch, the cache would break, and even a working cache would be slower.

sage: P.one()
<1>
sage: P.one() is P.one()
True
sage: timeit('P.one()', number=10^6)
1000000 loops, best of 3: 685 ns per loop
# Cache with arguments
sage: P.invert(e)
<-5>
sage: P.invert(e) is P.invert(e)
True
sage: timeit('P.invert(e)', number=10^6)
1000000 loops, best of 3: 1.06 µs per loop

We now consider the broken cache for elements. It turns out that the cached method for the "broken" element class can still be called, but is not cached. However, the cached-in-parent method is cached, but is terribly slow:

sage: ebroken.element_cache_test()
<-5>
# This is broken:
sage: ebroken.element_cache_test() is ebroken.element_cache_test()
False
sage: ebroken.element_via_parent_test()
<-5>
# This is not broken:
sage: ebroken.element_via_parent_test() is ebroken.element_via_parent_test()
True
# But it is very very slow:
sage: timeit('ebroken.element_via_parent_test()')
625 loops, best of 3: 31.2 µs per loop

However, the simple addition of a public dict __cached_methods make the cache work nicely (even though attribute assignment is still faster):

sage: e.element_cache_test()
<-5>
sage: e.element_cache_test() is e.element_cache_test()
True
sage: timeit('e.element_cache_test()', number=10^6)
1000000 loops, best of 3: 921 ns per loop
# Cached in parent
sage: e.element_via_parent_test()
<-5>
sage: e.element_via_parent_test() is e.element_via_parent_test()
True
sage: timeit('e.element_via_parent_test()', number=10^6)
1000000 loops, best of 3: 1.13 µs per loop
# Comparison with element creation
sage: timeit('e.raw_test()', number=10^6)
1000000 loops, best of 3: 696 ns per loop

6. Clear cache on pickle

There was a special class ClearCacheOnPickle in the cachfunc module. However, it was not sufficiently documented (and not provided with tests), and in fact it was broken. I made it work, and this is one is taken from the doc strings:

    In the following example, we create a Python class that inherits
    from multivariate polynomial ideals, but does not pickle cached
    values.  We provide the definition in Cython, however, since
    interactive Cython definitions provide introspection by trac
    ticket #9976, whereas Python definitions don't.
    ::

        sage: P.<a,b,c,d> = QQ[]
        sage: I = P*[a,b]
        sage: classdef = ['from sage.misc.cachefunc import ClearCacheOnPickle',
        ...    'from sage.all import QQ',
        ...    'P = QQ["a","b","c","d"]; I = P*[P.gen(0),P.gen(1)]',
        ...    'class MyClass(ClearCacheOnPickle,I.__class__):',
        ...    '    def __init__(self,ring,gens):',
        ...    '        I.__class__.__init__(self,ring,gens)',
        ...    '    def __getnewargs__(self):',
        ...    '        return (self._Ideal_generic__ring,self._Ideal_generic__gens)']
        sage: cython('\n'.join(classdef))

    We destroy the cache of two methods of ``I`` on purpose
    (demonstrating that the two different implementations of cached
    methods are correctly dealt with).  Pickling ``I`` preserves the
    cache::

        sage: I.gens.set_cache('bar')
        sage: I.groebner_basis.set_cache('foo',algorithm='singular')
        sage: J = loads(dumps(I))
        sage: J.gens()
        'bar'
        sage: J.groebner_basis(algorithm='singular')
        'foo'

    However, if we have an ideal that additionally descends from
    :class:`ClearCacheOnPickle`, the carefully corrupted cache is not
    pickled::

        sage: A = MyClass(P,[a,b])
        sage: A
        Ideal (a, b) of Multivariate Polynomial Ring in a, b, c, d over Rational Field
        sage: A.gens.set_cache('foo')
        sage: A.groebner_basis.set_cache('bar',algorithm='singular')
        sage: A.gens()
        'foo'
        sage: A.groebner_basis(algorithm='singular')
        'bar'
        sage: B = loads(dumps(A))
        sage: B.gens()
        [a, b]
        sage: B.groebner_basis(algorithm='singular')
        [a, b]
        sage: A.gens()
        'foo'

Doctests pass for me. So, I guess it is in need of review, again!

For the patchbot:

Depends on #9976

Apply trac11115-cached_cython.patch

Legend:

Unmodified
Added
Removed
Modified