Opened 6 years ago

Last modified 4 years ago

#14214 needs_work enhancement

Cythoned homsets

Reported by: SimonKing
Owned by: tbd
Priority: major
Milestone: sage-6.4
Component: performance
Keywords: Hom, cython, cached method
Cc: nthiery
Merged in:
Authors: Simon King
Reviewers:
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies: #14279 #18758
Stopgaps:

Description (last modified by SimonKing)

Homsets arise whenever a coercion map or a conversion map is involved. Hence, homsets are ubiquitous in Sage and should thus be as fast as possible.

Therefore I suggest changing sage.categories.homset into a Cython module. A clean solution seems out of reach at the moment (see the comments), so I just provide a couple of speed-ups here.

Apply

  • trac_14214-cython_homset.patch
  • trac_14214-backup_cache.patch

Attachments (2)

trac_14214-backup_cache.patch (2.8 KB) - added by SimonKing 6 years ago.
Cache for Hom with default category
trac_14214-cython_homset.patch (5.8 KB) - added by SimonKing 6 years ago.


Change History (77)

comment:1 follow-up: Changed 6 years ago by SimonKing

Ideally, sage.categories.homset.Homset should be a cdef class, with cdef attributes _domain, _codomain, __category.

Problem and potential long term solutions

sage.modular.abvar.homspace.EndomorphismSubring inherits from sage.categories.homset.Homset and from sage.rings.ring.Ring. Making Homset cdef with cdef attributes results in incompatible base classes.

  • One could think of creating a new cdef class sage.categories.homset.EndomorphismRing inheriting from sage.rings.ring.Ring and copying the methods implemented in sage.categories.homset.Homset, and using it in sage.modular.abvar.homspace.
    • However, that's code duplication, and I doubt that we want that.
  • One could think of dropping sage.rings.ring.Ring and let EndomorphismSubring entirely rely on the ring functionality that is provided by the category framework.
    • I tried, but this means opening a can of worms. One would then also need to implement the new coercion model in sage.modular. But replacing EndomorphismSubring.__call__ by an _element_constructor_ does not work, because Homset has a __call__ method, too. Hence, we would then also need to use the new coercion model for sage.categories.homset. Yes, that's desirable. But it is not achievable in the short term.
  • One could try to make some important basic methods of Homset fast by inheriting from WithEqualityById, which I introduced in #14054 (see the sketch after this list). It would be desirable that two homsets are equal if and only if they are identical, which in turn should hold if and only if domain, codomain and category are all identical.
    • Unfortunately, the old parts of sage (about elliptic curves and stuff) rely on the assumption that we do not have unique parents and do not have unique homsets. This would be another can of worms.
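
A rough sketch of the WithEqualityById idea (illustration only; the class IdHomset and its attributes are made up for this example, and the import assumes the class introduced in #14054 lives in sage.misc.fast_methods):

from sage.misc.fast_methods import WithEqualityById

class IdHomset(WithEqualityById):
    # toy class, not the proposed implementation: __eq__ and __hash__ are
    # inherited from WithEqualityById and work by identity
    def __init__(self, domain, codomain):
        self._domain = domain
        self._codomain = codomain

H1 = IdHomset(ZZ, QQ)
H2 = IdHomset(ZZ, QQ)
H1 == H1    # True: comparison is by identity
H1 == H2    # False, although the construction data are equal
hash(H1)    # id-based, hence very fast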

Short term improvements suggested in my patch

  • Changing sage.categories.homset into a Cython module means: We can cdef the homset cache, and can thus use fast get/set methods for TripleDict. That's a speed-up.
  • One can make Homset cdef, but keep the _domain and _codomain attributes Python attributes. Then, there are no layout conflicts.
  • I suggest changing __category into a single-underscore attribute, to avoid name mangling.
  • Accessing a Python attribute _domain defined for a Homset can be surprisingly slow. It is fast if one has an instance of Homset itself, but slow for an instance of a sub-class such as EndomorphismSubring. It turns out that calling a cached method is faster than accessing the attribute! Hence, I suggest using cached method decorators on the domain() and codomain() methods (these methods are internally used in EndomorphismSubring anyway); a sketch follows below.
  • As a step towards uniqueness, I suggest that Homset.__cmp__ tests by identity rather than equality, and uses cmp only on non-identical domain/codomain/category.

Making Homset cdef requires putting Set_generic into sage.structure.parent.pxd.
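
For the cached-method suggestion above, a rough sketch (illustrative only, not the patched Homset; the class name is made up):

from sage.misc.cachefunc import cached_method

class HomsetSketch(object):
    def __init__(self, X, Y):
        # fill the caches once; afterwards domain()/codomain() are cheap lookups
        self.domain.set_cache(X)
        self.codomain.set_cache(Y)

    @cached_method
    def domain(self):
        pass

    @cached_method
    def codomain(self):
        pass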

Timings will be in the next post.

comment:2 follow-up: Changed 6 years ago by SimonKing

  • Status changed from new to needs_review

Timings with the patch:

sage: E = J0(11)
sage: F = J0(22)
sage: H = Hom(ZZ,QQ)
sage: HE = Hom(E,E)
sage: HF = Hom(E,F)
sage: D = {H:1,HE:2,HF:3}

sage: timeit("Hom(E,E)", number = 10^4)
10000 loops, best of 3: 9.25 µs per loop
sage: timeit("Hom(E,F)", number = 10^4)
10000 loops, best of 3: 122 µs per loop
sage: timeit("Hom(ZZ,QQ)", number = 10^4)
10000 loops, best of 3: 12 µs per loop

sage: timeit("hash(HE)", number = 10^6)
1000000 loops, best of 3: 2.41 µs per loop
sage: timeit("hash(H)", number = 10^6)
1000000 loops, best of 3: 1.46 µs per loop
sage: timeit("hash(HF)", number = 10^6)
1000000 loops, best of 3: 1.49 µs per loop

sage: timeit("H in D", number = 10^6)
1000000 loops, best of 3: 1.48 µs per loop
sage: timeit("HE in D", number = 10^6)
1000000 loops, best of 3: 2.38 µs per loop
sage: timeit("HF in D", number = 10^6)
1000000 loops, best of 3: 1.48 µs per loop

sage: timeit("HF.domain()", number = 10^6)
1000000 loops, best of 3: 392 ns per loop
sage: timeit("HE.domain()", number = 10^6)
1000000 loops, best of 3: 415 ns per loop
sage: timeit("H.domain()", number = 10^6)
1000000 loops, best of 3: 356 ns per loop
sage: timeit("HF._domain", number = 10^6)
1000000 loops, best of 3: 524 ns per loop
sage: timeit("HE._domain", number = 10^6)
1000000 loops, best of 3: 881 ns per loop
sage: timeit("H._domain", number = 10^6)
1000000 loops, best of 3: 502 ns per loop

sage: %timeit TestSuite(H).run()
100 loops, best of 3: 7.01 ms per loop
sage: %timeit TestSuite(HE).run(skip=['_test_elements'])
10 loops, best of 3: 38.8 ms per loop
sage: %timeit TestSuite(HF).run(skip=['_test_elements'])
10 loops, best of 3: 91.3 ms per loop

Timings without the patch:

sage: timeit("hash(HE)", number = 10^6)
1000000 loops, best of 3: 3.97 µs per loop
sage: timeit("hash(H)", number = 10^6)
1000000 loops, best of 3: 2.74 µs per loop
sage: timeit("hash(HF)", number = 10^6)
1000000 loops, best of 3: 2.7 µs per loop

sage: timeit("Hom(E,E)", number = 10^4)
10000 loops, best of 3: 12.9 µs per loop
sage: timeit("Hom(E,F)", number = 10^4)
10000 loops, best of 3: 132 µs per loop
sage: timeit("Hom(ZZ,QQ)", number = 10^4)
10000 loops, best of 3: 21.6 µs per loop

sage: timeit("H in D", number = 10^6)
1000000 loops, best of 3: 2.77 µs per loop
sage: timeit("HE in D", number = 10^6)
1000000 loops, best of 3: 4 µs per loop
sage: timeit("HF in D", number = 10^6)
1000000 loops, best of 3: 2.81 µs per loop

sage: timeit("HF.domain()", number = 10^6)
1000000 loops, best of 3: 1.13 µs per loop
sage: timeit("HE.domain()", number = 10^6)
1000000 loops, best of 3: 1.74 µs per loop
sage: timeit("H.domain()", number = 10^6)
1000000 loops, best of 3: 1.18 µs per loop
sage: timeit("HF._domain", number = 10^6)
1000000 loops, best of 3: 520 ns per loop
sage: timeit("HE._domain", number = 10^6)
1000000 loops, best of 3: 933 ns per loop
sage: timeit("H._domain", number = 10^6)
1000000 loops, best of 3: 522 ns per loop

sage: %timeit TestSuite(H).run()
100 loops, best of 3: 7.79 ms per loop
sage: %timeit TestSuite(HE).run(skip=['_test_elements'])
10 loops, best of 3: 42.3 ms per loop
sage: %timeit TestSuite(HF).run(skip=['_test_elements'])
1 loops, best of 3: 95.7 ms per loop

Conclusion

  • There is a clear improvement in all timings.

Some details I am not yet totally happy with (todo on a different ticket?):

  • Getting a homset out of the cache could still be faster. The time-limiting factor is that one needs to find an appropriate category if it is not explicitly passed as an argument. But since the cache is weak anyway, couldn't we simply cache the same homset twice, namely as Hom(X,Y) and as Hom(X,Y,C)? This would give a considerable speed-up (see the sketch after this list).
  • The hash is much too slow. Ideally, one would simply inherit from WithEqualityById, but that's non-trivial, as I have pointed out above.
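
A rough sketch of that double-caching idea (purely illustrative: default_category and build_homset are placeholders, not actual Sage functions, and the TripleDict interface and constructor have varied between Sage versions):

from sage.structure.coerce_dict import TripleDict

_cache = TripleDict()   # weak, identity-keyed cache, as used for homsets

def hom_sketch(X, Y, category=None):
    try:
        return _cache[X, Y, category]
    except KeyError:
        pass
    if category is None:
        category = default_category(X, Y)   # placeholder for the category computation
        try:
            return _cache[X, Y, category]
        except KeyError:
            pass
        H = build_homset(X, Y, category)    # placeholder for the actual construction
        _cache[X, Y, None] = H              # second entry: Hom(X,Y) becomes a plain lookup
    else:
        H = build_homset(X, Y, category)
    _cache[X, Y, category] = H
    return H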

Doctests pass for me. Needs review!

comment:3 Changed 6 years ago by SimonKing

Hooray, the startup_time plugin reports:

+With 37% confidence, startup time decreased by at least 0.5%
+With 82% confidence, startup time decreased by at least 0.25%
+With 91% confidence, startup time decreased by at least 0.1%

I think it would only be significant at 95% confidence. Nevertheless, it is a decrease of the startup time!

comment:4 in reply to: ↑ 2 ; follow-up: Changed 6 years ago by nthiery

Replying to SimonKing:

Some details I am not yet totally happy with (todo on a different ticket?):

  • Getting a homset out of the cache could still be faster. The time-limiting factor is that one needs to find an appropriate category, if it is not explicitly passed as an argument. But since the cache is weak anyway, couldn't we simply cache the same homset twice, namely as Hom(X,Y) and as Hom(X,Y,C)? This would give a considerable speed-up.

Yes, I don't see why we should not cache Hom(X,Y). By the way, is there a reason why Hom is not a (weak) cached function rather than handling its cache by hand?

Thanks for your work on this! Get in touch with me in case no one volunteers shortly for the review.

Cheers,

Nicolas

comment:5 in reply to: ↑ 4 ; follow-up: Changed 6 years ago by SimonKing

Replying to nthiery:

Yes, I don't see why we should not cache Hom(X,Y).

OK, doing it soon.

In fact, what I am doing now is trying to implement the new coercion framework for all homsets (hence: remove __call__, use element classes). I am already down to fewer than a couple of hundred errors...

By the way, is there a reason why Hom is not a (weak) cached function rather than handling its cache by hand?

Yes, there are actually three reasons (the second is illustrated by the sketch after this list):

  • A weak cached function holds a weak reference to the value, but not to the keys. We need weak references to the keys, for otherwise a memory leak arises.
  • In the coercion framework, the domain of a map must be identical (and not just equal) to the parent of the elements to be coerced. Hence, if X' is equal to X but not identical to it, Hom(X,Y) must not be identical to Hom(X',Y). But a weak cached function would compare its keys by equality, and this would give rise to errors in those parts of Sage that still work with non-unique parents. TripleDict compares the keys by identity.
  • We know that the keys are triples (domain, codomain, category). I think we have benchmarks showing that TripleDict is faster than a python dictionary.
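
A plain-Python illustration of the second point (the classes and names here are toys, not Sage code):

class FakeParent(object):
    # equal whenever the names match, like equal-but-not-identical parents
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        return isinstance(other, FakeParent) and self.name == other.name
    def __hash__(self):
        return hash(self.name)

X1 = FakeParent("X")
X2 = FakeParent("X")               # equal to X1, but a different object
cache = {(X1, "Y"): "Hom(X1,Y)"}   # an equality-based cache, like a plain dict
cache[(X2, "Y")]                   # finds 'Hom(X1,Y)', whose domain is X1 and not X2 --
                                   # exactly the situation an identity-keyed TripleDict avoids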

comment:6 Changed 6 years ago by SimonKing

The second patch is not tested yet. But it yields:

sage: sage: timeit("Hom(E,E)", number = 10^4)
10000 loops, best of 3: 1.9 µs per loop
sage: sage: timeit("Hom(E,F)", number = 10^4)
10000 loops, best of 3: 1.43 µs per loop
sage: sage: timeit("Hom(ZZ,QQ)", number = 10^4)
10000 loops, best of 3: 1.52 µs per loop

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

comment:7 Changed 6 years ago by SimonKing

The patchbot finds one failing test, and this is actually a test that relies on non-unique parent behaviour:

        Coercing a vector space morphism into the parent of a second vector
        space morphism will unify their parents. ::

            sage: U = QQ^3
            sage: V = QQ^4
            sage: W = QQ^3
            sage: X = QQ^4
            sage: H = Hom(U, V)
            sage: K = Hom(W, X)

            sage: A = matrix(QQ, 3, 4, [0]*12)
            sage: f = H(A)
            sage: B = matrix(QQ, 3, 4, range(12))
            sage: g = K(B)
            sage: f.parent() is g.parent()
            False

            sage: h = H(g)
            sage: f.parent() is h.parent()
            True

But in this example, at least with my patch, we have U is W and H is K. So, I think this test can be removed. Will do so in a minute.

comment:8 Changed 6 years ago by SimonKing

Done!

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

comment:9 in reply to: ↑ 5 ; follow-up: Changed 6 years ago by nbruin

Replying to SimonKing:

... those parts of Sage that still work with non-unique parents.

What follows is a bit of a rant that probably doesn't immediately affect your patch here. However, it's something I've seen crop up around the coercion/category framework in the last couple of weeks a little and I think has some relevance for work in this direction in general:

One design consideration I haven't seen mentioned much is that some parent constructors have very good reasons to return a fresh copy every time one is requested (i.e., don't do the cached_function metaclass thing that UniqueRepresentation does). The reason is that some parents are complicated structures (they tend to model complicated mathematical objects) that have too many aspects about them for them to be implemented as immutable. An example is EllipticCurve, where a Mordell-Weil basis is definitely not part of the construction data, but the basis choice (if found at all!) can be important and dealing with that unexpectedly changing is just too painful.

One could design these things differently, e.g., don't make the Mordell-Weil basis part of the elliptic curve itself, but the reality is that design choice hasn't been made in sage and changing that design now will be very difficult.

The problem arises elsewhere too: Sometimes you DO want multiple isomorphic-but-not-identical copies of objects. It's convenient to have multiple bivariate polynomial rings over QQ, for instance. We have that in sage by making generator names part of the definition of polynomial rings.

For other objects, for instance elliptic curves, there are no convenient generator names to distinguish multiple copies (the fundamental constructor being EllipticCurve([a1,a2,a3,a4,a6])), but there is still a reasonable need for having multiple nonidentical copies (never mind the practical mutability properties).

In terms of the coercion framework I have hopes these are not too much of an issue. I think elliptic curves are sufficiently high up the chain of complexity that people are happy to apply explicit homomorphisms to move things around, and hopefully this is true for most objects. (The elephants in the room here are number fields: they are expected to behave nicely wrt coercion and yet are sufficiently complex that they have data hanging off them that makes it hard to consider them immutable. A lot of the problems may be hidden because it's relatively hard to run into exactly the same representation of the field twice.)

The consistent thing for such non-unique-by-necessity parents is to make "==" indeed just "is", so that uniqueness is simply forced. This means

EllipticCurve([1,2,3,4,5]) != EllipticCurve([1,2,3,4,5])

which is fine with me, since they are isomorphic but not uniquely so. However, I'm not sure how easy it will be to convince other people of such a strict interpretation of "==".

Of course such EqualityById but non-CachedRepresentation parents are useless as dictionary keys in most situations, in the sense that D[p]=a can always be written as p.D=a (the only way of getting a is by having a hold of p itself anyway, and there is no way to regain a hold of p once you've lost it), but that might not be too much of an issue.

Anyway, please keep in mind: it's very unlikely (and probably even undesirable) that all parents will ever be essentially CachedRepresentation (or equivalent), and convenience requirements will likely prevent us from making all of them EqualityById. I suspect this puts bounds on how far you can push the coercion framework and various homomorphism caching strategies.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

Replying to SimonKing:

... those parts of Sage that still work with non-unique parents.

What follows is a bit of a rant that probably doesn't immediately affect your patch here.

Since the patch doesn't touch the way things are cached, the rant doesn't affect it. But it does affect what I plan to do next, so let's discuss it.

One design consideration I haven't seen mentioned much is that some parent constructors have very good reasons to return a fresh copy every time one is requested (i.e., don't do the cached_function metaclass thing that UniqueRepresentation does). ...

For other objects, for instance elliptic curves, there are no convenient generator names to distinguish multiple copies (the fundamental constructor being EllipticCurve([a1,a2,a3,a4,a6])), but there is still a reasonable need for having multiple nonidentical copies (never mind the practical mutability properties).

In terms of the coercion framework I have hopes these are not too much of an issue.

They are an issue if the equality-by-id paradigm is used in some parts of the coercion framework and not in others. They are not an issue if the coercion framework consistently uses equality-by-id even on non-unique parents, as I now explain.

I am fine with parents P1, P2 that evaluate equal but are non-identical. But then, I think it makes sense that Hom(P1,X) and Hom(P2,X) are non-identical, too. The question is whether they should evaluate equal, and here I actually have no very strong opinion, except that I like fast hash and fast equality test...

Let f be a coercion map from P1 to X, and consider an element p of P2. In older versions of Sage, f was stored in a usual dictionary as an attribute of X. "Usual dictionary" means: if you have P2 and inspect the cache for a coercion to X, you'll find f. But f.domain() is not the same as P2.

Being "not the same as" P2 means: We can't simply coerce p via f---we must first coerce p into P1 and then apply f. Namely, if P1==P2, we certainly expect that there is a coercion from P2 to P1. But it could be that, for example, a non-trivial base change is involved in this coercion.

This is why the coercion maps are now stored in a MonoDict, whose keys are compared by identity: looking for a coercion from P2 to X, you will correctly find that none has been cached yet, and so you have the chance to detect the coercion from P2 to X via P1, with a non-trivial base change.

I think this is a sane approach and it works fine with non-unique parents.

This is why I think that non-unique parents are not an issue in the coercion framework---but it is essential to be consistent, i.e., inside of the coercion framework one should consistently use comparison by identity even for non-unique parents (as in the P1-versus-P2 example above).
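
A rough sketch of that lookup logic (MonoDict is the identity-keyed dictionary from sage.structure.coerce_dict; the surrounding class and the discover_coercion_via_equal_parent helper are purely illustrative, and MonoDict's constructor arguments have varied between Sage versions):

from sage.structure.coerce_dict import MonoDict

class CoercionCacheSketch(object):
    def __init__(self):
        self._coerce_from = MonoDict()   # keys compared by identity, held weakly

    def register_coercion(self, f):
        self._coerce_from[f.domain()] = f

    def coerce_map_from(self, P):
        try:
            # only succeeds if a map from this *identical* parent was registered
            return self._coerce_from[P]
        except KeyError:
            # P may still be equal to some cached domain P1; then one can build
            # the composite P -> P1 -> X, possibly with a non-trivial base change
            return self.discover_coercion_via_equal_parent(P)   # placeholder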

The consistent thing for such non-unique-by-necessity parents is to make "==" indeed just "is", so that uniqueness is simply forced. This means

EllipticCurve([1,2,3,4,5]) != EllipticCurve([1,2,3,4,5])

which is fine with me, since they are isomorphic but not uniquely so. However, I'm not sure how easy it will be to convince other people of such a strict interpretation of "==".

I think it is enough to convince the coercion model...

Of course such EqualityById but non-CachedRepresentation parents are useless as dictionary keys

In fact, this would be a problem. We can not simply use CachedRepresentation here, because this uses comparison-by-== (and, as I tried to explain, in the coercion model we want comparison-by-id even for non-unique parents).

But couldn't we make Homset have the ClasscallMetaclass, inherit from EqualityById and use the current cache logic from Hom in its __classcall__ method? The advantage would be that in some parts of Sage Homsets are created without calling the Hom function. Putting the Hom-logic into a __classcall__, we would have Homsets cached in this case, too!

Anyway, please keep in mind: It's very unlikely (and probably even undesirable) that all parents will ever be essentially CachedRepresentation (or equivalent) and convenience requirements will likely prevent us from making all of them EqualityById.

Inside of the coercion framework, I think nothing prevents us from comparing non-unique parents by identity---except old code. In that way, we have freedom to choose non-trivial coercions between non-identical copies of a parent.

comment:11 Changed 6 years ago by SimonKing

PS: What I am currently trying on top of this patch is merely to implement the new coercion framework (_element_constructor_ and friends) for homsets. Only once this is achieved would I think about changing the cache logic.

comment:12 Changed 6 years ago by SimonKing

One remark on the reliability of the startup_time plugin:

With the previous version of the patches, we had:

-With 25% confidence, startup time increased by at least 0.25%
-With 44% confidence, startup time increased by at least 0.1%
+With 82% confidence, startup time decreased by at least 0.5%
+With 97% confidence, startup time decreased by at least 0.25%
+With 99.3% confidence, startup time decreased by at least 0.1%

Hence, we had a significant speed-up of 0.25% of startup time.

With the current version of the patches (which only differ in the removal of one doc test, but there is no change in the code!!), we have

-With 25% confidence, startup time increased by at least 0.25%
-With 44% confidence, startup time increased by at least 0.1%
+With 85% confidence, startup time increased by at least 0.5%
+With 98% confidence, startup time increased by at least 0.25%
+With 99.4% confidence, startup time increased by at least 0.1%

Hence, we have a significant slow-down of 0.25% of startup time.

But since the code didn't change, perhaps it isn't significant, after all?

comment:13 in reply to: ↑ 10 Changed 6 years ago by nbruin

Replying to SimonKing:

I think it is enough to convince the coercion model...

Yes, I agree that the coercion model can work with "is" even if "==" doesn't. It may surprise people every now and again, but the alternative is almost certainly worse (besides being less efficient).

In fact, this would be a problem. We can not simply use CachedRepresentation here, because this uses comparison-by-== (and, as I tried to explain, in the coercion model we want comparison-by-id even for non-unique parents).

Good catch. However, the comparison-by-== happens on construction parameters. Hence, if someone implements some equal-but-not-identical rings R and S then we'll have

sage: R=my_equal_but_not_identical_ring()
sage: S=my_equal_but_not_identical_ring()
sage: P1=PolynomialRing(R,'x')
sage: P2=PolynomialRing(S,'x') #this will look up P1
sage: P1 is P2
True
sage: P2.base_ring() is R
True
sage: S(1)+P2.1
Error: No coercion from S into P2.

That's a serious problem. If you're implementing something that can be used in further (functorial, cached) parent constructions that are expected to have coercions, it has to be EqualityById. Otherwise you get the mess above.

I doubt things like elliptic curves disappear into such cached parent constructions, which was why I thought the problem might not be that bad.

But couldn't we make Homset have the ClasscallMetaclass, inherit from EqualityById and use the current cache logic from Hom in its __classcall__ method?

Homsets are important in coercion and have proper mathematical meaning. If you make them consider their arguments by identity, you're basically telling people "you can use == as an equivalence relation on your class to some extent, but as far as maps are concerned, id is what really matters". I'm fine with that, but you are restricting people's ability to define and use == as they see fit.

Inside of the coercion framework, I think nothing prevents us from comparing non-unique parents by identity---except old code. In that way, we have freedom to choose non-trivial coercions between non-identical copies of a parent.

I think you're right on that in principle. There is a bit of an issue that "coercions between equal-but-not-identical parents" are fundamentally between "equivalent" objects, whereas the coercion framework seems to work best when there's a hierarchy. It's an invitation to loops in the coercion graph, and to non-commuting paths. However, I think that's more an issue of the mathematical problem than of how we're modelling it.

comment:14 Changed 6 years ago by SimonKing

Needed to update both patches, because a patch in #14159 has changed.

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

comment:15 Changed 6 years ago by SimonKing

  • Status changed from needs_review to needs_work
  • Work issues set to Do not change cmp too much

I made some changes to the __cmp__ method of homsets. The cmp behaviour changed in a way that causes problems in my attempt to use the new coercion model for morphisms.

Hence, the change to cmp should be toned down, and this ticket needs work!

comment:16 Changed 6 years ago by SimonKing

  • Description modified (diff)
  • Status changed from needs_work to needs_review

Updated!

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

comment:17 follow-up: Changed 6 years ago by nbruin

Do you have a reason to make domain and codomain cached? Their implementation already consists of just an attribute lookup. Are you getting a performance increase from it? There definitely are drawbacks:

  • the method gets wrapped, so there's extra junk sitting in memory for that
  • the cache takes up memory for an attribute that's already there anyway.

comment:18 in reply to: ↑ 17 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

Do you have a reason to make domain and codomain cached? Their implementation already consists of just an attribute lookup. Are you getting a performance increase from it?

Yes. See comment:2. A cached method is much much faster than a Python method that does nothing but return an attribute, and it is a little faster than directly accessing the attribute in a "timeit".

Things are slightly different if you access an attribute inside of a method. At some point, I had added a test method that accesses self._domain a million times or calls self.domain() a million times (sketched below). In the case of HF from comment:2, the latter was faster, but in other cases (it seems: in cases with a short MRO) direct attribute access was faster.
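
The kind of test method mentioned above could look like this (a sketch, not the code that was actually used; names are made up):

def _bench_attribute(self, n=10**6):
    # attribute access from inside a method
    for _ in range(n):
        x = self._domain

def _bench_cached_method(self, n=10**6):
    # calling the argumentless cached method instead
    for _ in range(n):
        x = self.domain()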

Anyway, that's why I like to keep the cached method and self._domain and self._codomain around.

Note, however, that one could get rid of self._domain and self._codomain. Namely, instead of

    self._domain = X

one could do

    self.domain.set_cache(X)

in Homset.__init__, and change the cached method into

@cached_method
def domain(self):
    pass

Note that keeping the _domain and _codomain attributes is actually a cheat:

  • The purpose of this ticket is to make Homset a cdef class.
  • I would like to make _codomain and _domain cdef attributes, because this would be faster than a cached method. But this would result in a layout conflict, as stated in comment:1
  • If you try to directly call Homset to create an instance, you would get an error, because you can't easily assign attributes to a cdef class. Hence, I work with the implicit assumption that all homsets are actually instances of non-cdef classes inheriting from Homset.

Do you see a way to solve this problem in a clean way (i.e., with cdef attributes)? I'd appreciate to know!

comment:19 in reply to: ↑ 18 ; follow-up: Changed 6 years ago by nbruin

Replying to SimonKing:

Ah, I see your problems. So the issue is that for Homset methods to function, both self._domain and self._codomain must be accessible, but with your patch, neither of these attributes can be assigned or set. In that sense Homset is a kind of "abstract" base class: only subclasses can be instantiated. I think that is wrong, so in my opinion this puts the currently proposed solution on the wrong side of positive review. I think it's really quite a bad hack and we both know that "temporary" solutions rarely are that.

Perhaps declaring _domain and _codomain properties rather than cdef attributes solves the problem? Then at least providing the storage for those is solved within the class itself. You could use the store that the cached domain and codomain provide to store and fetch the value, since the natural thing -- a cdef attribute -- doesn't seem to work due to layout issues.

The alternative solution is to hack it so that the Homset type has tp_dictoffset initialized, if you absolutely want to store _domain and _codomain as dynamic attributes. This is something the cython folk have thought about in response to declaring a __dict__ attribute, but it didn't make it in (because it would disable various method lookup optimizations that we wouldn't have to disable here, if I understand the discussions there).

Note, however, that one could get rid of self._domain and self._codomain. Namely, instead of

    self._domain = X

one could do

    self.domain.set_cache(X)

Right ... with declaring them properties I suspect you could make this transparent for _domain and _codomain. Then we can still support them.

in Homset.__init__, and change the cached method into

@cached_method
def domain(self):
    pass

Note that keeping the _domain and _codomain attributes is actually a cheat:

Yes, thanks for explaining. Unfortunately that makes me really opposed to the current solution as explained above :-).

  • I would like to make _codomain and _domain cdef attributes, because this would be faster than a cached method. But this would result in a layout conflict, as stated in comment:1

I think they should be cdef attributes because that would allow for way faster access in cdef-type-declared cython code than property access would.

However, earlier experiments I've done make me hopeful that property access on python level will be fully competitive with cached methods (and even with python access of cdef public attributes, because I'm sure those are implemented via property).

The "proper" solution is to bite the bullet and resolve the layout conflict, either by code duplication (or can we avoid that via templating?) or by changing the inheritance structure. I can see why you don't want to do that on this ticket, but I think we need a better temporary solution than relying on attributes that can only be instantiated on a python subclass (which prevents us from having fully cdef Homset subclasses!)

comment:20 Changed 6 years ago by nbruin

  • Status changed from needs_review to needs_work

comment:21 Changed 6 years ago by tscrim

  • Dependencies changed from #14159, #12951 to #14159, #12951, #13184

Dependency on #13184 which modifies homset.py.

comment:22 in reply to: ↑ 19 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

Perhaps declaring _domain and _codomain properties rather than cdef attributes solves the problem? Then at least providing the storage for those is solved within the class itself.

How?

I guess you refer to this documentation. It seems that you still need to provide your own way to store the property: The storage place is not provided automatically.

Hence, to provide some value in a property, you need to store the value somewhere, e.g., in a cdef attribute or in a usual Python attribute. You'd run into the same problems that made me suggest using cached methods.

Properties have a getter, a setter and a deleter. These three methods are provided by cached methods: set_cache is the setter, calling it is the getter, and clear_cache is the deleter.

So, given that you suggest using properties, I'd say that dropping the attributes _domain and _codomain and using @cached_method instead (setting the domain or codomain during __init__ by means of set_cache) is more than a temporary hack.
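
A small demo of that getter/setter/deleter analogy (the class Toy is just for illustration):

from sage.misc.cachefunc import cached_method

class Toy(object):
    @cached_method
    def domain(self):
        return "computed"

t = Toy()
t.domain.set_cache("ZZ")   # the "setter"
t.domain()                 # the "getter": returns 'ZZ' without running the body
t.domain.clear_cache()     # the "deleter"
t.domain()                 # recomputes and returns 'computed'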

You could use the store that the cached domain and codomain provide to store and fetch the value, since the natural thing -- a cdef attribute -- doesn't seem to work due to layout issues.

Right, there is the __cached_methods attribute. This could be used to store these values. It would be slower than calling a cached method without arguments, though.

  • I would like to make _codomain and _domain cdef attributes, because this would be faster than a cached method. But this would result in a layout conflict, as stated in comment:1

I think they should be cdef attributes because that would allow for way faster access in cdef-type-declared cython code than property access would.

However, earlier experiments I've done make me hopeful that property access on python level will be fully competitive with cached methods (and even with python access of cdef public attributes, because I'm sure those are implemented via property).

But properties still need to access the storage space. The storage space of a CachedMethodCallerNoArgs is very fast, since you won't even need a dictionary lookup (since it does not use __cached_methods for storing the value, in contrast to cached methods with arguments!).

... but I think we need a better temporary solution than relying on attributes that can only be instantiated on a python subclass (which prevents us from having fully cdef Homset subclasses!)

OK. Then dropping _domain and _codomain and using @cached_method as an elegant substitute for properties seems the right thing to do to me.

comment:23 in reply to: ↑ 22 Changed 6 years ago by SimonKing

Replying to SimonKing:

Right, there is the __cached_methods attribute. This could be used to store these values. It would be slower than calling a cached method without arguments, though.

No, it is trickier. If you have a class that allows attribute assignment, then __cached_methods is indeed not used. But if you have a class that does not allow attribute assignment, then __cached_methods is used to hold the cached method caller itself (i.e., it does not contain the cache but the cached method).

What do you think of doing the following in __init__:

    try:
        self._domain = D
    except AttributeError:
        self.__cached_method['_domain'] = D

This would be enough to provide _domain as an attribute. For speed reasons, it might still be a good idea to have a cached method domain().

comment:24 follow-up: Changed 6 years ago by nbruin

For me, the self.__cached_method['_domain'] = D does not work. Furthermore, are you sure property access is not competitive? My tests seem to indicate it is:

sage: cython("""
sage: from sage.misc.cachefunc import cached_method
sage: from sage.structure.parent cimport Parent
sage: cdef class T(sage.structure.parent.Parent):
...       @cached_method
...       def C(self):
...           return 10
...       property B:
...           def __get__(self):
...               return self.__cached_methods['C'].cache
...           def __set__(self,value):
...               self.__cached_methods['C'].cache=value
...
sage: """)
sage: A=T()
sage: A.C()
10
sage: A.B
10
sage: timeit('A.C()')
625 loops, best of 3: 389 ns per loop
sage: timeit('A.B')
625 loops, best of 3: 229 ns per loop
sage: A.B = 20
sage: A.C()
20

comment:25 Changed 6 years ago by nbruin

Oh, OK, but WHERE things are stored may change if the subclass has a writeable attribute dictionary, so we have to be a little more careful:

sage: cython("""
sage: from sage.misc.cachefunc import cached_method
sage: from sage.structure.parent cimport Parent
sage: cdef class T(sage.structure.parent.Parent):
...       @cached_method
...       def C(self):
...           return 10
...       property B:
...           def __get__(self):
...               if self.__cached_methods is not None:
...                   return self.__cached_methods['C'].cache
...               else:
...                   return self.C.cache
...           def __set__(self,value):
...               if self.__cached_methods is not None:
...                   self.__cached_methods['C'].cache=value
...               else:
...                   self.C.cache=value
...
sage: """)
sage: A=T()
sage: timeit('A.C()')
625 loops, best of 3: 389 ns per loop
sage: timeit('A.B')
625 loops, best of 3: 201 ns per loop
sage: class TT(T): pass
sage: D=TT()
sage: timeit('D.C()')
625 loops, best of 3: 198 ns per loop
sage: timeit('D.B')
625 loops, best of 3: 209 ns per loop
sage: D.v=10
sage: timeit('D.v')
625 loops, best of 3: 202 ns per loop

and indeed, now we see that a cached method inserts things a little further down the MRO if possible and hence gets a speed increase.

comment:26 follow-up: Changed 6 years ago by nbruin

Hm, two observations:

  • the tests below do not corroborate that cached function lookup is faster than simple attribute lookup on python classes
  • It seems method resolution on extension classes is WAY slower than on object derived classes.

First Parent derived.

sage: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class T(sage.structure.parent.Parent):
:    w=100
:    def __init__(self):
:        self.v=100
:    @cached_method
:    def t(self):
:        return 100
:--
sage: 
sage: P=T
sage: for i in [1..1000]:
....:         P=type("T%d"%i,(P,),{"a%i"%i:1,"b%i"%i:2})
....:         globals()["T%d"%i]=P
....:     
sage: C=T1000(); c0=T()
sage: timeit('C.v',number=20000)
20000 loops, best of 3: 26 µs per loop
sage: timeit('c0.v',number=20000)
20000 loops, best of 3: 111 ns per loop
sage: timeit('C.w',number=20000)
20000 loops, best of 3: 22.6 µs per loop
sage: timeit('c0.w',number=20000)
20000 loops, best of 3: 73.8 ns per loop
sage: timeit('C.t()',number=20000)
20000 loops, best of 3: 38.3 µs per loop
sage: timeit('c0.t()',number=20000)
20000 loops, best of 3: 101 ns per loop

As you can see, everything on C is way slower than on c0, probably because there's a 1000 deep MRO to contend with. However, I see no indication that cached_method is ever faster than a straight attribute lookup.

Next object derived:

sage: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class T(object):
:    w=100
:    def __init__(self):
:        self.v=100
:    @cached_method
:    def t(self):
:        return 100
:--
sage: 
sage: P=T
sage: for i in [1..1000]:
....:         P=type("T%d"%i,(P,),{"a%i"%i:1,"b%i"%i:2})
....:         globals()["T%d"%i]=P
....:     
sage: C=T1000(); c0=T()
sage: timeit('C.v',number=200000)
200000 loops, best of 3: 69.8 ns per loop
sage: timeit('c0.v',number=200000)
200000 loops, best of 3: 66.7 ns per loop
sage: timeit('C.w',number=200000)
200000 loops, best of 3: 53.6 ns per loop
sage: timeit('c0.w',number=200000)
200000 loops, best of 3: 54.1 ns per loop
sage: timeit('C.t()',number=200000)
200000 loops, best of 3: 77.4 ns per loop
sage: timeit('c0.t()',number=200000)
200000 loops, best of 3: 77.6 ns per loop

No penalty at all for large inheritance depth! Same pattern: instance attribute is a little slower than a class attribute and cached_method is always a little slower than either.

comment:27 in reply to: ↑ 24 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

For me, the self.__cached_method['_domain'] = D does not work.

Should be self.__cached_methods['_domain'] = D with "s".

Furthermore, are you sure property access is not competitive? My tests seem to indicate it is:

sage: cython("""
sage: from sage.misc.cachefunc import cached_method
sage: from sage.structure.parent cimport Parent
sage: cdef class T(sage.structure.parent.Parent):
...       @cached_method
...       def C(self):
...           return 10
...       property B:
...           def __get__(self):
...               return self.__cached_methods['C'].cache
...           def __set__(self,value):
...               self.__cached_methods['C'].cache=value
...
sage: """)
sage: A=T()
sage: A.C()
10
sage: A.B
10
sage: timeit('A.C()')
625 loops, best of 3: 389 ns per loop
sage: timeit('A.B')
625 loops, best of 3: 229 ns per loop
sage: A.C()
20

I did not say that it is not competitive. I said that a CachedMethodCallerNoArgs is a property, since it has a setter and a getter, with the only difference that for accessing it you need to do A.C() and not just A.C.

If you defined it as a property, you would need to implement the setter and getter yourself. A cached method is more comfortable to use.

comment:28 in reply to: ↑ 26 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

No penalty at all for large inheritance depth! Same pattern: instance attribute is a little slower than a class attribute and cached_method is always a little slower than either.

That's interesting. I can reproduce it, but it totally contradicts all my previous experience with cached methods.

comment:29 in reply to: ↑ 27 ; follow-up: Changed 6 years ago by nbruin

Replying to SimonKing:

I said that a CachedMethodCallerNoArgs is a property, since it has a setter and a getter

It doesn't have a setter as far as I can see, so it's a "nondata descriptor". Python treats those differently, e.g., they can be overridden by an instance attribute (which is very important for your application!)

with the only difference that for accessing it you need to do A.C() and not just A.C.

which means extra overhead: after looking up A.C and executing C.__get__, there is still a call to be processed on the returned object! A "property" would already be done by then.

If you defined it as a property, you would need to implement the setter and getter yourself. A cached method is more comfortable to use.

but with more overhead.

I suspect that (an optimized version of) lazy_attribute is potentially better for argumentless "cached methods". Incidentally, looking at the code there, it seems that lazy_attribute prefers to store the result in __cached_methods rather than use setattr to override the descriptor. Overriding the descriptor (i.e., storing the value in the instance attribute dict if available) should be faster and preferable to looking up in __cached_methods.

Shouldn't lazy_attribute be cythonized too?

comment:30 in reply to: ↑ 28 Changed 6 years ago by nbruin

Replying to SimonKing:

That's interesting. I can reproduce it, but it totally contradicts all my previous experience with cached methods.

I think we got to the root of this: up to 0.18, cython would compile its classes without "attribute lookup cache support". Some unfortunate semantics in python mean that for instance attributes, the whole MRO needs to be walked, so they form the worst case. I think that's why you were finding cached_method faster sometimes: that's a class attribute lookup, so it likely succeeds a little earlier. With a deep MRO that might be enough to offset the calling overhead involved in a cached_method.

In all likelihood, from 0.19 onwards (or earlier if we patch cython in sage), cython will enable "attribute lookup cache support" on the classes it defines. In that case the result of a full MRO lookup will be cached (also when nothing is found), so after the first access we don't have to walk the MRO anymore. This speeds up all attribute access considerably (sage doctesting seems to speed up by something like 5-10% on average) and makes normal instance attribute access faster than a cached_method call.

Given that, I think we SHOULD generally prefer instance attributes over no-arg cached_method.

In fact, with the attribute lookup cache in place, the difference between a property and a cached_method becomes even more pronounced:

sage: cython("""
....: include "../ext/stdsage.pxi"
....: include "../ext/python_bool.pxi"
....: from sage.misc.cachefunc import cached_method
....: from sage.structure.parent cimport Parent
....: cdef class T(Parent):
....:         cdef public int D
....:         @cached_method
....:         def C(self):
....:                 return 10
....:         property B:
....:                 def __get__(self):
....:                         return self.__cached_methods['c']
....:                 def __set__(self,value):
....:                         self.__cached_methods['c']=value
....:             
....:             """)
....:             
sage: A=T()
sage: A.C()
10
sage: A.B=10
sage: timeit('A.C()',number=800000,repeat=30)
800000 loops, best of 30: 183 ns per loop
sage: timeit('A.B',number=800000,repeat=30)
800000 loops, best of 30: 58.1 ns per loop
sage: timeit('A.D',number=800000,repeat=30)
800000 loops, best of 30: 47 ns per loop

For reference, without attribute cache this is:

sage: timeit('A.C()',number=800000,repeat=30)
800000 loops, best of 30: 195 ns per loop
sage: timeit('A.B',number=800000,repeat=30)
800000 loops, best of 30: 68.5 ns per loop
sage: timeit('A.D',number=800000,repeat=30)
800000 loops, best of 30: 60.7 ns per loop

So in order of decreasing precedence, storage should be:

  • instance attribute (cdef public or __dict__ based if available)
  • __cached_methods dict based data descriptor (really, that's just an instance attribute dictionary, so we could probably do away with it if we can have an instance __dict__; there's no inherent reason why cython cdef class instances can't have a __dict__. I guess the cython people are holding off on implementing it because they want it to play nice with the cython level too, which we wouldn't particularly care about)
  • cached_method.

The main advantage of cached_method is its spelling. So if we could make a version of @cached_method (say @computed_property) that produces a data descriptor (based on self.__cached_methods), where the __get__ can basically be what is now the __call__, you'd get a different spelling and better performance out of no-arg cached "methods". If you make sure that __set__ is NULL, you could even use self.__dict__ to cache the result and never get called on __get__ again (you'd provide a non-data descriptor, so you'd get shadowed by the instance attribute once installed).
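
A minimal pure-Python sketch of the second variant (the non-data descriptor that caches in the instance __dict__); the name computed_property is taken from the comment above, everything else is illustrative and only works for instances that actually have a __dict__:

class computed_property(object):
    # non-data descriptor: no __set__, so a value stored in the instance
    # __dict__ shadows the descriptor after the first access
    def __init__(self, f):
        self.f = f
        self.__name__ = f.__name__
        self.__doc__ = f.__doc__

    def __get__(self, instance, cls):
        if instance is None:
            return self
        value = self.f(instance)
        instance.__dict__[self.__name__] = value   # cache; __get__ is never called again
        return value

class Example(object):
    @computed_property
    def domain(self):
        print("computed once")
        return "ZZ"

e = Example()
e.domain   # prints 'computed once' and returns 'ZZ'
e.domain   # now a plain instance attribute lookup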

comment:31 Changed 6 years ago by SimonKing

Sorry, I lost track of this ticket. I am now trying to catch up.

comment:32 in reply to: ↑ 29 Changed 6 years ago by SimonKing

  • Work issues changed from Do not change cmp too much to Use property for domain and codomain

The "Work issue" field was totally outdated, because "cmp" has already been addressed---see comment:16

Currently, the issue is the "property" thingy. I will do some tests (as we have seen elsewhere, using @property in Cython is slower than "property Bla: def __get__...").

comment:33 Changed 6 years ago by SimonKing

  • Work issues changed from Use property for domain and codomain to How to store domain and codomain?

This is why I am not happy with "property":

sage: cython("""
from sage.structure.parent cimport Parent
cdef class Bla(Parent):
    property _domain:
        def __get__(self):
            try:
                return self.__cached_methods['_domain']
            except KeyError:
                raise AttributeError("%s instance has no attribute '_domain'"%type(self))
            except TypeError:
                self.__cached_methods = {}
                raise AttributeError("%s instance has no attribute '_domain'"%type(self))
        def __set__(self, value):
            try:
                self.__cached_methods['_domain'] = value
            except TypeError:
                self.__cached_methods = {}
                self.__cached_methods['_domain'] = value
""")
....: 
sage: class Blubb(Bla): pass
sage: b = Bla()
sage: c = Blubb()
sage: b._domain = 1
sage: c._domain = 1
sage: c._codomain = 2
sage: %timeit x = b._domain
10000000 loops, best of 3: 133 ns per loop
sage: %timeit x = c._domain
10000000 loops, best of 3: 144 ns per loop
sage: %timeit x = c._codomain
10000000 loops, best of 3: 110 ns per loop

So, the property becomes slower on the subclass than on the base class, and a usual attribute on the subclass is faster than the property on the base class.

I don't know if a solution to this problem is proposed in your comments (perhaps "cythoning lazy attributes"?).

comment:34 Changed 6 years ago by SimonKing

I don't know why, but it seems that putting something into __cached_methods does not mean that it will be available as an attribute. But this used to be the case! What is happening here?

comment:35 Changed 6 years ago by SimonKing

This is Parent.__getattr__:

    def __getattr__(self, str name):
        if (name.startswith('__') and not name.endswith('_')) or self._category is None:
            dummy_error_message.cls = type(self)
            dummy_error_message.name = name
            raise dummy_attribute_error
        try:
            return self.__cached_methods[name]
        except KeyError:
            attr = getattr_from_other_class(self, self._category.parent_class, name)
            self.__cached_methods[name] = attr
            return attr
        except TypeError:
            attr = getattr_from_other_class(self, self._category.parent_class, name)
            self.__cached_methods = {name:attr}
            return attr

So, it 'should' be possible to get the attribute from __cached_methods, shouldn't it?

comment:36 Changed 6 years ago by SimonKing

Aha! Now I see the problem. In my example above, I did not properly initialise the category.

comment:37 Changed 6 years ago by SimonKing

sage: cython("""from sage.all import Sets
from sage.structure.parent cimport Parent
cdef class Bla(Parent):
    def __init__(self, domain, codomain):
        try:
            self._domain_ = domain
            self._codomain_ = codomain
        except AttributeError:
            if self.__cached_methods is None:
                self.__cached_methods = {'_domain_':domain, '_codomain_':codomain}
            else:
                self.__cached_methods['_domain_'] = domain
                self.__cached_methods['_codomain_'] = codomain
        Parent.__init__(self, category= Sets())
""")
....: 
sage: class Blubb(Bla): pass
sage: b = Bla(1,2)
sage: c = Blubb(1,2)
sage: %timeit x = b._domain_
1000000 loops, best of 3: 826 ns per loop
sage: %timeit x = c._domain_
10000000 loops, best of 3: 114 ns per loop

Hmmmm. If it is possible to set attributes directly, then that is faster than a property. Otherwise, a property is much faster than letting __getattr__ look the value up in __cached_methods, even if the property itself reads from __cached_methods. And the property tends to become slower on sub-classes.

What a dilemma...

comment:38 Changed 6 years ago by SimonKing

My frustration level rises.

Ideally, one would write a descriptor

  • that takes the value from __cached_methods, because this is much faster than relying on __getattr__ (which looks into __cached_methods as well)
  • that can be overridden on the instance's __dict__, because python attribute access is faster than accessing the property, and in addition the property access tends to get slower on sub-classes.

Didn't you say that a property that only defines __get__ can be overridden in the instance's dict? So I thought I'd just define __get__ and not __set__, and put data into __dict__ when possible.

However, it does not work:

sage: cython("""
....: from sage.all import Sets
....: from sage.structure.parent cimport Parent
....: cdef class Bla(Parent):
....:     def __init__(self, D,C):
....:         try:
....:             self.__dict__['_domain'] = D
....:             self.__dict__['_codomain'] = C
....:         except AttributeError:
....:             if self.__cached_methods is None:
....:                 self.__cached_methods = {'_domain':D, '_codomain':C}                                                                                                                                
....:             else:
....:                 self.__cached_methods['_domain'] = D                                                                                                                                                
....:                 self.__cached_methods['_codomain'] = C                                                                                                                                              
....:         Parent.__init__(self, category = Sets())
....:     property _domain:
....:         def __get__(self):                                                                                                                                                                          
....:             try:                                                                                                                                                                                    
....:                 return self.__cached_methods['_domain']
....:             except KeyError:                                                                                                                                                                        
....:                 raise AttributeError("%s instance has no attribute '_domain'"%type(self))
....: """)
....: 
sage: class Blubb(Bla): pass                                                                                                                                                                              
sage: b = Bla(1,2)
sage: c = Blubb(1,2)
sage: b._domain                                                                                                                                                                                           
1                                                                                                                                                                                                         
sage: c._domain                                                                                                                                                                                          
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-19-9505626a8e7d> in <module>()
----> 1 c._domain

/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/structure/parent.so in sage.structure.parent.Parent.__getattr__ (sage/structure/parent.c:6501)()

/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/structure/misc.so in sage.structure.misc.getattr_from_other_class (sage/structure/misc.c:1591)()

AttributeError: 'Blubb_with_category' object has no attribute '_domain'
sage: c.__dict__['_domain']
1

So, can you please clarify: What kind of descriptors can be overridden in __dict__?

comment:39 Changed 6 years ago by SimonKing

Apparently it is "new style versus old style classes:

sage: class Bla(object):
....:     @property
....:     def _domain(self):
....:         return 1
....:     
sage: b = Bla()
sage: b._domain
1
sage: b._domain = 3
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-25-355f52a2815c> in <module>()
----> 1 b._domain = Integer(3)

AttributeError: can't set attribute
sage: b.__dict__['_domain'] = 3
sage: b._domain
1

versus

sage: class Bla:
    @property
    def _domain(self):
        return 1
....:     
sage: b = Bla()
sage: b._domain
1
sage: b._domain = 4
sage: b._domain
4
sage: del b.__dict__['_domain']
sage: b._domain
1
sage: b.__dict__['_domain'] = 3
sage: b._domain
3

So, we are stuck.

Last edited 6 years ago by SimonKing

comment:40 follow-up: Changed 6 years ago by nbruin

I think you want the following code:

from sage.all import Sets
from sage.structure.parent cimport Parent

cdef class CachedAttributeType(object):
    cdef public str name
    def __init__(self, name):
        self.name = name
    def __get__(self,instance,cls):
        try:
            return instance.__cached_methods[self.name]
        except KeyError:
            raise AttributeError("%s instance has no attribute '%s'"%(cls,self.name))

cdef class Bla(Parent):
    def __init__(self, D,C):
        try:
            self.__dict__['_domain'] = D
            self.__dict__['_codomain'] = C
        except AttributeError:
            if self.__cached_methods is None:
                self.__cached_methods = {'_domain':D, '_codomain':C}
            else:
                self.__cached_methods['_domain'] = D
                self.__cached_methods['_codomain'] = C
        Parent.__init__(self, category = Sets())
    _domain=CachedAttributeType("_domain")
    _codomain=CachedAttributeType("_codomain")

Timings we'll be getting with the cython upgrade (with Python's attribute lookup cache enabled):

sage: b=Bla("dommi","codommi")
sage: b._domain
'dommi'
sage: %timeit b._domain
10000000 loops, best of 3: 97.6 ns per loop
sage: c=Cla("dommi","codommi")
sage: %timeit c._domain
10000000 loops, best of 3: 50 ns per loop

With the old cython (where MRO depth is a big factor in attribute lookup):

sage: %timeit b._domain
10000000 loops, best of 3: 130 ns per loop
sage: %timeit c._domain
10000000 loops, best of 3: 78.5 ns per loop

What we should really do is get rid of __cached_methods and make sure these objects have a __dict__ instead, where such information can be stored much more effectively. There is an "empty dict" value one can use as a placeholder for empty dictionaries. There's really no good reason why cython types cannot have a dictionary.

comment:41 Changed 6 years ago by SimonKing

What do you think? Should we create a new dependency for this ticket, namely: "Use cython for lazy_attribute and create a fast non-data descriptor"?

comment:42 Changed 6 years ago by SimonKing

  • Work issues changed from How to store domain and codomain? to Rebase; how to store domain and codomain?

Needs to be rebased, it seems.

comment:43 Changed 6 years ago by SimonKing

I don't see what additional dependencies we need for the first patch.

Changed 6 years ago by SimonKing

Cache for Hom with default category

comment:44 Changed 6 years ago by SimonKing

Updated.

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

Changed 6 years ago by SimonKing

comment:45 Changed 6 years ago by SimonKing

Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch

comment:46 Changed 6 years ago by SimonKing

  • Work issues changed from Rebase; how to store domain and codomain? to How to store domain and codomain?

comment:47 follow-up: Changed 6 years ago by SimonKing

Both patches are updated, I guess we can now focus on the remaining issue with properties.

I asked whether it'd make sense to cythonize lazy_attribute. I think it does not, because we want to override __name__ of an instance, which won't work in a cdef class.

In addition, it would be bad for speed if the property needed to first look up a name and then try to find this name as a key of __cached_methods. Would it be acceptable to have two specialised non-data descriptors with fixed names _domain and _codomain, rather than implementing a general non-data descriptor that can take any name?

comment:48 in reply to: ↑ 47 ; follow-up: Changed 6 years ago by nbruin

Replying to SimonKing:

Both patches are updated, I guess we can now focus on the remaining issue with properties.

I asked whether it'd make sense to cythonize lazy_attribute. I think it does not, because we want to override __name__ of an instance, which won't work in a cdef class.

Do you mean for documentation and code introspection purposes? I guess that might be an issue. Shouldn't you just have a fast cythonized lazy_attribute for when performance matters (provided that making such a thing suitable for general consumption doesn't slow it down too much -- otherwise people should just roll their own when they need one), and a Python version (would a normal, non-cdeffed class in cython do?) that has all the documentation and introspection bells and whistles? You may still get a benefit if you inherit from the fast implementation.

In addition, it would be bad for speed if the property needed to first look up a name and then try to find this name as a key of __cached_methods.

I don't think it's really measurable. I added a descriptor that avoids the lookup:

cdef class SpecializedCachedAttributeType(object):
    def __init__(self, name):
        if name !="_codomain":
            raise ValueError("I don't do that sort of thing")
    def __get__(self,instance,cls):
        try:
            return instance.__cached_methods["_codomain"]
        except KeyError:
            raise AttributeError("%s instance has no attribute '%s'"%(cls,"_codomain"))

and changed the initialization to

    _domain=CachedAttributeType("_domain")
    _codomain=SpecializedCachedAttributeType("_codomain")

I'm getting:

sage: %timeit B._codomain
10000000 loops, best of 3: 82.9 ns per loop
sage: %timeit B._domain
10000000 loops, best of 3: 84 ns per loop

which is close enough together that it's hard to call the difference significant (with the old cython I'm getting 114 ns for each). If you store that name on a cdef attribute, the lookup is pretty fast.

Would it be acceptable to have two specialised non-data descriptors with fixed names _domain and _codomain, rather than implementing a general non-data descriptor that can take any name?

I don't think it's worth it. You'll gain much more if you allow HomType instances to have a __dict__ (although a drawback there is that all cpdef methods will slow down; they'd need to be split into a cdef and a def again, or cython would have to grow a way of calling cpdef methods "without the instance-dict attribute override check", for which the infrastructure is already available and is used internally in cython for the python wrapper of the cpdef method).
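
For concreteness, a hedged sketch (hypothetical names, not Sage code) of the cdef/def split mentioned above: internal Cython callers go through the C-level entry point and never pay for an attribute override check, while Python callers use a thin def wrapper.

cdef class Example:
    cdef long _square_impl(self, long x):
        # C-level entry point: called directly from cdef code, not
        # overridable from Python and not subject to any dict lookup
        return x * x
    def square(self, x):
        # Python-visible wrapper forwarding to the C implementation
        return self._square_impl(x)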

Last edited 6 years ago by nbruin (previous) (diff)

comment:49 in reply to: ↑ 48 Changed 6 years ago by nbruin

Replying to nbruin:

I asked whether it'd make sense to cythonize lazy_attribute. I think it does not, because we want to override __name__ of an instance, which won't work in a cdef class.

Actually, I think this is of very limited value anyway. You have to work very hard to get your hands on a descriptor, let alone its __name__, because its very nature means that most attempts will result in executing the descriptor and getting its result. You really have to go digging into dictionaries to get the descriptor object itself. By the time you have succeeded at that, you can probably get at attributes on the object too, so having the "name" won't be of huge value.

The main disadvantage of descriptors versus no-arg methods is that it's hard to put documentation on them that's accessible. The big advantage is of course that they're faster.

EDIT: after looking at the current implementation of lazy_attribute I see what's happening:

    def __get__(self,instance,cls):
        if instance is None: #this is when the attribute gets requested on the class itself
            return self
        ...

so that indeed gives a relatively easy way to get the descriptor object itself. Do people actually use that? If so, it would indeed be reasonable to make the documentation available on it. If it's not possible to put it on the cythonized lazy_attribute (have you tried just putting slots on there with the appropriate names and values?), you could just return a custom made wrapper that does have the documentation at that point, i.e.,

            return WrapperWithDocumentation(self)

where WrapperWithDocumentation can extract the required documentation from self.f and put it on itself.
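
One possible reading of that suggestion, as a hedged sketch (WrapperWithDocumentation is hypothetical; all it does is copy the wrapped function's name and docstring onto a plain object that introspection tools can inspect):

class WrapperWithDocumentation(object):
    def __init__(self, descriptor):
        # 'descriptor' is assumed to carry the wrapped function on .f,
        # as the cythonized LazyAttribute sketches in this thread do.
        self._descriptor = descriptor
        self.__name__ = descriptor.f.__name__
        self.__doc__ = descriptor.f.__doc__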

So cythonizing lazy_attribute is definitely possible and it should lose quite a bit of overhead, because the function can be stored in a cython slot (so hardly any further lookup overhead on that, just the call). The question is of course whether this is ever a bottleneck ... I suspect that giving parents a __dict__ for shadowing attributes (and hence largely making __cached_methods redundant) will be a much greater win.

Last edited 6 years ago by nbruin (previous) (diff)

comment:50 follow-up: Changed 6 years ago by nbruin

A very rough implementation:

from sage.misc.lazy_attribute import lazy_attribute

cdef class LazyAttribute:
    cdef public f
    cdef public str __name__
    def __init__(self,f):
        self.f=f
        self.__name__=f.__name__
    def __get__(self,instance,cls):
        if instance is None:
            return lazy_attribute(self.f)
        v=self.f(instance)
        setattr(instance,self.__name__,v)
        return v

Note I'm just reusing the old lazy_attribute here to deal with documentation issues.

sage: class T(object):
...       @LazyAttribute
...       def t(self):
...           """docstring of f"""
...           return 1000
...       @lazy_attribute
...       def s(self):
...           """docstring of f"""
...           return 1000
sage: timeit("T().t",repeat=10,number=10000)
10000 loops, best of 10: 374 ns per loop
sage: timeit("T().s",repeat=10,number=10000)
10000 loops, best of 10: 2.93 µs per loop

If I just copy over the __get__ method from the original lazy_attribute the difference is a little less pronounced, because that code does a lot more. After changing self.f.__name__ to self.__name__ I'm getting 1.2 µs, so that's still more than twice as fast.

comment:51 in reply to: ↑ 50 Changed 6 years ago by SimonKing

Replying to nbruin:

A very rough implementation:

from sage.misc.lazy_attribute import lazy_attribute

cdef class LazyAttribute:
    cdef public f
    cdef public str __name__
    def __init__(self,f):
        self.f=f
        self.__name__=f.__name__
    def __get__(self,instance,cls):
        if instance is None:
            return lazy_attribute(self.f)
        v=self.f(instance)
        setattr(instance,self.__name__,v)
        return v

If I understand correctly, you optimize the case that the instance allows attribute assignment. But I think that won't be relevant in our use case: If there is attribute assignment (let's call it "case 1"), then we will use it during __init__, and hence an existing non-data descriptor will be overridden.

So, the choice of an implementation of the non-data descriptor will only matter if there is no attribute assignment available (let's call it "case 2"). And if I am not mistaken, your code will be very slow in case 2, and certainly not faster than a @lazy_attribute.
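
To spell out the terminology, a small sketch of case 1 in plain Python (hypothetical names; case 2 would be the analogous extension class without a __dict__, where the assignment in __init__ fails and the fallback descriptor is actually used):

class DomainFallback(object):
    # non-data descriptor used only as a backstop
    def __get__(self, instance, cls):
        raise AttributeError("_domain was never set")

class Case1Homset(object):
    _domain = DomainFallback()
    def __init__(self, D):
        # case 1: attribute assignment works, so the instance attribute
        # shadows the class-level non-data descriptor from now on
        self._domain = D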

I think the questions to be addressed here are these:

  • Do we care about case 2 Homsets? Currently, all Homsets are case 1, so we could decide that Homset is a base class and only subclasses allow instantiation.
  • If we care about case 2: How can we be as fast as possible?

If I understand correctly, using Cython's "property" would be faster than a @lazy_attribute in case 2, but it would make it impossible to use regular attributes in case 1, resulting in a massive slow-down in case 1 (which is not acceptable, as case 1 is the normal case).

But can we benefit from a Cython implementation of lazy_attribute so that it becomes faster in case 2 but still results in a non-data descriptor that can be overridden by an attribute in case 1?

Last edited 6 years ago by SimonKing (previous) (diff)

comment:52 Changed 6 years ago by nbruin

I guess our discussion got side-tracked a little bit:

  • For _domain and _codomain there is no reason to use lazy_attribute. In the absence of a dict (if there is one, our code doesn't matter anyway) we need a descriptor that looks up the value elsewhere.
  • what we should really have is a cdef public attribute, because then one and the same location is fast from both C and Python (cython writes the python access code for us). But for that we have inheritance problems at the moment.

Then there's the second bit, cythonization of lazy_attribute:

  • a proper cythonized version of lazy_attribute is faster in all cases, mostly by a factor of 2 as far as I could see. Even with the required __cached_methods storage, I get attribute retrieval (once cached) of 84.6 ns rather than 547 ns.

That actually means you COULD put a cythonized lazy_attribute on _domain and _codomain. The overhead compared to a custom non-data descriptor is minimal, and within a factor of 2 of a dict lookup. Here's the code I mean:

cdef class LazyAttribute:
    cdef public f
    cdef public str __name__
    def __init__(self,f):
        self.f=f
        self.__name__=f.__name__
    def __get__(self,a,cls):
        cdef CM
        cdef result
        if a is None:
            return lazy_attribute(self.f)
        try:
            # __cached_methods is supposed to be a public Cython attribute.
            # Apparently, these are *not* subject to name mangling.
            CM = getattr(a, '__cached_methods')
            if CM is None:
                CM = {}
                setattr(a, '__cached_methods', CM)
        except AttributeError,msg:
            CM = None
        if CM is not None:
            try:
                return CM[self.__name__]
            except KeyError:
                pass
        result = self.f(a)
        if result is NotImplemented:
            # Workaround: we make sure that cls is the class
            # where the lazy attribute self is actually defined.
            # This avoids running into an infinite loop
            # See About descriptor specifications
            for supercls in cls.__mro__:
                if self.f.__name__ in supercls.__dict__ and self is supercls.__dict__[self.__name__]:
                    cls = supercls
            return getattr(super(cls, a),self.__name__)
        try:
            setattr(a, self.__name__, result)
        except AttributeError:
            if CM is not None:
                CM[self.f.__name__] = result
                return result
            raise
        return result

There's a bit of clean-up to do on this concerning returning a documented object when requested on the bare class. It works now, but only by reusing the old implementation. Is returning the descriptor object itself on the bare class customary/desirable behaviour?

comment:53 Changed 6 years ago by nbruin

See #14615 for a cythonized lazy_attribute. Its costs seem quite reasonable.

comment:54 in reply to: ↑ 40 ; follow-up: Changed 6 years ago by nbruin

Ping! The present patches here pass doctests on the bot!

I think that conceptually what you want in this case is really just a non-data descriptor that acts as a backstop, providing (essentially) a read-only attribute stored in __cached_methods if no instance dictionary is available (or, more accurately, if it isn't used). The code in comment 40 provides that.

cdef class CachedAttributeType(object):
    cdef public str name
    def __init__(self, name):
        self.name = name
    def __get__(self,instance,cls):
        cdef dict CM
        try:
            CM = getattr(instance,"__cached_methods")
            return CM[self.name]
        except TypeError, KeyError:
            raise AttributeError("%s instance has no attribute '%s'"%(cls,self.name))

Since the __get__ here is very close to one of the faster code paths in a properly cythonized lazy_attribute (see #14615), defining the backed attribute either way:

  _domain=CachedAttributeType("_domain")
  
  @lazy_attribute
  def _codomain(self):
      raise AttributeError("You should have set me already!")

is going to give the same performance. You may be able to shave off a few nanoseconds by replacing some of the auto-generated cython code with explicit Python C-API calls. I suspect we'll have to live with many parents not having a __dict__ for the foreseeable future.

(note that any slowdowns due to deeper MRO will go away once the cython upgrade is in).

comment:55 in reply to: ↑ 54 ; follow-up: Changed 6 years ago by SimonKing

Replying to nbruin:

Ping! The present patches here pass doctests on the bot!

I think that conceptually what you want in this case is really just a non-data descriptor that acts as a backstop, providing (essentially) a read-only attribute stored in __cached_methods if no instance dictionary is available (or, more accurately, if it isn't used). The code in comment 40 provides that.

Agreed.

The first question is: Where shall we put the code?

cdef class CachedAttributeType(object):
    cdef public str name
    def __init__(self, name):
        self.name = name
    def __get__(self,instance,cls):
        cdef dict CM
        try:
            CM = getattr(instance,"__cached_methods")
            return CM[self.name]
        except TypeError, KeyError:
            raise AttributeError("%s instance has no attribute '%s'"%(cls,self.name))

I guess you either mean getattr(instance, "__cached_methods", None) or except (AttributeError,KeyError): (in any case, it should be with brackets, otherwise it will not catch the KeyError).

And I guess you could get it still a bit faster, if you do def __get__(self, Parent instance, cls), because our homsets are parents, and because the cdef public attribute __cached_methods could be accessed more quickly if the instance is known to be a Parent.
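
For illustration, a hedged sketch of such a Parent-typed variant (the name ParentBackedAttribute is hypothetical; like the code in comment 40, it assumes the public cdef attribute __cached_methods is reachable under that literal name from inside the descriptor class):

from sage.structure.parent cimport Parent

cdef class ParentBackedAttribute(object):
    cdef public str name
    def __init__(self, name):
        self.name = name
    def __get__(self, Parent instance, cls):
        if instance is None:
            # class-level access: just return the descriptor itself
            return self
        # typing 'instance' as Parent lets Cython read the public cdef
        # attribute directly instead of going through a generic getattr
        # (assumption, as in comment 40: no name mangling applies here)
        cdef dict CM = instance.__cached_methods
        if CM is not None:
            try:
                return CM[self.name]
            except KeyError:
                pass
        raise AttributeError("%s instance has no attribute '%s'" % (cls, self.name))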

Since the __get__ here is very close to one of the faster code paths in a properly cythonized lazy_attribute (see #14615), defining the backed attribute either way:

  _domain=CachedAttributeType("_domain")
  
  @lazy_attribute
  def _codomain(self):
      raise AttributeError("You should have set me already!")

is going to give the same performance.

And that's the second question: Are you really sure that lazy_attribute will be as fast as CachedAttributeType in the case of no __dict__?

Note that I tried cythonizing lazy_attribute and cimporting Parent. It failed (a circular import or something like that). So, we come back to the first question: where shall we put the code? I think it makes sense to declare instance as a Parent, but this would be impossible in sage/misc/lazy_attribute.pyx.

And the third question is: Shall we really make this ticket depend on a proper cythonized version of lazy_attribute? Note that currently the lazy attribute or cached attribute would never come into play, because all our homsets are Python subclasses of the (now cythonized) Homset.

comment:56 in reply to: ↑ 55 ; follow-ups: Changed 6 years ago by nbruin

Replying to SimonKing:

And I guess you could get it still a bit faster, if you do def __get__(self, Parent instance, cls), because our homsets are parents, and because the cdef public attribute __cached_methods could be accessed more quickly if the instance is known to be a Parent.

Excellent idea! If __cached_methods isn't in a slot then we wouldn't want to use it anyway. It does mean that we lose flexibility: if, for some reason, we have a class that can't inherit directly from Parent but does want to cooperate in this protocol, it won't be able to.

And that's the second question: Are you really sure that lazy_attribute will be as fast as CachedAttributeType in the case of no __dict__?

My timings suggested this. However, I suspect that providing cdef access to __cached_methods will be a (small) measurable gain, so with that specialization you should be faster.

Note that I tried cythonizing lazy_attribute and cimporting Parent. It failed (a circular import or something like that).

That can't be lazy_attribute, or at least not my version. It doesn't have any imports!

So, we come back to the first question: where shall we put the code? I think it makes sense to declare instance as a Parent, but this would be impossible in sage/misc/lazy_attribute.pyx.

If it's a parent-specific support class it can just go with the parent source, right?

And the third question is: Shall we really make this ticket depend on a proper cythonized version of lazy_attribute?

Not if you go the Parent-specific CachedAttribute route. That would be a bad name, by the way. Perhaps ParentAttributeBacker is better.

all our homsets are Python subclasses of the (now cythonized) Homset.

Talk about academic code then ... I'm sure we'll get cythonized homset instances at some point.

comment:57 in reply to: ↑ 56 Changed 6 years ago by SimonKing

Replying to nbruin:

Note that I tried cythonizing lazy_attribute and cimporting Parent. It failed (a circular import or something like that).

That can't be lazy_attribute, or at least not my version. It doesn't have any imports!

sage.structure.parent imports lazy_attribute, and (to my surprise, I must say) cimporting Parent in sage.misc.lazy_attribute did crash.

So, we come back to the first question: where shall we put the code? I think it makes sense to declare instance as a Parent, but this would be impossible in sage/misc/lazy_attribute.pyx.

If it's a parent-specific support class it can just go with the parent source, right?

Hey, that's a good idea!

Not if you go the Parent-specific CachedAttribute route. That would be a bad name, by the way. Perhaps ParentAttributeBacker is better.

ParentNonDataDescriptor?

comment:58 follow-up: Changed 6 years ago by SimonKing

I think I will be able to provide a third patch with the non-data descriptor tonight.

comment:59 in reply to: ↑ 58 ; follow-up: Changed 6 years ago by SimonKing

Replying to SimonKing:

I think I will be able to provide a third patch with the non-data descriptor tonight.

That said: Could we perhaps make this depend on #11935?

comment:60 in reply to: ↑ 59 ; follow-up: Changed 6 years ago by nbruin

Replying to SimonKing:

That said: Could we perhaps make this depend on #11935?

You're doing the work, so you decide. Looking at the tickets, it seems to me this one is less invasive and hence easier to get in: It's already passing doctests; the only thing left is some backstop code that currently won't be exercised anyway.

comment:61 in reply to: ↑ 60 Changed 6 years ago by SimonKing

Replying to nbruin:

Replying to SimonKing:

That said: Could we perhaps make this depend on #11935?

You're doing the work, so you decide. Looking at the tickets, it seems to me this one is less invasive and hence easier to get in: It's already passing doctests; the only thing left is some backstop code that currently won't be exercised anyway.

Yes. And in addition, when I tried to apply a dependency of #11935, I got several errors. So, let's first fix this, then take care of #12876 and finally #11935.

comment:62 follow-up: Changed 6 years ago by nthiery

For the record: #12876 and #11935 are applying smoothly on 5.10 beta4 *and* are passing all tests. They have been in needs review for one year and have been rebased a couple of times. And a lot depends on them.

So I don't really care about the order, but it would be nice to get them in shortly. And I am a bit too lazy to rebase them once again.

comment:63 in reply to: ↑ 62 ; follow-up: Changed 6 years ago by SimonKing

Replying to nthiery:

For the record: #12876 and #11935 are applying smoothly on 5.10 beta4 *and* are passing all tests.

OK, if that's the case then these go in first, because

  1. they add features, while this is just about potential speed gain in future
  2. they are more likely to break if we wait any longer.

comment:64 in reply to: ↑ 63 Changed 6 years ago by nthiery

Replying to SimonKing:

Replying to nthiery:

For the record: #12876 and #11935 are applying smoothly on 5.10 beta4 *and* are passing all tests.

OK, if that's the case then these go in first, because

  1. they add features, while this is just about potential speed gain in future
  2. they are more likely to break if we wait any longer.

Thanks Simon, I appreciate that; I know the work it costs you!

comment:65 in reply to: ↑ 56 Changed 6 years ago by nbruin

Replying to nbruin:

Replying to SimonKing:

And I guess you could get it still a bit faster, if you do def __get__(self, Parent instance, cls), because our homsets are parents, and because the cdef public attribute __cached_methods could be accessed more quickly if the instance is known to be a Parent.

Excellent idea! If __cached_methods isn't in a slot then we wouldn't want to use it anyway. It does mean that we lose flexibility: if, for some reason, we have a class that can't inherit directly from Parent but does want to cooperate in this protocol, it won't be able to.

Note that _element_class on Parent is a @lazy_attribute. That code could benefit from a Parent instance too, so perhaps you DO want to use an (adapted form of) @lazy_attribute for both. In the case of _domain and _codomain, the wrapped function would be irrelevant.

comment:66 Changed 6 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:67 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:68 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:69 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:70 Changed 4 years ago by SimonKing

TODO: Provide a branch...

comment:71 in reply to: ↑ 1 Changed 4 years ago by SimonKing

Replying to SimonKing:

sage.modular.abvar.homspace.EndomorphimSubring inherits from sage.categories.homset.Homset and from sage.rings.ring.Ring. Making Homset cdef with cdef attributes results in incompatible base classes.

  • One could think of creating a new cdef class sage.categories.homset.EndomorphismRing inheriting from sage.rings.ring.Ring and copying the methods implemented in sage.categories.homset.Homset, and use it in sage.modular.abvar.homspace.
    • However, that's code duplication, and I doubt that we want that.
  • One could think of dropping sage.rings.ring.Ring and let EndomorphismSubring entirely rely on the ring functionality that is provided by the category framework.

Note that this should now be a lot easier, by #18756.

  • I tried, but this means opening a can of worms. One would then also need to implement the new coercion model in sage.modular. But replacing EndomorphismSubring.__call__ by an _element_constructor_ does not work, because Homset has a __call__ method, too. Hence, we would then also need to use the new coercion model for sage.categories.homset. Yes, that's desirable. But it is not available in short time.

See #14279, but both this ticket and #14279 need work.

Short term improvements suggested in my patch

  • Accessing a Python attribute _domain defined for a Homset can be surprisingly slow. It is fast, if one has an instance of Homset itself. But it is slow if one considers an instance of a sub-class, such as EndomorphismSubring. It turns out that calling a cached method is faster than accessing the attribute! Hence, I suggest using cached method decorators on the domain() and codomain() methods (these methods are internally used in EndomorphismSubring anyway).

If I recall correctly, that's not the case any longer. However, if we are able to avoid double inheritance from Homset and Ring, there is no obstruction to making _domain and _codomain cdef attributes.

comment:72 Changed 4 years ago by SimonKing

  • Dependencies changed from #14159, #12951, #13184 to #14279

comment:73 Changed 4 years ago by SimonKing

  • Work issues How to store domain and codomain? deleted

Since the patches have bit-rotted for so long, I guess it makes sense to start from scratch. Most of the discussion here was about different ways to store _domain and _codomain. Since it should now be possible to have fast ring behaviour defined via the category framework, I suppose one can simply make them cdef attributes, which should be the fastest solution.
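
A minimal sketch of that idea, assuming the double inheritance with Ring is indeed gone (hypothetical class name CythonHomset; the real Homset.__init__ does considerably more):

from sage.structure.parent cimport Parent

cdef class CythonHomset(Parent):
    # stored in C slots, readable from Python, fast to access from Cython
    cdef public object _domain
    cdef public object _codomain

    def __init__(self, X, Y, category=None):
        self._domain = X
        self._codomain = Y
        Parent.__init__(self, category=category)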

comment:74 Changed 4 years ago by SimonKing

  • Dependencies changed from #14279 to #14279 #18756

comment:75 Changed 4 years ago by SimonKing

  • Dependencies changed from #14279 #18756 to #14279 #18758

I confused two tickets. In fact, it is #18758 which makes it possible to define a ring structure via the category framework without too much slow-down.
