Opened 3 years ago

Closed 3 years ago

#27386 closed enhancement (fixed)

Cache gens_dict()

Reported by: mmezzarobba Owned by:
Priority: minor Milestone: sage-8.7
Component: performance Keywords:
Cc: Merged in:
Authors: Marc Mezzarobba Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 99ce74d (Commits, GitHub, GitLab) Commit: 99ce74dbd9ef983f3909e0ce30015c5721f5356d
Dependencies: Stopgaps:

Status badges

Description (last modified by mmezzarobba)

Before:

sage: R.<x,y,z> = ZZ[]
sage: a = GF(3)(2)
sage: %timeit x.subs(x=a)
The slowest run took 10.21 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 74.1 µs per loop

After:

The slowest run took 164.38 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 13.6 µs per loop

Change History (11)

comment:1 Changed 3 years ago by mmezzarobba

  • Authors set to Marc Mezzarobba
  • Branch set to u/mmezzarobba/ticket/27386
  • Commit set to 616a1e727c9bc877efaa4336af58f8015e768ec6
  • Status changed from new to needs_review
  • Type changed from PLEASE CHANGE to enhancement

New commits:

616a1e7cache MPolynomial_libsingular.gens_dict()

comment:2 follow-up: Changed 3 years ago by tscrim

I have a couple of issues with this patch. I am very worried about an infinite loop for this method if you do a subclass of MPolynomial_libsingular, so the super call should be explicit about the class. It is also missing a doctest. However, I don't see the point of only caching this output. Why not cache the function higher up the class chains? We cache gens for all polynomial rings (AFAIK), so it makes sense to do the same for gens_dict (in general).

comment:3 Changed 3 years ago by tscrim

  • Status changed from needs_review to needs_info

comment:4 in reply to: ↑ 2 Changed 3 years ago by mmezzarobba

Replying to tscrim:

I am very worried about an infinite loop for this method if you do a subclass of MPolynomial_libsingular, so the super call should be explicit about the class.

Indeed, thanks!

However, I don't see the point of only caching this output. Why not cache the function higher up the class chains? We cache gens for all polynomial rings (AFAIK), so it makes sense to do the same for gens_dict (in general).

Actually, _defining_names() (the wrapper around gens() used by gens_dict()) is cached by CategoryObject. For most parents, I think gens_dict() isn't too expensive to compute once you have gens(), and isn't that much used anyway. So I thought it would be wasteful to cache it in all cases (but I have no strong objection either).

comment:5 Changed 3 years ago by git

  • Commit changed from 616a1e727c9bc877efaa4336af58f8015e768ec6 to 58068c3b481ac5658b083cdab734c6d57e0fd24c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

58068c3#27386 cache gens_dict()

comment:6 Changed 3 years ago by mmezzarobba

  • Description modified (diff)
  • Status changed from needs_info to needs_review
  • Summary changed from Another micro-optimization to libsingular polynomials to Cache gens_dict()

Okay, let's try caching it at the level of CategoryObject.

comment:7 Changed 3 years ago by tscrim

I think it does make sense to cache it at the highest level. It should be small in memory (at least, I would be shocked if anyone created something with 10000 generators and not expecting to need a lot of memory), and this way all things that use it can benefit.

There is one problem (that I should have noticed earlier and was a problem even with your original proposal): a dict is mutable. So you can really destroy things by mutating the output of gens_dict. A Family is like an immutable dict, but its iteration is over values, not keys. So that is not a backwards compatible change (either). It is more invasive, but perhaps the best thing is to have an internal @lazy_attribute _gens_dict that gets copied for the output of gens_dict.

comment:8 Changed 3 years ago by git

  • Commit changed from 58068c3b481ac5658b083cdab734c6d57e0fd24c to 99ce74dbd9ef983f3909e0ce30015c5721f5356d

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

99ce74d#27386 cache gens_dict()

comment:9 Changed 3 years ago by mmezzarobba

  • Description modified (diff)

Oooops—thank you, and sorry for the stupid mistake! Here's a new attempt. I used @cached_method, not @lazy_attribute, because lazy attributes are computed as soon as you try to access them, and there are parents for which gens() returns infinitely many generators, so that the default gens_dict() loops.

comment:10 Changed 3 years ago by tscrim

  • Reviewers set to Travis Scrimshaw
  • Status changed from needs_review to positive_review

Well, this works well enough. I found I could get another 1% speedup if I did a manual cache and a cdef method called directly (instead of via a Python method. I was expecting actually a bit more. Anyways, LGTM. Thanks.

comment:11 Changed 3 years ago by vbraun

  • Branch changed from u/mmezzarobba/ticket/27386 to 99ce74dbd9ef983f3909e0ce30015c5721f5356d
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.