Opened 3 years ago
Closed 3 years ago
#30074 closed enhancement (fixed)
Speedups for symbolic assumptions
Reported by: | mkoeppe | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | sage-9.2 |
Component: | symbolics | Keywords: | |
Cc: | egourgoulhon, slelievre | Merged in: | |
Authors: | Matthias Koeppe | Reviewers: | Travis Scrimshaw, Markus Wageringel, Nils Bruin |
Report Upstream: | N/A | Work issues: | |
Branch: | 4adcb20 (Commits, GitHub, GitLab) | Commit: | 4adcb204aca95916a01535ba4c3a98ec0d13d39d |
Dependencies: | #30065 | Stopgaps: |
Description (last modified by )
As a follow-up from #30065, we cache GenericDeclaration
,
maintain the current assumptions as a dictionary instead of a list,
and rework GenericDeclaration.assume
so it reuses maxima contexts.
The examples from #30065 become near-instantaneous.
The example from #30061, previously 32.8 s, becomes much faster:
sage: %time EuclideanSpace(80) CPU times: user 1.76 s, sys: 194 ms, total: 1.95 s Wall time: 1.63 s 80-dimensional Euclidean space E^80
Also ./sage -btp src/sage/symbolic/ src/sage/manifolds/ src/sage/calculus/
shows modest improvements.
Change History (23)
comment:1 Changed 3 years ago by
Status: | new → needs_review |
---|
comment:2 Changed 3 years ago by
Branch: | → u/mkoeppe/even_faster_maxima_assumptions |
---|
comment:3 follow-up: 4 Changed 3 years ago by
Commit: | → cda81ff3e8a54f4ec09fecd04a0673866736940a |
---|
comment:4 follow-up: 8 Changed 3 years ago by
Replying to tscrim:
I wonder a little bit if
OrderedDict
is really the best data structure. It feels like it really should be aset
, but I guessOrderedDict
has more guaranteed consistency across different sessions. Note that iterating over aset
is actually faster:
Right, I actually would have wanted an OrderedSet
here except that there is no such thing in the standard library.
Ordered because I wanted to preserve the behavior of the previous code using lists to reduce the potential for changes from the order of operations. In this untyped symbolics business, it's best to tread very carefully.
The small constant factor between set
and OrderDict
does not matter much, I would think.
comment:5 Changed 3 years ago by
Reviewers: | → Travis Scrimshaw |
---|---|
Status: | needs_review → positive_review |
Then let it be so.
comment:7 follow-up: 12 Changed 3 years ago by
Status: | positive_review → needs_info |
---|
Perhaps this point can be cleared quickly, but I think it deserves addressing. It seems here that "UniqueRepresentation?" is used just for caching behaviour here. That's not what UniqueRepresentation? is for!! It imposes very specific semantics! If it's just the caching, perhaps you're looking for "CachedRepresentation?".
In either case, the caching comes with a heavy price to pay: the construction parameters used for CachedRepresentation? and UniqueRepresentation? get global references to them for the lifetime of the created object. This can create very non-obvious memory leaks. For instance, you have to make absolutely sure that the construction parameters never support a reference chain pointing to the created object. Furthermore, you have to make absolutely sure that the created object is immutable, because unrelated pieces of code can end up sharing a reference to the object.
comment:8 follow-up: 11 Changed 3 years ago by
Replying to mkoeppe:
Replying to tscrim:
I wonder a little bit if
OrderedDict
is really the best data structure. It feels like it really should be aset
, but I guessOrderedDict
has more guaranteed consistency across different sessions. Note that iterating over aset
is actually faster:Right, I actually would have wanted an
OrderedSet
here except that there is no such thing in the standard library.
As of Python 3.6.x, regular dicts preserve the insertion order, so in most cases OrderedDict
is not needed anymore.
comment:9 Changed 3 years ago by
Considering that the inputs are a symbolic variable and a string, I would be very surprised if any memory leaks were ever created.
However, one benefit is the equality checks and hashing becomes really quick for the UniqueRepresentation
.
comment:10 Changed 3 years ago by
Commit: | cda81ff3e8a54f4ec09fecd04a0673866736940a → 642836dc080f10de1da5a502ccb8bad19113c46b |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
642836d | sage.symbolic.assumptions: Use dict instead of OrderedDict for _assumptions
|
comment:11 Changed 3 years ago by
Replying to gh-mwageringel:
As of Python 3.6.x, regular dicts preserve the insertion order, so in most cases
OrderedDict
is not needed anymore.
Thanks, done.
comment:12 Changed 3 years ago by
Replying to nbruin:
Perhaps this point can be cleared quickly, but I think it deserves addressing. It seems here that
UniqueRepresentation
is used just for caching behaviour here. That's not whatUniqueRepresentation
is for!! It imposes very specific semantics! If it's just the caching, perhaps you're looking forCachedRepresentation
.In either case, the caching comes with a heavy price to pay: the construction parameters used for
CachedRepresentation
andUniqueRepresentation
get global references to them for the lifetime of the created object. This can create very non-obvious memory leaks. For instance, you have to make absolutely sure that the construction parameters never support a reference chain pointing to the created object. Furthermore, you have to make absolutely sure that the created object is immutable, because unrelated pieces of code can end up sharing a reference to the object.
Thanks for the discussion. This is an important point.
I am using UniqueRepresentation
here in the same way as I would use INTERN
in Lisp.
Note that SR
variable names already have indefinite lifetime: SR.var
adds them to the dictionary SR.symbols
; even reset()
does not remove them.
Before this ticket, a GenericDeclaration
is associated with a Maxima context after calling .assume()
for the first time. The context is never killed or collected. Repeated assumptions as illustrated in the examples in #30065 lead to unbounded growth and slowdown.
In this ticket, by using UniqueRepresentation
for GenericDeclaration
, I manage the resource "Maxima context" explicitly; it's not just caching. Effectively I give indefinite lifetime to pairs of an SR
variable and a declaration - this matches the indefinite lifetime of Maxima contexts that was already present. But now repeated assumptions no longer lead to growth.
comment:13 Changed 3 years ago by
Status: | needs_info → needs_review |
---|
comment:15 Changed 3 years ago by
Reviewers: | Travis Scrimshaw → Travis Scrimshaw, Markus Wageringel, Nils Bruin |
---|
comment:16 Changed 3 years ago by
Commit: | 642836dc080f10de1da5a502ccb8bad19113c46b → 9e0cbed2ca26f85160ad363c2b298cb109822ffe |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
9e0cbed | src/sage/symbolic/assumptions.py: Fix style warning
|
comment:17 Changed 3 years ago by
Nils, are you okay with the current version and justifications? If so, then I would set a positive review.
comment:18 Changed 3 years ago by
It's me again, complaining about pyflakes warnings, again. Due to the new use of UniqueRepresentation
, the import of SageObject
is not necessary anymore.
comment:19 Changed 3 years ago by
Commit: | 9e0cbed2ca26f85160ad363c2b298cb109822ffe → 4adcb204aca95916a01535ba4c3a98ec0d13d39d |
---|
Branch pushed to git repo; I updated commit sha1. New commits:
4adcb20 | src/sage/symbolic/assumptions.py: Remove unused import
|
comment:20 Changed 3 years ago by
Status: | needs_review → positive_review |
---|
Green bot, so I am setting a positive review.
comment:22 Changed 3 years ago by
Cc: | slelievre added |
---|---|
Description: | modified (diff) |
Modifying ticket description which still mentioned OrderedDict
.
comment:23 Changed 3 years ago by
Branch: | u/mkoeppe/even_faster_maxima_assumptions → 4adcb204aca95916a01535ba4c3a98ec0d13d39d |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
I wonder a little bit if
OrderedDict
is really the best data structure. It feels like it really should be aset
, but I guessOrderedDict
has more guaranteed consistency across different sessions. Note that iterating over aset
is actually faster:Although this micro-optimization probably doesn't matter too much.
New commits:
sage.symbolic.assumptions.GenericDeclaration.assume: Validate against cached valid features first
sage.symbolic.assumptions.GenericDeclaration: Make it a UniqueRepresentation
sage.symbolic.assumptions.GenericDeclaration.assume: Check first if already in _assumptions
sage.symbolic.assumptions._assumptions: Change from list to OrderDict
sage.symbolic.assumptions.GenericDeclaration: Make it hashable by inheriting comparisons from UniqueRepresentation