Opened 3 years ago
Last modified 3 months ago
#27536 new defect
Conversion of mathematical constant such as pi to RDF leaks memory
Reported by: | rburing | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | symbolics | Keywords: | RDF, Constant, pi, pynac |
Cc: | rws, slelievre | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
As reported in Ask SageMath question #45863:
import gc for x in xrange(10^5,10^5+100): s = sum(RDF(pi) for n in xrange(x)) del(s) print "memory usage: " + str(get_memory_usage()) + ", gc: " + str(gc.collect())
The same happens with the other mathematical constants.
The problem does not occur when RDF(pi)
is replaced by e.g. RDF.pi()
.
My guess based on trace("RDF(pi)")
is that it has something to do with caching.
Change History (10)
comment:1 Changed 3 years ago by
- Component changed from memleak to symbolics
- Keywords pynac added
comment:2 Changed 3 years ago by
- Cc rws slelievre added
comment:3 Changed 3 years ago by
- Milestone sage-8.8 deleted
As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).
comment:4 follow-up: ↓ 5 Changed 19 months ago by
This issue is still present in version 9.2 and seems like a major bug, much more general than just to RDF.
Repeatedly using the constant pi rapidly leaks memory. As a test, running the simple loop below on CoCalc? steadily consumes about 15 MB/sec until memory is exhausted.
while(true): if pi>3: x=0
comment:5 in reply to: ↑ 4 Changed 18 months ago by
Replying to gh-agreatnate:
while(true): if pi>3: x=0
I suspect this is indeed the same problem. Now, we're seeing intervalfield elements pile up and a positive thing: because they have pointers themselves, the GC actually tracks them, so we see them on the python heap! (RDF don't need to be tracked by the garbage collector; ref counting does the trick for them, because they can't generate cycles. Neither can RIF elements, but python doesn't know that. Both intervalfield and complex elements carry two pointers to real numbers. So I guess that's why we're seeing them, but not the realnumbers themselves, because they are leaves in the reference tree, as far as Python knows)
I get:
sage: import gc ....: from collections import Counter ....: if pi < 3: ....: pass ....: gc.collect() ....: pre={id(a) for a in gc.get_objects()} ....: for n in [1..2000]: ....: if pi < 3: ....: pass ....: gc.collect() ....: T=Counter(str(type(a)) for a in gc.get_objects() if id(a) not in pre) ....: 193 0 sage: T Counter({"<class 'dict'>": 4000, "<class 'sage.rings.real_mpfi.RealIntervalFieldElement'>": 4000, "<class 'sage.rings.complex_number.ComplexNumber'>": 3999, "<class 'tuple'>": 5, "<class 'list'>": 2, "<class '_ast.Interactive'>": 1, "<class 'function'>": 1, "<class 'set'>": 1})
Backtracking these elements on the heap gets you nothing, though! So as far as I can see, there are no other python heap opjects that hold references to these objects. An incref/decref disparity somewhere would cause that. Alternatively, links are being held outside the python heap (that's not illegal! They could be on the C stack or C heap) It's surprising to see "dict" leak in addition to mpfi elements.
comment:6 Changed 18 months ago by
Possibly it's in this stuff:
https://github.com/pynac/pynac/blob/7ba3e4580e21e05d9272988e0aa94ed0b218ea19/ginac/ex.cpp#L1129
The incref here is probably sometimes required, and perhaps always. It does seem to match with:
https://github.com/pynac/pynac/blob/7ba3e4580e21e05d9272988e0aa94ed0b218ea19/ginac/numeric.cpp#L863 https://github.com/pynac/pynac/blob/7ba3e4580e21e05d9272988e0aa94ed0b218ea19/ginac/numeric.cpp#L916
On the other hand, here:
https://github.com/pynac/pynac/blob/7ba3e4580e21e05d9272988e0aa94ed0b218ea19/ginac/numeric.cpp#L770
it seems the reference is NOT stolen (as is done in numeric types elsewhere). So perhaps that incref simply needs to be removed.
comment:7 Changed 18 months ago by
which of the two increfs should go? in ex.cpp, or in numeric.cpp?
comment:8 Changed 18 months ago by
- Report Upstream changed from N/A to Reported upstream. No feedback yet.
I've opened https://github.com/pynac/pynac/issues/359
comment:9 Changed 18 months ago by
Could #19363 be a related issue?
comment:10 Changed 3 months ago by
- Report Upstream changed from Reported upstream. No feedback yet. to N/A
This issue is still present in Version 9.5 and, now that pynac has been merged into Sage, it is an issue in Sage rather than upstream.
It's not a leak on the python heap (gc.get_objects() doesn't report anything new), so it's probably not caching, but a memory leak in some library code that gets used.
Changing
RDF
toRR
orRIF
also leaks; usingfloat
does NOT leak. That might help a bit in pinning down where the leak occurs.Drilling down yields that
pi._convert({'parent':float})
does not leak andpi._convert({'parent':RDF})
does.This rather directly leads to
self._gobj.evalf(0, kwds)
, i.e., a call in pynac. That's the most likely candidate for the leak.