Opened 3 years ago
Closed 3 years ago
#23851 closed defect (fixed)
Fix memoryleak introduced in #11670
Reported by:  nbruin  Owned by:  

Priority:  major  Milestone:  sage8.3 
Component:  memleak  Keywords:  thursdaysbdx 
Cc:  Merged in:  
Authors:  Nils Bruin, Peter Bruin  Reviewers:  Sébastien Labbé 
Report Upstream:  N/A  Work issues:  
Branch:  b7e1042 (Commits)  Commit:  b7e1042ab0f3948aa2416c03ebe7c70701418dc3 
Dependencies:  Stopgaps: 
Description (last modified by )
In #11670 the following leak was introduced:
sage: u=id(NumberField(x^25,'a').absolute_field('b')) sage: import gc sage: gc.collect() #random 10 sage: [id(v) for v in gc.get_objects() if id(v) == u]
See Nils's explanation in comment:14:ticket:23807 and the surrounding discussion, where this bug was found.
Change History (33)
comment:1 Changed 3 years ago by
 Branch set to u/nbruin/fix_memoryleak_introduced_in__11670
comment:2 Changed 3 years ago by
 Cc pbruin added
 Commit set to bf22de066dacc686296cf17c0f098484dddf1f1e
 Component changed from PLEASE CHANGE to memleak
 Description modified (diff)
 Status changed from new to needs_review
 Type changed from PLEASE CHANGE to defect
comment:3 Changed 3 years ago by
 Branch changed from u/nbruin/fix_memoryleak_introduced_in__11670 to u/pbruin/23851memory_leak
 Cc pbruin removed
 Commit changed from bf22de066dacc686296cf17c0f098484dddf1f1e to 4568007a9d08ccaa744c1f9b1832ea351eb5ccac
Cleaned up the docstrings of absolute_field()
a bit.
comment:4 followup: ↓ 10 Changed 3 years ago by
 Keywords thursdaysbdx added
 Reviewers set to Sébastien Labbé
 Status changed from needs_review to positive_review
I do not know what is happening with the patchbot. The bug is fixed and I get All tests passed
after running make ptestlong
on my machine.
comment:5 Changed 3 years ago by
 Status changed from positive_review to needs_work
I'm getting this on various buildbots:
File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1389, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units Failed example: u,o = K.S_units([])[0]; u, o Expected: (1/2*a + 1/2, 6) Got: (1/2*a + 1/2, 6) ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1395, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units Failed example: u^2 Expected: 1/2*a  1/2 Got: 1/2*a  1/2 ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1404, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units Failed example: L.S_units([]) Expected: [(1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] Got: [(1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1408, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units Failed example: L.S_units([K.ideal(1/2*a  3/2)]) Expected: [((1/6*a  1/2)*b^2 + (1/3*a  1)*b + 4/3*a, +Infinity), (1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] Got: [((1/6*a  1/2)*b^2 + (1/3*a  1)*b + 4/3*a, +Infinity), (1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1413, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units Failed example: L.S_units([K.ideal(2)]) Expected: [((1/2*a  1/2)*b^2 + (a + 1)*b + 3, +Infinity), ((1/6*a + 1/2)*b^2 + (1/3*a + 1)*b  5/6*a + 1/2, +Infinity), ((1/6*a + 1/2)*b^2 + (1/3*a + 1)*b  5/6*a  1/2, +Infinity), (1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] Got: [((1/2*a  1/2)*b^2 + (a + 1)*b + 3, +Infinity), ((1/6*a + 1/2)*b^2 + (1/3*a + 1)*b  5/6*a + 1/2, +Infinity), ((1/6*a + 1/2)*b^2 + (1/3*a + 1)*b  5/6*a  1/2, +Infinity), (1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1476, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.units Failed example: u = K.units()[0][0]; u Expected: 1/2*a + 1/2 Got: 1/2*a + 1/2 ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1482, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.units Failed example: u^2 Expected: 1/2*a  1/2 Got: 1/2*a  1/2 ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1494, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.units Failed example: L.units() Expected: [(1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] Got: [(1/2*a + 1/2, 6), ((1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, +Infinity), (2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2, +Infinity)] ********************************************************************** File "src/sage/rings/polynomial/polynomial_quotient_ring.py", line 1503, in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.units Failed example: L.unit_group().gens_values() Expected: [1/2*a + 1/2, (1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, 2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2] Got: [1/2*a + 1/2, (1/3*a  1)*b^2  4/3*a*b  5/6*a + 7/2, 2/3*a*b^2 + (2/3*a  2)*b  5/6*a  7/2] ********************************************************************** 2 items had failures: 5 of 18 in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.S_units 4 of 23 in sage.rings.polynomial.polynomial_quotient_ring.PolynomialQuotientRing_generic.units [448 tests, 9 failures, 6.99 s]
comment:6 Changed 3 years ago by
With the following change (which is probably a bad idea) I consistently get the same doctest failures as above:

src/sage/rings/polynomial/polynomial_quotient_ring.py
a b class PolynomialQuotientRing_generic(CommutativeRing): 1050 1050 return self(self.polynomial_ring().random_element( \ 1051 1051 degree=self.degree()1, *args, **kwds)) 1052 1052 1053 @cached_method1054 1053 def _S_decomposition(self, S): 1055 1054 """ 1056 1055 Compute the decomposition of self into a product of number fields.
comment:7 Changed 3 years ago by
 Commit changed from 4568007a9d08ccaa744c1f9b1832ea351eb5ccac to 55a59ff3da2c50c83c23d5021a127026f811f0cd
Branch pushed to git repo; I updated commit sha1. New commits:
55a59ff  Trac 23851: force garbage collection to make Sunit computations deterministic

comment:8 Changed 3 years ago by
The above commit appears to make the computation deterministic again. Let's see if the patchbot agrees.
comment:9 Changed 3 years ago by
 Status changed from needs_work to needs_review
comment:10 in reply to: ↑ 4 Changed 3 years ago by
Replying to slabbe:
I do not know what is happening with the patchbot. The bug is fixed and I get
All tests passed
after runningmake ptestlong
on my machine.
I confirm that I get the same errors when run alone:
sage t src/sage/rings/polynomial/polynomial_quotient_ring.py
I am sorry, I do not know what happened with my run of make ptestlong
which said All tests passed
. I rechecked the log file of ptestlong.log, and I confirm that I got All tests passed
:
Running doctests with ID 201709141616010e639740. Git branch: 23851 Using optional=ccache,cmake,cryptominisat,mpir,pandoc_attributes,pandocfilters,python2,rst2ipynb,sage Doctesting entire Sage library. Sorting sources by runtime so that slower doctests are run first.... Doctesting 3587 files using 8 threads. ... sage t long warnlong 58.9 src/sage/rings/polynomial/polynomial_quotient_ring.py [448 tests, 3.09 s] ...  All tests passed!  Total time for all tests: 1623.2 seconds cpu time: 10208.6 seconds cumulative wall time: 12523.8 seconds
comment:11 Changed 3 years ago by
+ First, we force garbage collection to make the `S`unit + computations below deterministic::
Why? Can you add a sentence in this paragraph saying why the computatino of S
unit depends on the existence of objects that were not yet collected by the garbage collector?
comment:12 Changed 3 years ago by
I confirm that the last commit fixes the sage t of polynomial_quotient_ring.py
when run alone.
comment:13 Changed 3 years ago by
What is apparently happening is that commit bf22de0 (see comment:2) causes the quadratic field of discriminant 3 to be garbagecollected more often. This is in principle a good thing, but every time the field is constructed again, the computed generator of the unit group may be different than the previous time. This also happens if you run
K = nfinit(y^2 + 3); nfbasistoalg(K, nfrootsof1(K)[2])
several times in GP; the result varies randomly between 1/2*y + 1/2
and 1/2*y + 1/2
, the two roots of unity of order 6.
Commit 55a59ff (see comment:7) reduces the randomness in the number of times that K
is constructed. Experimentally it seems to work; I ran doctests in polynomial_quotient_ring.py
128 times with this commit applied and got no failures. However, I now realise that it may not be entirely watertight and that we should check if there are other points at which K
may be randomly garbagecollected. Alternatively, we could construct K
once at the beginning of the file, keeping it alive by binding it to a name that is not used anywhere else in the file.
comment:14 Changed 3 years ago by
With the recent commit, when running make ptestlong
or when running sage t long
alone on the file, I get problems:
 sage t long warnlong 58.5 src/sage/rings/polynomial/polynomial_quotient_ring.py # 9 doctests failed 
When running sage t
alone on the file without the option long
, I get All tests passed
:
Doctesting 1 file. sage t warnlong 32.7 src/sage/rings/polynomial/polynomial_quotient_ring.py [448 tests, 1.81 s]  All tests passed! 
But the failing tests are not marked with long time
... ??? Why are they not failing in the second case?
comment:15 Changed 3 years ago by
 Status changed from needs_review to needs_work
 Work issues set to fix random doctest failures
This seems to indicate that those tests are still nondeterministic. We should either make them deterministic again or mark them with # random
(but then it is harder to notice in case the output actually becomes wrong at some point.)
comment:16 Changed 3 years ago by
These new changes are just showing that the doctests were fragile to begin with. There's a bunch of things you can do to check the result is correct:
 look at the norms
 verify they and their inverses are Sintegral (factorize the denominator)
 hardcode an Sunit basis and check that these occur as a power product of the computed set.
It's a pain, but I don't think we really have another option. (I guess the answers only differ in some sign choices, so you could just construct the set of all sign choices and check your answer lies in there. That would be a reasonably strict test and will probably take a little longer before it breaks again)
Also, if you want to prevent the field from being garbage collected, just bind it explicitly to an identifier. That's how users should prevent garbage collections too. That doesn't help the nondeterministic doctests, though it may hide the problem for a while.
comment:17 Changed 3 years ago by
 Commit changed from 55a59ff3da2c50c83c23d5021a127026f811f0cd to dd0594d391a365ff8d6d53c56767178c7d910903
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
dd0594d  Trac 23851: accept both roots of unity of order 6 in doctests

comment:18 Changed 3 years ago by
 Status changed from needs_work to needs_review
The above commit is a more robust alternative to the gc.collect()
solution; we now just avoid distinguishing between the two generators for the unit group of QuadraticField(3)
.
comment:19 followup: ↓ 20 Changed 3 years ago by
Do we really want to delete @cached_method
on absolute_field
?
comment:20 in reply to: ↑ 19 ; followup: ↓ 23 Changed 3 years ago by
Replying to roed:
Do we really want to delete
@cached_method
onabsolute_field
?
One argument for caching a method is when it takes long time to compute which is not the case here:
sage: %time nf = NumberField(x^25,'a') CPU times: user 1.12 ms, sys: 97 µs, total: 1.21 ms Wall time: 962 µs sage: %time nf.absolute_field('b') CPU times: user 9 µs, sys: 1e+03 ns, total: 10 µs Wall time: 13.8 µs Number Field in b with defining polynomial x^2  5
Is there another reason why absolute_field
should be cached?
comment:21 Changed 3 years ago by
 Status changed from needs_review to positive_review
 Work issues fix random doctest failures deleted
comment:22 Changed 3 years ago by
 Milestone changed from sage8.1 to sage8.3
comment:23 in reply to: ↑ 20 Changed 3 years ago by
Replying to slabbe:
Replying to roed:
Do we really want to delete
@cached_method
onabsolute_field
?One argument for caching a method is when it takes long time to compute which is not the case here:
sage: %time nf = NumberField(x^25,'a') CPU times: user 1.12 ms, sys: 97 µs, total: 1.21 ms Wall time: 962 µs sage: %time nf.absolute_field('b') CPU times: user 9 µs, sys: 1e+03 ns, total: 10 µs Wall time: 13.8 µs Number Field in b with defining polynomial x^2  5
Your timings, on a quadratic field which is already absolute, aren't particularly convincing that this method is always fast. However, running some other tests on higher degree relative extensions shows that they're quite fast as well. I'll withdraw my objection to removing the cached_method
.
Is there another reason why
absolute_field
should be cached?
comment:25 Changed 3 years ago by
 Commit changed from dd0594d391a365ff8d6d53c56767178c7d910903 to b7e1042ab0f3948aa2416c03ebe7c70701418dc3
Branch pushed to git repo; I updated commit sha1. New commits:
b7e1042  Merge branch 'develop' into ticket/23851memory_leak

comment:26 Changed 3 years ago by
 Status changed from needs_work to positive_review
comment:27 followup: ↓ 30 Changed 3 years ago by
I don't really understand the memory leak. If the original code leaks, isn't that a deeper problem with @cached_method
?
comment:28 Changed 3 years ago by
 Status changed from positive_review to needs_info
comment:29 Changed 3 years ago by
Does anybody have an idea? If not, please say so. I'm just asking to know whether it's worth my time to investigate it.
comment:30 in reply to: ↑ 27 Changed 3 years ago by
Replying to jdemeyer:
I don't really understand the memory leak. If the original code leaks, isn't that a deeper problem with
@cached_method
?
The leak is caused by the combination of @cached_method
and UniqueFactory
; see Nils's explanation in comment:14:ticket:23807 and the surrounding discussion, where this bug was found.
comment:31 Changed 3 years ago by
 Description modified (diff)
comment:32 Changed 3 years ago by
 Status changed from needs_info to positive_review
Thanks for the pointer! It all makes sense to me now.
comment:33 Changed 3 years ago by
 Branch changed from u/pbruin/23851memory_leak to b7e1042ab0f3948aa2416c03ebe7c70701418dc3
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
amend caching of "absolute_field" to avoid memory leaks due to "structure" storage.