#14058 closed enhancement (fixed)
Weakly reference binary operation codomains
Reported by:  robertwb  Owned by:  rlm 

Priority:  major  Milestone:  sage6.9 
Component:  memleak  Keywords:  
Cc:  nbruin, SimonKing, jpflori  Merged in:  
Authors:  Robert Bradshaw, Nils Bruin  Reviewers:  Simon King, Frédéric Chapoton, JeanPierre Flori, Sébastien Labbé 
Report Upstream:  N/A  Work issues:  
Branch:  13cb2a7 (Commits)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
R(1) + S(1) caches a strong reference to both R and S, which we'd like to avoid.
See discussion at https://groups.google.com/forum/?fromgroups=#!topic/sagedevel/ozhIHIwQ4WA
Attachments (4)
Change History (81)
Changed 4 years ago by
comment:1 Changed 4 years ago by
 Cc nbruin SimonKing jpflori added
 Status changed from new to needs_review
comment:2 Changed 4 years ago by
comment:3 Changed 4 years ago by
I like the idea. Can we get some data on speed regressions due to this? In principle the double lookup might be noticeable. I don't mean speed regressions due to parents getting deleted  those are good to know but need other fixes. I mean when the parents are still around.
Attached doctests may need #12313 to work, in which case this ticket should depend on that (because then that's necessary to get benefit from this ticket).
comment:4 Changed 4 years ago by
Sadly, the performance hit is quite noticeable.
a=ZZ(1) b=QQ(1) c=ZZ['x'](1) d=b+c def test(x,y): for i in xrange(10^6): _=x+y
Within a run, timing test
seems fairly consistent. However, between different compiles of the same code I'm observing fluctuations of up to 0.10s
in timings. Lucky/unlucky memory layout?
Anyway, with patch:
sage: %time test(a,b) CPU times: user 2.15 s, sys: 0.00 s, total: 2.15 s Wall time: 2.16 s sage: %time test(a,c) CPU times: user 2.25 s, sys: 0.00 s, total: 2.25 s Wall time: 2.26 s sage: %time test(b,c) CPU times: user 16.83 s, sys: 0.00 s, total: 16.83 s
without patch
sage: %time test(a,b) CPU times: user 1.54 s, sys: 0.00 s, total: 1.54 s Wall time: 1.55 s sage: %time test(a,c) CPU times: user 1.64 s, sys: 0.00 s, total: 1.64 s Wall time: 1.64 s CPU times: user 15.68 s, sys: 0.00 s, total: 15.68 s Wall time: 15.76 s
For test(a,a)
, test(b,b)
, test(c,c)
there doesn't seem to be a difference (luckily!)
The penalties seem to be about 0.6s
, 0.6s
, 1.2s
, which is rather consistent with 1,1,2 extra lookups.
This indicates to me it may be worth trying storing a weakref to the morphism instead, since that can probably be dereferenced faster than a coercion lookup on the codomain. Logically this should be equivalent. We're just storing a weakref to the object we're interested in rather than instructions where to go and lookup the object we want.
Caveat:
For stored coercions of the type
(R,S, ...): (None,<map S>R>)
or
(R,S, ...): (<map R>S>,None)
the codomain is part of the key, so deletion of the codomain (which is a prerequisite for the morphism being deleted and hence the weakref dying) implies removal of the weak key entry in TripleDict
.
However, for stored coercions of the type
(R,S,...): (<map R>C>,<map S>C>)
we could have that C
and the weakrefs to the morphisms die, but that would not trigger the removal from the TripleDict
. So we could still encounter dead weakrefs (but much less than one would perhaps initially think!) It's not really much different from how 14058weakbinop.patch handles dead weakrefs.
Major problem for immediately applying this idea:
sage: M=QQ.coerce_map_from(ZZ) sage: import weakref sage: weakref.ref(M) TypeError: cannot create weak reference to 'sage.rings.rational.Z_to_Q' object
comment:5 Changed 4 years ago by
Yeah, I had exactly this same idea. I'll post an updated patch.
Changed 4 years ago by
comment:6 Changed 4 years ago by
Apply 14058weakbinopmorphisms.patch and trac_14058doctest.patch
comment:7 Changed 4 years ago by
Great! With this new patch I cannot distinguish the timing differences from the noise one normally gets already, so I'd think this is quite acceptable.
comment:8 Changed 4 years ago by
 Dependencies set to #12313
patchbot seems unhappy about the doctest.Probably #12313 is indeed necessary to make parents reclaimable.
comment:9 Changed 4 years ago by
 Description modified (diff)
Stating in the ticket description what patches are to apply. Admittedly I didn't test yet whether the doctest patch applies after Robert's new patch. But the test patch is needed, I think.
Apply 14058weakbinopmorphisms.patch and trac_14058doctest.patch trac_14058doctest.patch
comment:10 Changed 4 years ago by
 Description modified (diff)
comment:11 Changed 4 years ago by
The patches apply, and at least the new test passes. So, this together with #12313, does fix another leak.
comment:12 Changed 4 years ago by
 Description modified (diff)
comment:13 followup: ↓ 14 Changed 4 years ago by
In vanilla Sage, the coercion model stores (coercion) morphisms in its cache (which was a TripleDict
, hence, with weak keys), and the only change introduced by your patch is to store weak references to the (coercion) morphism instead. Did I understand this correctly?
In vanilla Sage, the coercion model kept a strong reference to the morphism, which kept a strong reference to domain and codomain, which were thus not reclaimable, and so the item of the TripleDict
has eternal life, in spite of the weak keys. Correct?
With the patch, the coercion model keeps a weak reference to the coercion morphism, hence, the strong reference from the morphism to domain and codomain doesn't matter, and thus the item of the TripleDict
may disappear.
But how is the morphism kept alive? The coercion model only has a weak reference to it. Do I understand correctly that the morphism involved in the binary operation is, at the same time, stored in the coerce dict of its codomain? That's why it does not immediately die?
In other words, do I understand that the layout is as follows? We have parents A and B, and want to apply some binary operator, with a result in the parent C (C may coincide with either A or B). And we have maps r_A and r_B from A or B to C.
Both r_A and r_B are stored with a strong reference in a cache located in C, with weak keys A and B. At the same time, they are stored with a weak reference in the coercion model, again with weak keys A and B. r_A has strong references to C and to A, r_B has strong references to C and to B.
What do we want? Do we want that keeping C alive makes A and B survive? Or do we want that keeping both A and B alive makes C survive?
If the user has a strong reference to C, then C has a strong reference to r_A and r_B (in its coerce cache), and r_A/r_B strongly refers to A/B. Hence, the existence of C keeps A and B alive. Since C is a complicated object, it is more likely to be mortal, hence, probably it is not much of a problem.
If the user has a strong reference to both A and B, then C keeps a strong reference to both r_A and r_B, and the coercion model keeps a weak reference to r_A and r_B. Moreover, r_A and r_B strongly refer to C.
However, isn't it just a reference cycle between C and r_A/r_B? It would not prevent C from becoming collectable, right?
I doubt that we want that. It is not desirable that, when adding elements of ZZ['x']
and QQ, the ring QQ['x']
is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).
Or did I misunderstand something?
comment:14 in reply to: ↑ 13 ; followup: ↓ 16 Changed 4 years ago by
I doubt that we want that. It is not desirable that, when adding elements of
ZZ['x']
and QQ, the ringQQ['x']
is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).
In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x']
he should ensure his elements already live there. Adding elements with the same parents is orders of magnitude faster than relying on coercion. A step closer is ensure that your parent remains in memory. I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably
In this scenario:
P=[ZZ['x%d'%i] for i in [1..1000]] F=[GF(p) for p in prime_range(1000)] for Zxi in P: for k in F: a=P.0 b=F(1) c=a+b
If we keep the GF(p)[xi]
alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.
I don't think that we can afford to assume that just because a user has combined two parents he/she will do so again in the future. We may be able to afford doing so if it happened recently, but that really is an optimisation, and currently I don't see a hook where we can do this efficiently.
A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.
Another possibility is just UniqueRepresentation_c.__cinit__
. Only when we're making a lot of parents do we care about losing them again ...
The problem with artificially keeping transient elements alive for longer is that you'll defeat the heuristics for generational garbage collection (which Python does), and we depend rather heavily on that to get rid of cyclic garbage. Testing would be required to see if this is indeed a problem, and it will be very applicationdependent.
comment:15 Changed 4 years ago by
Before posting a reply to Nils' post, here is an example that the existence of A and B does not ensure the survival of the pushout of A and B. First, attach testcoercion.py into a Sage session. Then:
sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py sage: OA = A() sage: OB = B() sage: cm = sage.structure.element.get_coercion_model() sage: OC = cm.explain(OA,OB) Coercion on left operand via Conversion map: From: A To: C Coercion on right operand via Conversion map: From: B To: C Arithmetic performed after coercions. Result lives in C sage: OC.is_coercion_cached(OA) True sage: OC.is_coercion_cached(OB) True sage: del OC sage: import gc sage: _ = gc.collect() sage: len([X for X in gc.get_objects() if isinstance(X,C)]) 0 sage: xc = OA(1)*OB(1) sage: isinstance(xc.parent(),C) True sage: del xc sage: _ = gc.collect() sage: len([X for X in gc.get_objects() if isinstance(X,C)]) 0
I do believe that this is not desired behaviour.
comment:16 in reply to: ↑ 14 ; followup: ↓ 17 Changed 4 years ago by
Replying to nbruin:
I doubt that we want that. It is not desirable that, when adding elements of
ZZ['x']
and QQ, the ringQQ['x']
is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with
QQ['x']
he should ensure his elements already live there.
Granted. But it is a very typical usage to not explicitly convert everything into a common parent, but let the coercion model do the work. That is what the coercion model is for!
I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably
However, in the example that I posted above, note that multiplying OA(1)*OB(1)
repeatedly will also create C()
repeatedly, even though it is a UniqueRepresentation
and hence (weakly, nowadays) cached!
I think it is not acceptable that a multiple creation is triggered.
In this scenario:
P=[ZZ['x%d'%i] for i in [1..1000]] F=[GF(p) for p in prime_range(1000)] for Zxi in P: for k in F: a=P.0 b=F(1) c=a+bIf we keep the
GF(p)[xi]
alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.
In your example, GF(p)[xi]
is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.
A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.
Another possibility is just
UniqueRepresentation_c.__cinit__
. Only when we're making a lot of parents do we care about losing them again ...
OK, but not all unique parent structures rely on it.
comment:17 in reply to: ↑ 16 Changed 4 years ago by
Replying to SimonKing:
I think it is not acceptable that a multiple creation is triggered.
... Why not? Should the user expect that doing something a second time is faster than the first, without taking special precautions?
In your example,
GF(p)[xi]
is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.
But how do you tell the difference? Against too little caching there is a simple defense: keep a reference yourself. Against too much caching there is nothing you can do. You just run out of memory because data structures outside your control blow up.
We clearly have too much caching presently, in some cases by design. I think we first need a design that's fairly clean and for which we can reliably reason there's not "too much" caching (changing the order of memory requirement is definitely "too much" for me). Next step is tweaking it, possibly with MRU queues (which in that case should really be investigated for interfering with efficient GC, which is based on "objects either live very short or very long", so artificially creating objects with a medium lifespan is bad)
comment:18 followup: ↓ 19 Changed 4 years ago by
By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:
sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py sage: OA = A() sage: OB = B() sage: a = OA(1) sage: b = OB(1) sage: def test(x,y): ....: for _ in xrange(2*10^5): ....: z = x*y ....: sage: %time test(a,a) CPU times: user 4.49 s, sys: 0.00 s, total: 4.49 s Wall time: 4.49 s sage: %time test(a,b) CPU times: user 8.99 s, sys: 0.11 s, total: 9.11 s Wall time: 9.13 s sage: c = a*b sage: print c.parent() C sage: %time test(a,b) CPU times: user 8.85 s, sys: 0.03 s, total: 8.88 s Wall time: 8.89 s
In the first run of test(a,b), there is the possibility that parent(a*b)
becomes created repeatedly, as I have demonstrated above. In the second run, parent(a*b)
is kept alive, by keeping a pointer to the result of a*b
. There is no significant difference in performance. Of course, we also see that multiplication within one parent is faster.
Is there a way to get the patchbots report significant regressions in some examples? Because I am sure that some parts of Sage (schemes and combinat are typical candidates) rely on a strong coercion cache.
comment:19 in reply to: ↑ 18 Changed 4 years ago by
Replying to SimonKing:
By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:
Ah right, because C._coerce_from_hash[A]._codomain is C
, there's a cycle involving C so instead of getting deleted eagerly by Python's refcounting, it's only done with (relatively infrequent) GCs. So it even comes with a trick to improve this free "caching": Increase the interval for garbage collection. See gc.get_threshold
and gc.set_threshold
. By tweaking those numbers you can probably manipulate the running times in these examples.
NOTE: Doesn't timeit
turn garbage collection off? So that may be a misleading tool for investigating performance for these things.
comment:20 Changed 4 years ago by
I think it makes sense to ask sagedevel whether it is better to fix this leak or not.
I believe that keeping A and B alive should prevent C from garbage collection. In contrast, keeping C alive should not prevent A and B from garbage collection. Currently, it is the other way around. Perhaps we should try to think through whether having a weak reference to the domain of coercion or conversion maps would work.
comment:21 Changed 4 years ago by
+1 to taking this discussion to sagedevel.
Patchbot: apply 14058weakbinopmorphisms.patch trac_14058doctest.patch
comment:22 Changed 4 years ago by
The patchbot only reports one error, namely
File "/home/patchbot/sage5.7.beta3/devel/sage14058/sage/rings/number_field/number_field_ideal.py", line 118: sage: convert_to_idealprimedec_form(K, P) Exception raised: Traceback (most recent call last): File "/home/patchbot/sage5.7.beta3/local/bin/ncadoctest.py", line 1231, in run_one_test self.run_one_example(test, example, filename, compileflags) File "/home/patchbot/sage5.7.beta3/local/bin/sagedoctest.py", line 38, in run_one_example OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags) File "/home/patchbot/sage5.7.beta3/local/bin/ncadoctest.py", line 1172, in run_one_example compileflags, 1) in test.globs File "<doctest __main__.example_2[5]>", line 1, in <module> convert_to_idealprimedec_form(K, P)###line 118: sage: convert_to_idealprimedec_form(K, P) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/rings/number_field/number_field_ideal.py", line 124, in convert_to_idealprimedec_form K_bnf = gp(field.pari_bnf()) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/interface.py", line 201, in __call__ return self._coerce_from_special_method(x) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/interface.py", line 227, in _coerce_from_special_method return (x.__getattribute__(s))(self) File "sage_object.pyx", line 485, in sage.structure.sage_object.SageObject._gp_ (sage/structure/sage_object.c:4804) File "sage_object.pyx", line 450, in sage.structure.sage_object.SageObject._interface_ (sage/structure/sage_object.c:4144) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/interface.py", line 199, in __call__ return cls(self, x, name=name) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/expect.py", line 1280, in __init__ self._name = parent._create(value, name=name) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/interface.py", line 389, in _create self.set(name, value) File "/home/patchbot/sage5.7.beta3/local/lib/python/sitepackages/sage/interfaces/gp.py", line 511, in set raise TypeError, "Error executing code in GP:\nCODE:\n\t%s\nPARI/GP ERROR:\n%s"%(cmd, out) TypeError: Error executing code in GP: CODE: sage[14]=[[;], matrix(0,7), [;], Mat([0.E38 + 3.52184386028273*I, 0.E37 + 3.70366245659542*I, 0.E37 + 2.91167081258838*I, 0.E38 + 3.14159265358979*I, 0.E38 + 9.04452675407645*I, 0.E37 + 8.86270815776375*I, 0.E37 + 9.65469980177079*I]), [[7, [1, 2]~, 1, 1, [3, 2; 2, 1]], [13, [5, 2]~, 1, 1, [6, 2; 2, 8]], [19, [3, 2]~, 1, 1, [5, 2; 2, 3]], [3, [1, 2]~, 2, 1, [1, 1; 1, 2]], [7, [3, 2]~, 1, 1, [1, 2; 2, 3]], [13, [7, 2]~, 1, 1, [5, 2; 2, 7]], [19, [5, 2]~, 1, 1, [3, 2; 2, 5]]], 0, [y^2 + 3, [0, 1], 3, 2, [Mat([1, 0.500000000000000  0.866025403784439*I]), [1, 1.36602540378444; 1, 0.366025403784439], [1, 1; 1, 0], [2, 1; 1, 1], [3, 2; 0, 1], [1, 1; 1, 2], [3, [2, 1; 1, 1]]], [0.E38  1.73205080756888*I], [1, 1/2*y  1/2], [1, 1; 0, 2], [1, 0, 0, 1; 0, 1, 1, 1]], [[1, [], []], 1, 1, [6, 1/2*y + 1/2], []], [[;], [], []], [0, []]]; PARI/GP ERROR: *** at toplevel: sage[14]=[[;],matrix(0,7 *** ^ *** _[_]: not a vector. *** Warning: precision too low for generators, not given.
That's obscure to me. What does it mean?
comment:23 Changed 4 years ago by
Unfortunately, I get
sage t "devel/sagemain/sage/rings/number_field/number_field_ideal.py" [21.9 s]  All tests passed! Total time for all tests: 22.0 seconds
So, what happened here?
In any case, I think the example of collectable pushout should be included in the documentation, combined with a warning to keep a pointer to the pushout (or avoid crossparent arithmetic altogether), if speed matters.
I could write that. But to what location? Logically, it belongs to sage.structure.coerce, but nobody would read that. It might better be in a more general tutorial. What do you suggest?
comment:24 Changed 4 years ago by
I did make ptest with a highly patched debug version of sage5.7.beta2. Hence, I have the dependencies of #13864 and additionally #9107, #12951, #13916. #12313, #14058 and #13370 applied.
In the parallel test, I got two timeouts, namely in devel/sage/doc/en/bordeaux_2008/birds_other.rst and devel/sage/sage/interfaces/maxima.py. Running the former individually is long, but fine:
sage t force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst" [339.8 s]  All tests passed! Total time for all tests: 339.9 seconds
However, the maxima test is suspicious. I ran it twice individually. The first time I got an evil error:
sage t force_lib "devel/sage/sage/interfaces/maxima.py" ********************************************************************** File "/home/simon/SAGE/debug/sage5.7.beta2/devel/sage/sage/interfaces/maxima.py", line 355: sage: T.tlimit('n','inf') Exception raised: Traceback (most recent call last): File "/home/simon/SAGE/debug/sage5.7.beta2/local/bin/ncadoctest.py", line 1231, in run_one_test self.run_one_example(test, example, filename, compileflags) File "/home/simon/SAGE/debug/sage5.7.beta2/local/bin/sagedoctest.py", line 38, in run_one_example OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags) File "/home/simon/SAGE/debug/sage5.7.beta2/local/bin/ncadoctest.py", line 1172, in run_one_example compileflags, 1) in test.globs File "<doctest __main__.example_0[77]>", line 1, in <module> T.tlimit('n','inf')###line 355: sage: T.tlimit('n','inf') File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/interface.py", line 588, in __call__ return self._obj.parent().function_call(self._name, [self._obj] + list(args), kwds) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/interface.py", line 489, in function_call return self.new(s) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/interface.py", line 264, in new return self(code) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/interface.py", line 199, in __call__ return cls(self, x, name=name) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/maxima.py", line 1153, in __init__ ExpectElement.__init__(self, parent, value, is_name=False, name=None) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/expect.py", line 1280, in __init__ self._name = parent._create(value, name=name) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/interface.py", line 389, in _create self.set(name, value) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/maxima.py", line 998, in set self._eval_line(cmd) File "/home/simon/SAGE/debug/sage5.7.beta2/local/lib/python/sitepackages/sage/interfaces/maxima.py", line 754, in _eval_line assert line_echo.strip() == line.strip() AssertionError ********************************************************************** 1 items had failures: 1 of 97 in __main__.example_0 ***Test Failed*** 1 failures. For whitespace errors, see the file /home/simon/.sage//tmp/maxima_30383.py [13.7 s]  The following tests failed: sage t force_lib "devel/sage/sage/interfaces/maxima.py" Total time for all tests: 13.8 seconds
and the second time I got
sage t force_lib "devel/sage/sage/interfaces/maxima.py" [12.6 s]  All tests passed! Total time for all tests: 12.6 seconds
Is the error something we should worry about? Or is that likely to be the effect of a neutrino hitting my laptop?
comment:25 Changed 4 years ago by
comment:26 Changed 4 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:27 Changed 4 years ago by
Note to myself: Work on this ticket...
comment:28 Changed 4 years ago by
Lost track again.
I tried to import the patches and turn them into git branches. It mostly worked, but I had to resolve some conflicts with the second patch.
Would it be OK to change this hgticket into a gitticket?
comment:29 Changed 4 years ago by
 Reviewers set to Simon King
 Status changed from needs_review to needs_work
 Work issues set to Fix doctests; decide between hg and git
While I had to resolve conflicts when importing the two patches into the git master branch, the patches do apply to sage5.12.beta5, and the patchbot reports the same on sage5.13.beta2. So, perhaps it is OK to keep this a mercurial ticket.
However, with the gitification of the patches, I find (similar to the patchbot)
sage t long /mnt/storage2TB/patchbot/Sage/sage5.13.beta2/devel/sage/sage/rings/polynomial/plural.pyx # 17 doctests failed sage t long /mnt/storage2TB/patchbot/Sage/sage5.13.beta2/devel/sage/sage/libs/singular/function.pyx # 1 doctest failed
The patchbot additionally finds
sage t long /mnt/storage2TB/patchbot/Sage/sage5.13.beta2/devel/sage/sage/algebras/free_algebra.py # 1 doctest failed sage t long /mnt/storage2TB/patchbot/Sage/sage5.13.beta2/devel/sage/sage/structure/coerce.pyx # 1 doctest failed
So, this requires some work.
comment:30 Changed 3 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:31 Changed 3 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:32 Changed 3 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:33 Changed 23 months ago by
I plan to create a branch for it. Probably it will fix #18905.
comment:34 Changed 23 months ago by
 Branch set to u/SimonKing/weakly_reference_binary_operation_codomains
comment:35 Changed 23 months ago by
 Commit set to b9141eef1042c8b8fb1179b30844d5bdb737b96e
 Work issues changed from Fix doctests; decide between hg and git to Fix doctests
Here is a branch. The merge was not trivial, so, hopefully I did it well.
Next, we have to see if the "old problem" (i.e., attachment:testcoercion.py) is still problematic, and of course there may be doctest failures.
At least, the issue from comment:20 is solved: We do have weak references to the domain of coercion maps now.
New commits:
ac68d9b  Remove strong references to parents used in binary operations in the coercion model.

e153d08  #14058: Add doctest

b9141ee  Merge branch 'ticket/14058' into develop

comment:36 Changed 23 months ago by
sage t src/sage/algebras/commutative_dga.py # 3 doctests failed sage t src/sage/algebras/free_algebra.py # 1 doctest failed sage t src/sage/rings/polynomial/plural.pyx # 17 doctests failed sage t src/sage/dev/sagedev.py # 2 doctests failed sage t src/sage/structure/coerce.pyx # 2 doctests failed sage t src/sage/libs/singular/function.pyx # 2 doctests failed
Not bad (and the sagedev error very likely is because I have rerere enabled, hence, it is unrelated anyway).
comment:37 Changed 23 months ago by
Details:
sage t src/sage/libs/singular/function.pyx ********************************************************************** File "src/sage/libs/singular/function.pyx", line 413, in sage.libs.singular.function.is_singular_poly_wrapper Failed example: H.<x,y,z> = A.g_algebra({z*x:x*z+2*x, z*y:y*z2*y}) Expected nothing Got: Exception KeyError: (The ring pointer 0x7f7c7470c8d0,) in 'sage.libs.singular.ring.singular_ring_delete' ignored ********************************************************************** File "src/sage/libs/singular/function.pyx", line 1250, in sage.libs.singular.function.SingularFunction.__call__ Failed example: I = Ideal(I.groebner_basis()) Expected nothing Got: Exception KeyError: (The ring pointer 0x7f7c7470ca40,) in 'sage.libs.singular.ring.singular_ring_delete' ignored **********************************************************************
> It seems that the cache became too weak. All errors in sage t src/sage/rings/polynomial/plural.pyx
and sage t src/sage/algebras/commutative_dga.py
are of that form.
sage t src/sage/structure/coerce.pyx ********************************************************************** File "src/sage/structure/coerce.pyx", line 418, in sage.structure.coerce.CoercionModel_cache_maps.get_cache Failed example: print copy(left_morphism_ref()) Exception raised: Traceback (most recent call last): File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 496, in _run self.compile_and_execute(example, compiler, test.globs) File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 858, in compile_and_execute exec(compiled, globs) File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[4]>", line 1, in <module> print copy(left_morphism_ref()) NameError: name 'left_morphism_ref' is not defined ********************************************************************** File "src/sage/structure/coerce.pyx", line 422, in sage.structure.coerce.CoercionModel_cache_maps.get_cache Failed example: print right_morphism_ref Exception raised: Traceback (most recent call last): File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 496, in _run self.compile_and_execute(example, compiler, test.globs) File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 858, in compile_and_execute exec(compiled, globs) File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[5]>", line 1, in <module> print right_morphism_ref NameError: name 'right_morphism_ref' is not defined **********************************************************************
> I got something wrong in my merge.
So, we actually just have two errors.
comment:38 Changed 22 months ago by
 Commit changed from b9141eef1042c8b8fb1179b30844d5bdb737b96e to 90ed181ae9df152e99a652636a7e9dc29b466916
Branch pushed to git repo; I updated commit sha1. New commits:
90ed181  Trivial fix for a coercion doctest

comment:39 Changed 22 months ago by
comment:40 Changed 22 months ago by
In the following example
sage: import gc sage: from sage.rings.polynomial.plural import NCPolynomialRing_plural sage: from sage.algebras.free_algebra import FreeAlgebra sage: A1.<x,y,z> = FreeAlgebra(QQ, 3) sage: R1 = A1.g_algebra({y*x:x*yz, z*x:x*z+2*x, z*y:y*z2*y}, order=TermOrder('degrevlex', 2)) sage: A2.<x,y,z> = FreeAlgebra(GF(5), 3) sage: R2 = A2.g_algebra({y*x:x*yz, z*x:x*z+2*x, z*y:y*z2*y}, order=TermOrder('degrevlex', 2)) sage: A3.<x,y,z> = FreeAlgebra(GF(11), 3) sage: R3 = A3.g_algebra({y*x:x*yz, z*x:x*z+2*x, z*y:y*z2*y}, order=TermOrder('degrevlex', 2)) sage: A4.<x,y,z> = FreeAlgebra(GF(13), 3) sage: R4 = A4.g_algebra({y*x:x*yz, z*x:x*z+2*x, z*y:y*z2*y}, order=TermOrder('degrevlex', 2)) sage: _ = gc.collect() sage: foo = R1.gen(0) sage: del foo sage: del R1 sage: _ = gc.collect() sage: del R2 sage: _ = gc.collect() sage: del R3 sage: _ = gc.collect()
the reference counting goes wrong after each gc.collect()
.
comment:41 Changed 22 months ago by
 Commit changed from 90ed181ae9df152e99a652636a7e9dc29b466916 to 64b572ccf6403be0c617d0e39de4b1e6cd5dbc58
Branch pushed to git repo; I updated commit sha1. New commits:
64b572c  refcount libsingular rings used in plural

comment:42 Changed 22 months ago by
Got it! When a noncommutative ring is created, it was forgotten to put the underlying libsingular ring into Sage's refcounting system. Should be fixed now. I'll see if #13447 is fixed now, too.
comment:43 Changed 22 months ago by
 Status changed from needs_work to needs_review
 Work issues Fix doctests deleted
Sigh. #13447 is not fixed, as the following example shows.
sage: import gc sage: _ = gc.collect() sage: from sage.rings.polynomial.multi_polynomial_libsingular import MPolynomialRing_libsingular sage: from sage.libs.singular.groebner_strategy import GroebnerStrategy sage: n = len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)]) sage: P = MPolynomialRing_libsingular(GF(541), 2, ('x', 'y'), TermOrder('degrevlex', 2)) sage: strat = GroebnerStrategy(Ideal([P.gen(0) + P.gen(1)])) sage: del strat sage: del P sage: _ = gc.collect() sage: n == len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)]) False
I'd suggest to proceed as follows. Provided that all tests pass, I can review Robert's and Nils' patches, someone else reviews my additions (which are the merging and the latest commit), and then we make this a dependency for #13447.
comment:44 Changed 22 months ago by
 Commit changed from 64b572ccf6403be0c617d0e39de4b1e6cd5dbc58 to 9793dbc9bc64bfdce43ff1e626453816c302696d
Branch pushed to git repo; I updated commit sha1. New commits:
9793dbc  Make one test more stable

comment:45 followup: ↓ 47 Changed 22 months ago by
Sometimes, the following test in sage/categories/fields.py has failed:
sage: import gc sage: _ = gc.collect() sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) sage: for i in prime_range(100): ....: R = ZZ.quotient(i) ....: t = R in Fields() sage: _ = gc.collect() sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])  n 1
The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected). So, I hope my change is OK.
comment:46 followup: ↓ 56 Changed 22 months ago by
I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.
comment:47 in reply to: ↑ 45 ; followup: ↓ 48 Changed 22 months ago by
Replying to SimonKing:
Sometimes, the following test in sage/categories/fields.py has failed:
sage: import gc sage: _ = gc.collect() sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) sage: for i in prime_range(100): ....: R = ZZ.quotient(i) ....: t = R in Fields() sage: _ = gc.collect() sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])  n 1The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected).
Shouldn't the result always be 1 given that R
refers to Zmod(97)
after the loop (unless there is a surviving reference from a Zmod(97)
created in some earlier test)? What happens if you do del R
before the last gc.collect()
?
comment:48 in reply to: ↑ 47 Changed 22 months ago by
Replying to pbruin:
Shouldn't the result always be 1 given that
R
refers toZmod(97)
after the loop (unless there is a surviving reference from aZmod(97)
created in some earlier test)?
Exactly. And what's strange: It isn't reproducible. Sometimes I get 1, sometimes I get 0. And I always get 1 when I run it on the command line.
What happens if you do
del R
before the lastgc.collect()
?
No idea.
comment:49 Changed 22 months ago by
To be precise: I saw this problem precisely once. When I run the previous test about 20 times, it always passes.
comment:50 Changed 22 months ago by
 Commit changed from 9793dbc9bc64bfdce43ff1e626453816c302696d to 2c04029f9a194bd87befa0cd99016e2f267a836f
Branch pushed to git repo; I updated commit sha1. New commits:
2c04029  make a test against a memory leak clearer/more stable

comment:51 Changed 22 months ago by
Hopefully the new version of the test is reliable.
comment:52 followup: ↓ 53 Changed 22 months ago by
 Status changed from needs_review to needs_work
excerpt from patchbot report:
+inside file: b/src/sage/structure/coerce.pyx +@@ 1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel): ++ raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map" ++ raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map" +Oldstyle raise statement inserted on 2 nonempty lines
comment:53 in reply to: ↑ 52 Changed 22 months ago by
Replying to chapoton:
excerpt from patchbot report:
+inside file: b/src/sage/structure/coerce.pyx +@@ 1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel): ++ raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map" ++ raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map" +Oldstyle raise statement inserted on 2 nonempty lines
Actually this was not inserted, but moved. Would it be possible for you to change this to newstyle raise statement by a reviewer commit?
comment:54 Changed 22 months ago by
 Branch changed from u/SimonKing/weakly_reference_binary_operation_codomains to public/ticket/14058
 Commit changed from 2c04029f9a194bd87befa0cd99016e2f267a836f to 8713960717f916d459e0b6a0dcef391dc992d9b2
comment:55 Changed 22 months ago by
 Status changed from needs_work to needs_review
comment:56 in reply to: ↑ 46 Changed 22 months ago by
Replying to SimonKing:
I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.
Could someone please finish the review? The patchbot thinks the branch is fine.
comment:57 Changed 22 months ago by
 Description modified (diff)
 Reviewers changed from Simon King to Simon King, Frédéric Chapoton, JeanPierre Flori
 Status changed from needs_review to positive_review
Your change is small and makes sense.
comment:58 followup: ↓ 59 Changed 22 months ago by
 Status changed from positive_review to needs_work
From the patchbot
sage t long src/sage/structure/coerce.pyx ********************************************************************** File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps Failed example: print N2N0 Expected: 0 Got: 1
Is it what we should get?
comment:59 in reply to: ↑ 58 ; followup: ↓ 60 Changed 22 months ago by
 Status changed from needs_work to needs_review
Replying to vdelecroix:
From the patchbot ...
wrong ticket... should be #18905. Sorry for the noise.
comment:60 in reply to: ↑ 59 Changed 22 months ago by
Replying to vdelecroix:
Replying to vdelecroix:
From the patchbot ...
wrong ticket... should be #18905. Sorry for the noise.
The patchbot HERE does report this problem. So, what do you mean that it "should be #18905"?
comment:61 Changed 22 months ago by
 Status changed from needs_review to needs_work
Right! I confused myself.
comment:62 Changed 22 months ago by
The failing doctest is reproducible after applying this branch onto 6.9.beta1
.
comment:63 Changed 22 months ago by
On sage6.9.beta1, I get:
$ sage nbruin.sage Time: CPU 7.95 s, Wall: 7.95 s Time: CPU 7.68 s, Wall: 7.68 s Time: CPU 76.76 s, Wall: 76.94 s
On sage6.9.beta1 + #14058, I get a very good improvement:
$ sage nbruin.sage Time: CPU 2.48 s, Wall: 2.48 s Time: CPU 2.07 s, Wall: 2.07 s Time: CPU 32.38 s, Wall: 32.42 s
Note: I originally thought it was a regression in sage since I was getting this earlier this morning with sage6.8 + #14058. But thanksfully, it seems to be a improvement obtained from this ticket.
The file nbruin.sage
is from comment 4:
$ cat nbruin.sage a=ZZ(1) b=QQ(1) c=ZZ['x'](1) d=b+c def test(x,y): for i in xrange(10^6): _=x+y time test(a,b) time test(a,c) time test(b,c)
comment:64 Changed 22 months ago by
I also get improvements on tests above of Simon King at comments 18.
With sage6.9.beta1:
$ sage testcoercion.sage Time: CPU 1.69 s, Wall: 1.69 s Time: CPU 4.05 s, Wall: 4.05 s C Time: CPU 4.03 s, Wall: 4.03 s
With sage6.9.beta1 + this ticket:
$ sage testcoercion.sage Time: CPU 0.34 s, Wall: 0.34 s Time: CPU 0.85 s, Wall: 0.85 s C Time: CPU 0.86 s, Wall: 0.86 s
(My file testcoercion.sage contains testcoercion.py + the lines at comment 18.)
comment:65 Changed 22 months ago by
With sage6.9.beta1 + this ticket, I get All tests passed
when I do
./sage t long src/sage/structure/coerce.pyx
but I get the problem when I test after the build
./sage bt long src/sage/structure/coerce.pyx
which gives the error:
********************************************************************** File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps Failed example: print N2N0 Expected: 0 Got: 1 **********************************************************************
I have never seen such a behavior.
comment:66 Changed 22 months ago by
 Commit changed from 8713960717f916d459e0b6a0dcef391dc992d9b2 to b73d5372cd889d8b5e72e0ac56a781ba4dde9761
comment:67 Changed 22 months ago by
 Commit changed from b73d5372cd889d8b5e72e0ac56a781ba4dde9761 to 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
e153d08  #14058: Add doctest

b9141ee  Merge branch 'ticket/14058' into develop

90ed181  Trivial fix for a coercion doctest

64b572c  refcount libsingular rings used in plural

9793dbc  Make one test more stable

2c04029  make a test against a memory leak clearer/more stable

e976b79  Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0

8713960  trac #14058 convert some raise statements into py3 syntax

c136ebd  Merge #14058 into 6.9.beta1

13cb2a7  Trac #14058: fix broken doctest

comment:68 followup: ↓ 69 Changed 22 months ago by
In other doctests in the same file, GF(3)
, GF(7)
and GF(41)
are also created. While there are references to GF(7)
and GF(41)
, no variable references to GF(3)
which is available for gargage collection. If GF(3)
gets garbage collected before the offending doctest, then everything is fine, the doctest pass. But it can also be garbage collected in the call to gc.collect()
in the doctest. If this is the case, then N0
was also counting GF(3)
which explains why N2N0
can be negative.
One way to fix this is to call gc.collect()
before creating N0
.
(my first fix was using sets of ids... sorry for the noise).
comment:69 in reply to: ↑ 68 Changed 22 months ago by
Replying to slabbe:
One way to fix this is to call
gc.collect()
before creatingN0
.
I agree, that's the way to fix it.
comment:70 Changed 22 months ago by
 Status changed from needs_work to needs_review
comment:71 Changed 22 months ago by
 Status changed from needs_review to positive_review
I assume Simon's comment gives positive review here as this was the only thing to fix.
comment:72 Changed 22 months ago by
 Reviewers changed from Simon King, Frédéric Chapoton, JeanPierre Flori to Simon King, Frédéric Chapoton, JeanPierre Flori, Sébastien Labbé
comment:73 Changed 22 months ago by
 Milestone changed from sage6.4 to sage6.9
comment:74 Changed 21 months ago by
 Dependencies #12313 deleted
comment:75 Changed 21 months ago by
 Branch changed from public/ticket/14058 to 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d
 Resolution set to fixed
 Status changed from positive_review to closed
comment:76 Changed 20 months ago by
 Commit 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d deleted
On arando:
sage t long src/sage/categories/fields.py ********************************************************************** File "src/sage/categories/fields.py", line 104, in sage.categories.fields.Fields.__contains__ Failed example: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])  n Expected: 0 Got: 1 **********************************************************************
comment:77 Changed 20 months ago by
See #19244
Is it possible to add an example that shows that parents become collectable that previously weren't?