Opened 4 years ago

Closed 21 months ago

Last modified 20 months ago

#14058 closed enhancement (fixed)

Weakly reference binary operation codomains

Reported by: robertwb Owned by: rlm
Priority: major Milestone: sage-6.9
Component: memleak Keywords:
Cc: nbruin, SimonKing, jpflori Merged in:
Authors: Robert Bradshaw, Nils Bruin Reviewers: Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé
Report Upstream: N/A Work issues:
Branch: 13cb2a7 (Commits) Commit:
Dependencies: Stopgaps:

Description (last modified by jpflori)

R(1) + S(1) caches a strong reference to both R and S, which we'd like to avoid.

See discussion at https://groups.google.com/forum/?fromgroups=#!topic/sage-devel/ozhIHIwQ4WA

Attachments (4)

14058-weak-binop.patch (4.8 KB) - added by robertwb 4 years ago.
trac_14058-doctest.patch (1.2 KB) - added by nbruin 4 years ago.
Doctest proposal
14058-weak-binop-morphisms.patch (5.2 KB) - added by robertwb 4 years ago.
testcoercion.py (1.5 KB) - added by SimonKing 4 years ago.
An example that shows that the current patch needs work

Download all attachments as: .zip

Change History (81)

Changed 4 years ago by robertwb

comment:1 Changed 4 years ago by robertwb

  • Cc nbruin SimonKing jpflori added
  • Status changed from new to needs_review

comment:2 Changed 4 years ago by SimonKing

Is it possible to add an example that shows that parents become collectable that previously weren't?

Changed 4 years ago by nbruin

Doctest proposal

comment:3 Changed 4 years ago by nbruin

I like the idea. Can we get some data on speed regressions due to this? In principle the double lookup might be noticeable. I don't mean speed regressions due to parents getting deleted -- those are good to know but need other fixes. I mean when the parents are still around.

Attached doctests may need #12313 to work, in which case this ticket should depend on that (because then that's necessary to get benefit from this ticket).

comment:4 Changed 4 years ago by nbruin

Sadly, the performance hit is quite noticeable.

a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

Within a run, timing test seems fairly consistent. However, between different compiles of the same code I'm observing fluctuations of up to 0.10s in timings. Lucky/unlucky memory layout? Anyway, with patch:

sage: %time test(a,b)
CPU times: user 2.15 s, sys: 0.00 s, total: 2.15 s
Wall time: 2.16 s
sage: %time test(a,c)
CPU times: user 2.25 s, sys: 0.00 s, total: 2.25 s
Wall time: 2.26 s
sage: %time test(b,c)
CPU times: user 16.83 s, sys: 0.00 s, total: 16.83 s

without patch

sage: %time test(a,b)
CPU times: user 1.54 s, sys: 0.00 s, total: 1.54 s
Wall time: 1.55 s
sage: %time test(a,c)
CPU times: user 1.64 s, sys: 0.00 s, total: 1.64 s
Wall time: 1.64 s
CPU times: user 15.68 s, sys: 0.00 s, total: 15.68 s
Wall time: 15.76 s

For test(a,a), test(b,b), test(c,c) there doesn't seem to be a difference (luckily!)

The penalties seem to be about 0.6s, 0.6s, 1.2s, which is rather consistent with 1,1,2 extra lookups.

This indicates to me it may be worth trying storing a weakref to the morphism instead, since that can probably be dereferenced faster than a coercion lookup on the codomain. Logically this should be equivalent. We're just storing a weakref to the object we're interested in rather than instructions where to go and lookup the object we want.

Caveat:

For stored coercions of the type (R,S, ...): (None,<map S->R>) or (R,S, ...): (<map R->S>,None) the codomain is part of the key, so deletion of the codomain (which is a prerequisite for the morphism being deleted and hence the weakref dying) implies removal of the weak key entry in TripleDict.

However, for stored coercions of the type (R,S,...): (<map R->C>,<map S->C>) we could have that C and the weakrefs to the morphisms die, but that would not trigger the removal from the TripleDict. So we could still encounter dead weakrefs (but much less than one would perhaps initially think!) It's not really much different from how 14058-weak-binop.patch handles dead weakrefs.

Major problem for immediately applying this idea:

sage: M=QQ.coerce_map_from(ZZ)
sage: import weakref
sage: weakref.ref(M)
TypeError: cannot create weak reference to 'sage.rings.rational.Z_to_Q' object

comment:5 Changed 4 years ago by robertwb

Yeah, I had exactly this same idea. I'll post an updated patch.

Changed 4 years ago by robertwb

comment:6 Changed 4 years ago by robertwb

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

comment:7 Changed 4 years ago by nbruin

Great! With this new patch I cannot distinguish the timing differences from the noise one normally gets already, so I'd think this is quite acceptable.

comment:8 Changed 4 years ago by nbruin

  • Dependencies set to #12313

patchbot seems unhappy about the doctest.Probably #12313 is indeed necessary to make parents reclaimable.

comment:9 Changed 4 years ago by SimonKing

  • Description modified (diff)

Stating in the ticket description what patches are to apply. Admittedly I didn't test yet whether the doctest patch applies after Robert's new patch. But the test patch is needed, I think.

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch trac_14058-doctest.patch

comment:10 Changed 4 years ago by SimonKing

  • Description modified (diff)

comment:11 Changed 4 years ago by SimonKing

The patches apply, and at least the new test passes. So, this together with #12313, does fix another leak.

comment:12 Changed 4 years ago by SimonKing

  • Description modified (diff)

comment:13 follow-up: Changed 4 years ago by SimonKing

In vanilla Sage, the coercion model stores (coercion) morphisms in its cache (which was a TripleDict, hence, with weak keys), and the only change introduced by your patch is to store weak references to the (coercion) morphism instead. Did I understand this correctly?

In vanilla Sage, the coercion model kept a strong reference to the morphism, which kept a strong reference to domain and codomain, which were thus not reclaimable, and so the item of the TripleDict has eternal life, in spite of the weak keys. Correct?

With the patch, the coercion model keeps a weak reference to the coercion morphism, hence, the strong reference from the morphism to domain and codomain doesn't matter, and thus the item of the TripleDict may disappear.

But how is the morphism kept alive? The coercion model only has a weak reference to it. Do I understand correctly that the morphism involved in the binary operation is, at the same time, stored in the coerce dict of its codomain? That's why it does not immediately die?

In other words, do I understand that the layout is as follows? We have parents A and B, and want to apply some binary operator, with a result in the parent C (C may coincide with either A or B). And we have maps r_A and r_B from A or B to C.

Both r_A and r_B are stored with a strong reference in a cache located in C, with weak keys A and B. At the same time, they are stored with a weak reference in the coercion model, again with weak keys A and B. r_A has strong references to C and to A, r_B has strong references to C and to B.

What do we want? Do we want that keeping C alive makes A and B survive? Or do we want that keeping both A and B alive makes C survive?

If the user has a strong reference to C, then C has a strong reference to r_A and r_B (in its coerce cache), and r_A/r_B strongly refers to A/B. Hence, the existence of C keeps A and B alive. Since C is a complicated object, it is more likely to be mortal, hence, probably it is not much of a problem.

If the user has a strong reference to both A and B, then C keeps a strong reference to both r_A and r_B, and the coercion model keeps a weak reference to r_A and r_B. Moreover, r_A and r_B strongly refer to C.

However, isn't it just a reference cycle between C and r_A/r_B? It would not prevent C from becoming collectable, right?

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

Or did I misunderstand something?

Last edited 4 years ago by SimonKing (previous) (diff)

Changed 4 years ago by SimonKing

An example that shows that the current patch needs work

comment:14 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by nbruin

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there. Adding elements with the same parents is orders of magnitude faster than relying on coercion. A step closer is ensure that your parent remains in memory. I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably

In this scenario:

P=[ZZ['x%d'%i] for i in [1..1000]]
F=[GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a=P.0
        b=F(1)
        c=a+b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

I don't think that we can afford to assume that just because a user has combined two parents he/she will do so again in the future. We may be able to afford doing so if it happened recently, but that really is an optimisation, and currently I don't see a hook where we can do this efficiently.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

The problem with artificially keeping transient elements alive for longer is that you'll defeat the heuristics for generational garbage collection (which Python does), and we depend rather heavily on that to get rid of cyclic garbage. Testing would be required to see if this is indeed a problem, and it will be very application-dependent.

comment:15 Changed 4 years ago by SimonKing

Before posting a reply to Nils' post, here is an example that the existence of A and B does not ensure the survival of the pushout of A and B. First, attach testcoercion.py into a Sage session. Then:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: cm = sage.structure.element.get_coercion_model()
sage: OC = cm.explain(OA,OB)
Coercion on left operand via
    Conversion map:
      From: A
      To:   C
Coercion on right operand via
    Conversion map:
      From: B
      To:   C
Arithmetic performed after coercions.
Result lives in C
sage: OC.is_coercion_cached(OA)
True
sage: OC.is_coercion_cached(OB)
True
sage: del OC
sage: import gc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0
sage: xc = OA(1)*OB(1)
sage: isinstance(xc.parent(),C)
True
sage: del xc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0

I do believe that this is not desired behaviour.

Last edited 4 years ago by SimonKing (previous) (diff)

comment:16 in reply to: ↑ 14 ; follow-up: Changed 4 years ago by SimonKing

Replying to nbruin:

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there.

Granted. But it is a very typical usage to not explicitly convert everything into a common parent, but let the coercion model do the work. That is what the coercion model is for!

I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably

However, in the example that I posted above, note that multiplying OA(1)*OB(1) repeatedly will also create C() repeatedly, even though it is a UniqueRepresentation and hence (weakly, nowadays) cached!

I think it is not acceptable that a multiple creation is triggered.

In this scenario:

P=[ZZ['x%d'%i] for i in [1..1000]]
F=[GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a=P.0
        b=F(1)
        c=a+b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

OK, but not all unique parent structures rely on it.

comment:17 in reply to: ↑ 16 Changed 4 years ago by nbruin

Replying to SimonKing:

I think it is not acceptable that a multiple creation is triggered.

... Why not? Should the user expect that doing something a second time is faster than the first, without taking special precautions?

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

But how do you tell the difference? Against too little caching there is a simple defense: keep a reference yourself. Against too much caching there is nothing you can do. You just run out of memory because data structures outside your control blow up.

We clearly have too much caching presently, in some cases by design. I think we first need a design that's fairly clean and for which we can reliably reason there's not "too much" caching (changing the order of memory requirement is definitely "too much" for me). Next step is tweaking it, possibly with MRU queues (which in that case should really be investigated for interfering with efficient GC, which is based on "objects either live very short or very long", so artificially creating objects with a medium lifespan is bad)

comment:18 follow-up: Changed 4 years ago by SimonKing

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: a = OA(1)
sage: b = OB(1)
sage: def test(x,y):
....:     for _ in xrange(2*10^5):
....:         z = x*y
....:         
sage: %time test(a,a)
CPU times: user 4.49 s, sys: 0.00 s, total: 4.49 s
Wall time: 4.49 s
sage: %time test(a,b)
CPU times: user 8.99 s, sys: 0.11 s, total: 9.11 s
Wall time: 9.13 s
sage: c = a*b
sage: print c.parent()
C
sage: %time test(a,b)
CPU times: user 8.85 s, sys: 0.03 s, total: 8.88 s
Wall time: 8.89 s

In the first run of test(a,b), there is the possibility that parent(a*b) becomes created repeatedly, as I have demonstrated above. In the second run, parent(a*b) is kept alive, by keeping a pointer to the result of a*b. There is no significant difference in performance. Of course, we also see that multiplication within one parent is faster.

Is there a way to get the patchbots report significant regressions in some examples? Because I am sure that some parts of Sage (schemes and combinat are typical candidates) rely on a strong coercion cache.

comment:19 in reply to: ↑ 18 Changed 4 years ago by nbruin

Replying to SimonKing:

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:

Ah right, because C._coerce_from_hash[A]._codomain is C, there's a cycle involving C so instead of getting deleted eagerly by Python's refcounting, it's only done with (relatively infrequent) GCs. So it even comes with a trick to improve this free "caching": Increase the interval for garbage collection. See gc.get_threshold and gc.set_threshold. By tweaking those numbers you can probably manipulate the running times in these examples.

NOTE: Doesn't timeit turn garbage collection off? So that may be a misleading tool for investigating performance for these things.

comment:20 Changed 4 years ago by SimonKing

I think it makes sense to ask sage-devel whether it is better to fix this leak or not.

I believe that keeping A and B alive should prevent C from garbage collection. In contrast, keeping C alive should not prevent A and B from garbage collection. Currently, it is the other way around. Perhaps we should try to think through whether having a weak reference to the domain of coercion or conversion maps would work.

comment:21 Changed 4 years ago by robertwb

+1 to taking this discussion to sage-devel.

Patchbot: apply 14058-weak-binop-morphisms.patch trac_14058-doctest.patch

comment:22 Changed 4 years ago by SimonKing

The patchbot only reports one error, namely

File "/home/patchbot/sage-5.7.beta3/devel/sage-14058/sage/rings/number_field/number_field_ideal.py", line 118:
    sage: convert_to_idealprimedec_form(K, P)
Exception raised:
    Traceback (most recent call last):
      File "/home/patchbot/sage-5.7.beta3/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/patchbot/sage-5.7.beta3/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/patchbot/sage-5.7.beta3/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_2[5]>", line 1, in <module>
        convert_to_idealprimedec_form(K, P)###line 118:
    sage: convert_to_idealprimedec_form(K, P)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/rings/number_field/number_field_ideal.py", line 124, in convert_to_idealprimedec_form
        K_bnf = gp(field.pari_bnf())
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 201, in __call__
        return self._coerce_from_special_method(x)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 227, in _coerce_from_special_method
        return (x.__getattribute__(s))(self)
      File "sage_object.pyx", line 485, in sage.structure.sage_object.SageObject._gp_ (sage/structure/sage_object.c:4804)
      File "sage_object.pyx", line 450, in sage.structure.sage_object.SageObject._interface_ (sage/structure/sage_object.c:4144)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 199, in __call__
        return cls(self, x, name=name)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/expect.py", line 1280, in __init__
        self._name = parent._create(value, name=name)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 389, in _create
        self.set(name, value)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/gp.py", line 511, in set
        raise TypeError, "Error executing code in GP:\nCODE:\n\t%s\nPARI/GP ERROR:\n%s"%(cmd, out)
    TypeError: Error executing code in GP:
    CODE:
    	sage[14]=[[;], matrix(0,7), [;], Mat([0.E-38 + 3.52184386028273*I, 0.E-37 + 3.70366245659542*I, 0.E-37 + 2.91167081258838*I, 0.E-38 + 3.14159265358979*I, 0.E-38 + 9.04452675407645*I, 0.E-37 + 8.86270815776375*I, 0.E-37 + 9.65469980177079*I]), [[7, [-1, 2]~, 1, 1, [3, -2; 2, 1]], [13, [-5, 2]~, 1, 1, [-6, -2; 2, -8]], [19, [-3, 2]~, 1, 1, [5, -2; 2, 3]], [3, [1, 2]~, 2, 1, [1, 1; -1, 2]], [7, [3, 2]~, 1, 1, [-1, -2; 2, -3]], [13, [7, 2]~, 1, 1, [-5, -2; 2, -7]], [19, [5, 2]~, 1, 1, [-3, -2; 2, -5]]], 0, [y^2 + 3, [0, 1], -3, 2, [Mat([1, -0.500000000000000 - 0.866025403784439*I]), [1, -1.36602540378444; 1, 0.366025403784439], [1, -1; 1, 0], [2, -1; -1, -1], [3, 2; 0, 1], [1, -1; -1, -2], [3, [2, -1; 1, 1]]], [0.E-38 - 1.73205080756888*I], [1, 1/2*y - 1/2], [1, 1; 0, 2], [1, 0, 0, -1; 0, 1, 1, -1]], [[1, [], []], 1, 1, [6, -1/2*y + 1/2], []], [[;], [], []], [0, []]];
    PARI/GP ERROR:
      ***   at top-level: sage[14]=[[;],matrix(0,7
      ***                     ^--------------------
      ***   _[_]: not a vector.
  ***   Warning: precision too low for generators, not given.

That's obscure to me. What does it mean?

comment:23 Changed 4 years ago by SimonKing

Unfortunately, I get

sage -t  "devel/sage-main/sage/rings/number_field/number_field_ideal.py"
         [21.9 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 22.0 seconds

So, what happened here?

In any case, I think the example of collectable pushout should be included in the documentation, combined with a warning to keep a pointer to the pushout (or avoid cross-parent arithmetic altogether), if speed matters.

I could write that. But to what location? Logically, it belongs to sage.structure.coerce, but nobody would read that. It might better be in a more general tutorial. What do you suggest?

comment:24 Changed 4 years ago by SimonKing

I did make ptest with a highly patched debug version of sage-5.7.beta2. Hence, I have the dependencies of #13864 and additionally #9107, #12951, #13916. #12313, #14058 and #13370 applied.

In the parallel test, I got two timeouts, namely in devel/sage/doc/en/bordeaux_2008/birds_other.rst and devel/sage/sage/interfaces/maxima.py. Running the former individually is long, but fine:

sage -t -force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst"
         [339.8 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 339.9 seconds

However, the maxima test is suspicious. I ran it twice individually. The first time I got an evil error:

sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"   
**********************************************************************
File "/home/simon/SAGE/debug/sage-5.7.beta2/devel/sage/sage/interfaces/maxima.py", line 355:
    sage: T.tlimit('n','inf')
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[77]>", line 1, in <module>
        T.tlimit('n','inf')###line 355:
    sage: T.tlimit('n','inf')
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 588, in __call__
        return self._obj.parent().function_call(self._name, [self._obj] + list(args), kwds)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 489, in function_call
        return self.new(s)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 264, in new
        return self(code)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 199, in __call__
        return cls(self, x, name=name)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 1153, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/expect.py", line 1280, in __init__
        self._name = parent._create(value, name=name)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 389, in _create
        self.set(name, value)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 998, in set
        self._eval_line(cmd)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 754, in _eval_line
        assert line_echo.strip() == line.strip()
    AssertionError
**********************************************************************
1 items had failures:
   1 of  97 in __main__.example_0
***Test Failed*** 1 failures.
For whitespace errors, see the file /home/simon/.sage//tmp/maxima_30383.py
         [13.7 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"
Total time for all tests: 13.8 seconds

and the second time I got

sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"   
         [12.6 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 12.6 seconds

Is the error something we should worry about? Or is that likely to be the effect of a neutrino hitting my laptop?

comment:25 Changed 4 years ago by SimonKing

At #14159, I am providing a version of MonoDict and TripleDict that optionally allows to store the values by weak references. Provided that the performance is OK, Nils Bruin suggests to consider to use this here, hence, make #14159 a dependency. What do you think?

Last edited 4 years ago by SimonKing (previous) (diff)

comment:26 Changed 4 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:27 Changed 4 years ago by SimonKing

Note to myself: Work on this ticket...

comment:28 Changed 4 years ago by SimonKing

Lost track again.

I tried to import the patches and turn them into git branches. It mostly worked, but I had to resolve some conflicts with the second patch.

Would it be OK to change this hg-ticket into a git-ticket?

comment:29 Changed 4 years ago by SimonKing

  • Reviewers set to Simon King
  • Status changed from needs_review to needs_work
  • Work issues set to Fix doctests; decide between hg and git

While I had to resolve conflicts when importing the two patches into the git master branch, the patches do apply to sage-5.12.beta5, and the patchbot reports the same on sage-5.13.beta2. So, perhaps it is OK to keep this a mercurial ticket.

However, with the gitification of the patches, I find (similar to the patchbot)

sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/rings/polynomial/plural.pyx  # 17 doctests failed
sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/libs/singular/function.pyx  # 1 doctest failed

The patchbot additionally finds

sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/algebras/free_algebra.py  # 1 doctest failed
sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/structure/coerce.pyx  # 1 doctest failed

So, this requires some work.

comment:30 Changed 3 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:31 Changed 3 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:32 Changed 3 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:33 Changed 23 months ago by SimonKing

I plan to create a branch for it. Probably it will fix #18905.

comment:34 Changed 23 months ago by SimonKing

  • Branch set to u/SimonKing/weakly_reference_binary_operation_codomains

comment:35 Changed 23 months ago by SimonKing

  • Commit set to b9141eef1042c8b8fb1179b30844d5bdb737b96e
  • Work issues changed from Fix doctests; decide between hg and git to Fix doctests

Here is a branch. The merge was not trivial, so, hopefully I did it well.

Next, we have to see if the "old problem" (i.e., attachment:testcoercion.py) is still problematic, and of course there may be doctest failures.

At least, the issue from comment:20 is solved: We do have weak references to the domain of coercion maps now.


New commits:

ac68d9bRemove strong references to parents used in binary operations in the coercion model.
e153d08#14058: Add doctest
b9141eeMerge branch 'ticket/14058' into develop

comment:36 Changed 23 months ago by SimonKing

sage -t src/sage/algebras/commutative_dga.py  # 3 doctests failed
sage -t src/sage/algebras/free_algebra.py  # 1 doctest failed
sage -t src/sage/rings/polynomial/plural.pyx  # 17 doctests failed
sage -t src/sage/dev/sagedev.py  # 2 doctests failed
sage -t src/sage/structure/coerce.pyx  # 2 doctests failed
sage -t src/sage/libs/singular/function.pyx  # 2 doctests failed

Not bad (and the sagedev error very likely is because I have rerere enabled, hence, it is unrelated anyway).

Last edited 23 months ago by SimonKing (previous) (diff)

comment:37 Changed 23 months ago by SimonKing

Details:

sage -t src/sage/libs/singular/function.pyx
**********************************************************************
File "src/sage/libs/singular/function.pyx", line 413, in sage.libs.singular.function.is_singular_poly_wrapper
Failed example:
    H.<x,y,z> = A.g_algebra({z*x:x*z+2*x, z*y:y*z-2*y})
Expected nothing
Got:
    Exception KeyError: (The ring pointer 0x7f7c7470c8d0,) in 'sage.libs.singular.ring.singular_ring_delete' ignored
**********************************************************************
File "src/sage/libs/singular/function.pyx", line 1250, in sage.libs.singular.function.SingularFunction.__call__
Failed example:
    I = Ideal(I.groebner_basis())
Expected nothing
Got:
    Exception KeyError: (The ring pointer 0x7f7c7470ca40,) in 'sage.libs.singular.ring.singular_ring_delete' ignored
**********************************************************************

--> It seems that the cache became too weak. All errors in sage -t src/sage/rings/polynomial/plural.pyx and sage -t src/sage/algebras/commutative_dga.py are of that form.

sage -t src/sage/structure/coerce.pyx
**********************************************************************
File "src/sage/structure/coerce.pyx", line 418, in sage.structure.coerce.CoercionModel_cache_maps.get_cache
Failed example:
    print copy(left_morphism_ref())
Exception raised:
    Traceback (most recent call last):
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[4]>", line 1, in <module>
        print copy(left_morphism_ref())
    NameError: name 'left_morphism_ref' is not defined
**********************************************************************
File "src/sage/structure/coerce.pyx", line 422, in sage.structure.coerce.CoercionModel_cache_maps.get_cache
Failed example:
    print right_morphism_ref
Exception raised:
    Traceback (most recent call last):
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[5]>", line 1, in <module>
        print right_morphism_ref
    NameError: name 'right_morphism_ref' is not defined
**********************************************************************

--> I got something wrong in my merge.

So, we actually just have two errors.

Last edited 23 months ago by SimonKing (previous) (diff)

comment:38 Changed 22 months ago by git

  • Commit changed from b9141eef1042c8b8fb1179b30844d5bdb737b96e to 90ed181ae9df152e99a652636a7e9dc29b466916

Branch pushed to git repo; I updated commit sha1. New commits:

90ed181Trivial fix for a coercion doctest

comment:39 Changed 22 months ago by SimonKing

#13447 is probably related. Perhaps (if we can finish the work here) #13447 can eventually be closed as a duplicate.

comment:40 Changed 22 months ago by SimonKing

In the following example

            sage: import gc
            sage: from sage.rings.polynomial.plural import NCPolynomialRing_plural
            sage: from sage.algebras.free_algebra import FreeAlgebra
            sage: A1.<x,y,z> = FreeAlgebra(QQ, 3)
            sage: R1 = A1.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A2.<x,y,z> = FreeAlgebra(GF(5), 3)
            sage: R2 = A2.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A3.<x,y,z> = FreeAlgebra(GF(11), 3)
            sage: R3 = A3.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A4.<x,y,z> = FreeAlgebra(GF(13), 3)
            sage: R4 = A4.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: _ = gc.collect()
            sage: foo = R1.gen(0)
            sage: del foo
            sage: del R1
            sage: _ = gc.collect()
            sage: del R2
            sage: _ = gc.collect()
            sage: del R3
            sage: _ = gc.collect()

the reference counting goes wrong after each gc.collect().

comment:41 Changed 22 months ago by git

  • Commit changed from 90ed181ae9df152e99a652636a7e9dc29b466916 to 64b572ccf6403be0c617d0e39de4b1e6cd5dbc58

Branch pushed to git repo; I updated commit sha1. New commits:

64b572crefcount libsingular rings used in plural

comment:42 Changed 22 months ago by SimonKing

Got it! When a non-commutative ring is created, it was forgotten to put the underlying libsingular ring into Sage's refcounting system. Should be fixed now. I'll see if #13447 is fixed now, too.

comment:43 Changed 22 months ago by SimonKing

  • Authors changed from Robert Bradshaw to Robert Bradshaw, Nils Bruin
  • Status changed from needs_work to needs_review
  • Work issues Fix doctests deleted

Sigh. #13447 is not fixed, as the following example shows.

sage: import gc
sage: _ = gc.collect()
sage: from sage.rings.polynomial.multi_polynomial_libsingular import MPolynomialRing_libsingular
sage: from sage.libs.singular.groebner_strategy import GroebnerStrategy
sage: n = len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)])
sage: P = MPolynomialRing_libsingular(GF(541), 2, ('x', 'y'), TermOrder('degrevlex', 2))
sage: strat = GroebnerStrategy(Ideal([P.gen(0) + P.gen(1)]))
sage: del strat
sage: del P
sage: _ = gc.collect()
sage: n == len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)])
False

I'd suggest to proceed as follows. Provided that all tests pass, I can review Robert's and Nils' patches, someone else reviews my additions (which are the merging and the latest commit), and then we make this a dependency for #13447.

Last edited 22 months ago by SimonKing (previous) (diff)

comment:44 Changed 22 months ago by git

  • Commit changed from 64b572ccf6403be0c617d0e39de4b1e6cd5dbc58 to 9793dbc9bc64bfdce43ff1e626453816c302696d

Branch pushed to git repo; I updated commit sha1. New commits:

9793dbcMake one test more stable

comment:45 follow-up: Changed 22 months ago by SimonKing

Sometimes, the following test in sage/categories/fields.py has failed:

            sage: import gc
            sage: _ = gc.collect()
            sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
            sage: for i in prime_range(100):
            ....:     R = ZZ.quotient(i)
            ....:     t = R in Fields()
            sage: _ = gc.collect()
            sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
            1

The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected). So, I hope my change is OK.

comment:46 follow-up: Changed 22 months ago by SimonKing

I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.

comment:47 in reply to: ↑ 45 ; follow-up: Changed 22 months ago by pbruin

Replying to SimonKing:

Sometimes, the following test in sage/categories/fields.py has failed:

            sage: import gc
            sage: _ = gc.collect()
            sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
            sage: for i in prime_range(100):
            ....:     R = ZZ.quotient(i)
            ....:     t = R in Fields()
            sage: _ = gc.collect()
            sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
            1

The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected).

Shouldn't the result always be 1 given that R refers to Zmod(97) after the loop (unless there is a surviving reference from a Zmod(97) created in some earlier test)? What happens if you do del R before the last gc.collect()?

comment:48 in reply to: ↑ 47 Changed 22 months ago by SimonKing

Replying to pbruin:

Shouldn't the result always be 1 given that R refers to Zmod(97) after the loop (unless there is a surviving reference from a Zmod(97) created in some earlier test)?

Exactly. And what's strange: It isn't reproducible. Sometimes I get 1, sometimes I get 0. And I always get 1 when I run it on the command line.

What happens if you do del R before the last gc.collect()?

No idea.

comment:49 Changed 22 months ago by SimonKing

To be precise: I saw this problem precisely once. When I run the previous test about 20 times, it always passes.

comment:50 Changed 22 months ago by git

  • Commit changed from 9793dbc9bc64bfdce43ff1e626453816c302696d to 2c04029f9a194bd87befa0cd99016e2f267a836f

Branch pushed to git repo; I updated commit sha1. New commits:

2c04029make a test against a memory leak clearer/more stable

comment:51 Changed 22 months ago by SimonKing

Hopefully the new version of the test is reliable.

comment:52 follow-up: Changed 22 months ago by chapoton

  • Status changed from needs_review to needs_work

excerpt from patchbot report:

+inside file:  b/src/sage/structure/coerce.pyx
+@@ -1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel):
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
+Old-style raise statement inserted on 2 non-empty lines

comment:53 in reply to: ↑ 52 Changed 22 months ago by SimonKing

Replying to chapoton:

excerpt from patchbot report:

+inside file:  b/src/sage/structure/coerce.pyx
+@@ -1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel):
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
+Old-style raise statement inserted on 2 non-empty lines

Actually this was not inserted, but moved. Would it be possible for you to change this to new-style raise statement by a reviewer commit?

comment:54 Changed 22 months ago by chapoton

  • Branch changed from u/SimonKing/weakly_reference_binary_operation_codomains to public/ticket/14058
  • Commit changed from 2c04029f9a194bd87befa0cd99016e2f267a836f to 8713960717f916d459e0b6a0dcef391dc992d9b2

done. I also did it for some of the other raise in the same file.


New commits:

e976b79Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960trac #14058 convert some raise statements into py3 syntax

comment:55 Changed 22 months ago by chapoton

  • Status changed from needs_work to needs_review

comment:56 in reply to: ↑ 46 Changed 22 months ago by SimonKing

Replying to SimonKing:

I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.

Could someone please finish the review? The patchbot thinks the branch is fine.

comment:57 Changed 22 months ago by jpflori

  • Description modified (diff)
  • Reviewers changed from Simon King to Simon King, Frédéric Chapoton, Jean-Pierre Flori
  • Status changed from needs_review to positive_review

Your change is small and makes sense.

comment:58 follow-up: Changed 22 months ago by vdelecroix

  • Status changed from positive_review to needs_work

From the patchbot

sage -t --long src/sage/structure/coerce.pyx
**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1

Is it what we should get?

comment:59 in reply to: ↑ 58 ; follow-up: Changed 22 months ago by vdelecroix

  • Status changed from needs_work to needs_review

Replying to vdelecroix:

From the patchbot ...

wrong ticket... should be #18905. Sorry for the noise.

comment:60 in reply to: ↑ 59 Changed 22 months ago by SimonKing

Replying to vdelecroix:

Replying to vdelecroix:

From the patchbot ...

wrong ticket... should be #18905. Sorry for the noise.

The patchbot HERE does report this problem. So, what do you mean that it "should be #18905"?

comment:61 Changed 22 months ago by vdelecroix

  • Status changed from needs_review to needs_work

Right! I confused myself.

comment:62 Changed 22 months ago by vdelecroix

The failing doctest is reproducible after applying this branch onto 6.9.beta1.

comment:63 Changed 22 months ago by slabbe

On sage-6.9.beta1, I get:

$ sage nbruin.sage
Time: CPU 7.95 s, Wall: 7.95 s
Time: CPU 7.68 s, Wall: 7.68 s
Time: CPU 76.76 s, Wall: 76.94 s

On sage-6.9.beta1 + #14058, I get a very good improvement:

$ sage nbruin.sage
Time: CPU 2.48 s, Wall: 2.48 s
Time: CPU 2.07 s, Wall: 2.07 s
Time: CPU 32.38 s, Wall: 32.42 s

Note: I originally thought it was a regression in sage since I was getting this earlier this morning with sage-6.8 + #14058. But thanksfully, it seems to be a improvement obtained from this ticket.

The file nbruin.sage is from comment 4:

 $ cat nbruin.sage
a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

time test(a,b)
time test(a,c)
time test(b,c)

comment:64 Changed 22 months ago by slabbe

I also get improvements on tests above of Simon King at comments 18.

With sage-6.9.beta1:

$ sage testcoercion.sage
Time: CPU 1.69 s, Wall: 1.69 s
Time: CPU 4.05 s, Wall: 4.05 s
C
Time: CPU 4.03 s, Wall: 4.03 s

With sage-6.9.beta1 + this ticket:

$ sage testcoercion.sage
Time: CPU 0.34 s, Wall: 0.34 s
Time: CPU 0.85 s, Wall: 0.85 s
C
Time: CPU 0.86 s, Wall: 0.86 s

(My file testcoercion.sage contains testcoercion.py + the lines at comment 18.)

comment:65 Changed 22 months ago by slabbe

With sage-6.9.beta1 + this ticket, I get All tests passed when I do

./sage -t --long src/sage/structure/coerce.pyx

but I get the problem when I test after the build

./sage -bt --long src/sage/structure/coerce.pyx

which gives the error:

**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1
**********************************************************************

I have never seen such a behavior.

comment:66 Changed 22 months ago by git

  • Commit changed from 8713960717f916d459e0b6a0dcef391dc992d9b2 to b73d5372cd889d8b5e72e0ac56a781ba4dde9761

Branch pushed to git repo; I updated commit sha1. New commits:

c136ebdMerge #14058 into 6.9.beta1
b73d537Trac #14058: fix broken doctest

comment:67 Changed 22 months ago by git

  • Commit changed from b73d5372cd889d8b5e72e0ac56a781ba4dde9761 to 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

e153d08#14058: Add doctest
b9141eeMerge branch 'ticket/14058' into develop
90ed181Trivial fix for a coercion doctest
64b572crefcount libsingular rings used in plural
9793dbcMake one test more stable
2c04029make a test against a memory leak clearer/more stable
e976b79Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960trac #14058 convert some raise statements into py3 syntax
c136ebdMerge #14058 into 6.9.beta1
13cb2a7Trac #14058: fix broken doctest

comment:68 follow-up: Changed 22 months ago by slabbe

In other doctests in the same file, GF(3), GF(7) and GF(41) are also created. While there are references to GF(7) and GF(41), no variable references to GF(3) which is available for gargage collection. If GF(3) gets garbage collected before the offending doctest, then everything is fine, the doctest pass. But it can also be garbage collected in the call to gc.collect() in the doctest. If this is the case, then N0 was also counting GF(3) which explains why N2-N0 can be negative.

One way to fix this is to call gc.collect() before creating N0.

(my first fix was using sets of ids... sorry for the noise).

comment:69 in reply to: ↑ 68 Changed 22 months ago by SimonKing

Replying to slabbe:

One way to fix this is to call gc.collect() before creating N0.

I agree, that's the way to fix it.

comment:70 Changed 22 months ago by slabbe

  • Status changed from needs_work to needs_review

comment:71 Changed 22 months ago by jpflori

  • Status changed from needs_review to positive_review

I assume Simon's comment gives positive review here as this was the only thing to fix.

comment:72 Changed 22 months ago by jpflori

  • Reviewers changed from Simon King, Frédéric Chapoton, Jean-Pierre Flori to Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

comment:73 Changed 22 months ago by chapoton

  • Milestone changed from sage-6.4 to sage-6.9

comment:74 Changed 21 months ago by vbraun

  • Dependencies #12313 deleted

comment:75 Changed 21 months ago by vbraun

  • Branch changed from public/ticket/14058 to 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:76 Changed 20 months ago by jdemeyer

  • Commit 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d deleted

On arando:

sage -t --long src/sage/categories/fields.py
**********************************************************************
File "src/sage/categories/fields.py", line 104, in sage.categories.fields.Fields.__contains__
Failed example:
    len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
Expected:
    0
Got:
    -1
**********************************************************************

comment:77 Changed 20 months ago by jdemeyer

See #19244

Note: See TracTickets for help on using tickets.