Opened 10 years ago

Closed 7 years ago

Last modified 7 years ago

#14058 closed enhancement (fixed)

Weakly reference binary operation codomains

Reported by: Robert Bradshaw Owned by: Robert Miller
Priority: major Milestone: sage-6.9
Component: memleak Keywords:
Cc: Nils Bruin, Simon King, Jean-Pierre Flori Merged in:
Authors: Robert Bradshaw, Nils Bruin Reviewers: Simon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé
Report Upstream: N/A Work issues:
Branch: 13cb2a7 (Commits, GitHub, GitLab) Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by Jean-Pierre Flori)

R(1) + S(1) caches a strong reference to both R and S, which we'd like to avoid.

See discussion at https://groups.google.com/forum/?fromgroups=#!topic/sage-devel/ozhIHIwQ4WA

Attachments (4)

14058-weak-binop.patch (4.8 KB) - added by Robert Bradshaw 10 years ago.
trac_14058-doctest.patch (1.2 KB) - added by Nils Bruin 10 years ago.
Doctest proposal
14058-weak-binop-morphisms.patch (5.2 KB) - added by Robert Bradshaw 10 years ago.
testcoercion.py (1.5 KB) - added by Simon King 10 years ago.
An example that shows that the current patch needs work

Download all attachments as: .zip

Change History (81)

Changed 10 years ago by Robert Bradshaw

Attachment: 14058-weak-binop.patch added

comment:1 Changed 10 years ago by Robert Bradshaw

Cc: Nils Bruin Simon King Jean-Pierre Flori added
Status: newneeds_review

comment:2 Changed 10 years ago by Simon King

Is it possible to add an example that shows that parents become collectable that previously weren't?

Changed 10 years ago by Nils Bruin

Attachment: trac_14058-doctest.patch added

Doctest proposal

comment:3 Changed 10 years ago by Nils Bruin

I like the idea. Can we get some data on speed regressions due to this? In principle the double lookup might be noticeable. I don't mean speed regressions due to parents getting deleted -- those are good to know but need other fixes. I mean when the parents are still around.

Attached doctests may need #12313 to work, in which case this ticket should depend on that (because then that's necessary to get benefit from this ticket).

comment:4 Changed 10 years ago by Nils Bruin

Sadly, the performance hit is quite noticeable.

a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

Within a run, timing test seems fairly consistent. However, between different compiles of the same code I'm observing fluctuations of up to 0.10s in timings. Lucky/unlucky memory layout? Anyway, with patch:

sage: %time test(a,b)
CPU times: user 2.15 s, sys: 0.00 s, total: 2.15 s
Wall time: 2.16 s
sage: %time test(a,c)
CPU times: user 2.25 s, sys: 0.00 s, total: 2.25 s
Wall time: 2.26 s
sage: %time test(b,c)
CPU times: user 16.83 s, sys: 0.00 s, total: 16.83 s

without patch

sage: %time test(a,b)
CPU times: user 1.54 s, sys: 0.00 s, total: 1.54 s
Wall time: 1.55 s
sage: %time test(a,c)
CPU times: user 1.64 s, sys: 0.00 s, total: 1.64 s
Wall time: 1.64 s
CPU times: user 15.68 s, sys: 0.00 s, total: 15.68 s
Wall time: 15.76 s

For test(a,a), test(b,b), test(c,c) there doesn't seem to be a difference (luckily!)

The penalties seem to be about 0.6s, 0.6s, 1.2s, which is rather consistent with 1,1,2 extra lookups.

This indicates to me it may be worth trying storing a weakref to the morphism instead, since that can probably be dereferenced faster than a coercion lookup on the codomain. Logically this should be equivalent. We're just storing a weakref to the object we're interested in rather than instructions where to go and lookup the object we want.

Caveat:

For stored coercions of the type (R,S, ...): (None,<map S->R>) or (R,S, ...): (<map R->S>,None) the codomain is part of the key, so deletion of the codomain (which is a prerequisite for the morphism being deleted and hence the weakref dying) implies removal of the weak key entry in TripleDict.

However, for stored coercions of the type (R,S,...): (<map R->C>,<map S->C>) we could have that C and the weakrefs to the morphisms die, but that would not trigger the removal from the TripleDict. So we could still encounter dead weakrefs (but much less than one would perhaps initially think!) It's not really much different from how 14058-weak-binop.patch handles dead weakrefs.

Major problem for immediately applying this idea:

sage: M=QQ.coerce_map_from(ZZ)
sage: import weakref
sage: weakref.ref(M)
TypeError: cannot create weak reference to 'sage.rings.rational.Z_to_Q' object

comment:5 Changed 10 years ago by Robert Bradshaw

Yeah, I had exactly this same idea. I'll post an updated patch.

Changed 10 years ago by Robert Bradshaw

comment:6 Changed 10 years ago by Robert Bradshaw

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch

comment:7 Changed 10 years ago by Nils Bruin

Great! With this new patch I cannot distinguish the timing differences from the noise one normally gets already, so I'd think this is quite acceptable.

comment:8 Changed 10 years ago by Nils Bruin

Dependencies: #12313

patchbot seems unhappy about the doctest.Probably #12313 is indeed necessary to make parents reclaimable.

comment:9 Changed 10 years ago by Simon King

Description: modified (diff)

Stating in the ticket description what patches are to apply. Admittedly I didn't test yet whether the doctest patch applies after Robert's new patch. But the test patch is needed, I think.

Apply 14058-weak-binop-morphisms.patch and trac_14058-doctest.patch trac_14058-doctest.patch

comment:10 Changed 10 years ago by Simon King

Description: modified (diff)

comment:11 Changed 10 years ago by Simon King

The patches apply, and at least the new test passes. So, this together with #12313, does fix another leak.

comment:12 Changed 10 years ago by Simon King

Description: modified (diff)

comment:13 Changed 10 years ago by Simon King

In vanilla Sage, the coercion model stores (coercion) morphisms in its cache (which was a TripleDict, hence, with weak keys), and the only change introduced by your patch is to store weak references to the (coercion) morphism instead. Did I understand this correctly?

In vanilla Sage, the coercion model kept a strong reference to the morphism, which kept a strong reference to domain and codomain, which were thus not reclaimable, and so the item of the TripleDict has eternal life, in spite of the weak keys. Correct?

With the patch, the coercion model keeps a weak reference to the coercion morphism, hence, the strong reference from the morphism to domain and codomain doesn't matter, and thus the item of the TripleDict may disappear.

But how is the morphism kept alive? The coercion model only has a weak reference to it. Do I understand correctly that the morphism involved in the binary operation is, at the same time, stored in the coerce dict of its codomain? That's why it does not immediately die?

In other words, do I understand that the layout is as follows? We have parents A and B, and want to apply some binary operator, with a result in the parent C (C may coincide with either A or B). And we have maps r_A and r_B from A or B to C.

Both r_A and r_B are stored with a strong reference in a cache located in C, with weak keys A and B. At the same time, they are stored with a weak reference in the coercion model, again with weak keys A and B. r_A has strong references to C and to A, r_B has strong references to C and to B.

What do we want? Do we want that keeping C alive makes A and B survive? Or do we want that keeping both A and B alive makes C survive?

If the user has a strong reference to C, then C has a strong reference to r_A and r_B (in its coerce cache), and r_A/r_B strongly refers to A/B. Hence, the existence of C keeps A and B alive. Since C is a complicated object, it is more likely to be mortal, hence, probably it is not much of a problem.

If the user has a strong reference to both A and B, then C keeps a strong reference to both r_A and r_B, and the coercion model keeps a weak reference to r_A and r_B. Moreover, r_A and r_B strongly refer to C.

However, isn't it just a reference cycle between C and r_A/r_B? It would not prevent C from becoming collectable, right?

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

Or did I misunderstand something?

Last edited 10 years ago by Simon King (previous) (diff)

Changed 10 years ago by Simon King

Attachment: testcoercion.py added

An example that shows that the current patch needs work

comment:14 in reply to:  13 ; Changed 10 years ago by Nils Bruin

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there. Adding elements with the same parents is orders of magnitude faster than relying on coercion. A step closer is ensure that your parent remains in memory. I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably

In this scenario:

P=[ZZ['x%d'%i] for i in [1..1000]]
F=[GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a=P.0
        b=F(1)
        c=a+b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

I don't think that we can afford to assume that just because a user has combined two parents he/she will do so again in the future. We may be able to afford doing so if it happened recently, but that really is an optimisation, and currently I don't see a hook where we can do this efficiently.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

The problem with artificially keeping transient elements alive for longer is that you'll defeat the heuristics for generational garbage collection (which Python does), and we depend rather heavily on that to get rid of cyclic garbage. Testing would be required to see if this is indeed a problem, and it will be very application-dependent.

comment:15 Changed 10 years ago by Simon King

Before posting a reply to Nils' post, here is an example that the existence of A and B does not ensure the survival of the pushout of A and B. First, attach testcoercion.py into a Sage session. Then:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: cm = sage.structure.element.get_coercion_model()
sage: OC = cm.explain(OA,OB)
Coercion on left operand via
    Conversion map:
      From: A
      To:   C
Coercion on right operand via
    Conversion map:
      From: B
      To:   C
Arithmetic performed after coercions.
Result lives in C
sage: OC.is_coercion_cached(OA)
True
sage: OC.is_coercion_cached(OB)
True
sage: del OC
sage: import gc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0
sage: xc = OA(1)*OB(1)
sage: isinstance(xc.parent(),C)
True
sage: del xc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X,C)])
0

I do believe that this is not desired behaviour.

Last edited 10 years ago by Simon King (previous) (diff)

comment:16 in reply to:  14 ; Changed 10 years ago by Simon King

Replying to nbruin:

I doubt that we want that. It is not desirable that, when adding elements of ZZ['x'] and QQ, the ring QQ['x'] is created repeatedly (OK, polynomial rings have a strong cache anyway. But you see what I mean).

In my opinion, that is exactly what we want, or at least what we have to settle for. If a person wants to work efficiently with QQ['x'] he should ensure his elements already live there.

Granted. But it is a very typical usage to not explicitly convert everything into a common parent, but let the coercion model do the work. That is what the coercion model is for!

I suspect that usually that will be the case, because if a user creates elements somewhere regularly he probably

However, in the example that I posted above, note that multiplying OA(1)*OB(1) repeatedly will also create C() repeatedly, even though it is a UniqueRepresentation and hence (weakly, nowadays) cached!

I think it is not acceptable that a multiple creation is triggered.

In this scenario:

P=[ZZ['x%d'%i] for i in [1..1000]]
F=[GF(p) for p in prime_range(1000)]
for Zxi in P:
    for k in F:
        a=P.0
        b=F(1)
        c=a+b

If we keep the GF(p)[xi] alive because at some point we constructed an element in it from parents that are still around, we've just created quadratic memory requirements where linear should be enough.

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

A limited length buffer that gets updated whenever a certain component sees a parent go by might work. In Coercion discovery perhaps? That's already slow and it would exactly protect these automatically generated parents. Logically it's hard to defend but it may be OK in practice.

Another possibility is just UniqueRepresentation_c.__cinit__. Only when we're making a lot of parents do we care about losing them again ...

OK, but not all unique parent structures rely on it.

comment:17 in reply to:  16 Changed 10 years ago by Nils Bruin

Replying to SimonKing:

I think it is not acceptable that a multiple creation is triggered.

... Why not? Should the user expect that doing something a second time is faster than the first, without taking special precautions?

In your example, GF(p)[xi] is created exactly once, for each value of p and i. Clearly, in this situation it would not make sense to keep it alive, because it is never reused. However, I doubt that this is a typical use case.

But how do you tell the difference? Against too little caching there is a simple defense: keep a reference yourself. Against too much caching there is nothing you can do. You just run out of memory because data structures outside your control blow up.

We clearly have too much caching presently, in some cases by design. I think we first need a design that's fairly clean and for which we can reliably reason there's not "too much" caching (changing the order of memory requirement is definitely "too much" for me). Next step is tweaking it, possibly with MRU queues (which in that case should really be investigated for interfering with efficient GC, which is based on "objects either live very short or very long", so artificially creating objects with a medium lifespan is bad)

comment:18 Changed 10 years ago by Simon King

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:

sage: %attach /home/simon/SAGE/work/memleak/testcoercion.py
sage: OA = A()
sage: OB = B()
sage: a = OA(1)
sage: b = OB(1)
sage: def test(x,y):
....:     for _ in xrange(2*10^5):
....:         z = x*y
....:         
sage: %time test(a,a)
CPU times: user 4.49 s, sys: 0.00 s, total: 4.49 s
Wall time: 4.49 s
sage: %time test(a,b)
CPU times: user 8.99 s, sys: 0.11 s, total: 9.11 s
Wall time: 9.13 s
sage: c = a*b
sage: print c.parent()
C
sage: %time test(a,b)
CPU times: user 8.85 s, sys: 0.03 s, total: 8.88 s
Wall time: 8.89 s

In the first run of test(a,b), there is the possibility that parent(a*b) becomes created repeatedly, as I have demonstrated above. In the second run, parent(a*b) is kept alive, by keeping a pointer to the result of a*b. There is no significant difference in performance. Of course, we also see that multiplication within one parent is faster.

Is there a way to get the patchbots report significant regressions in some examples? Because I am sure that some parts of Sage (schemes and combinat are typical candidates) rely on a strong coercion cache.

comment:19 in reply to:  18 Changed 10 years ago by Nils Bruin

Replying to SimonKing:

By the way, it seems that the "insufficient" caching (not enough to keep C() alive) has no influence on performance, at least not if computations are done in a close cycle:

Ah right, because C._coerce_from_hash[A]._codomain is C, there's a cycle involving C so instead of getting deleted eagerly by Python's refcounting, it's only done with (relatively infrequent) GCs. So it even comes with a trick to improve this free "caching": Increase the interval for garbage collection. See gc.get_threshold and gc.set_threshold. By tweaking those numbers you can probably manipulate the running times in these examples.

NOTE: Doesn't timeit turn garbage collection off? So that may be a misleading tool for investigating performance for these things.

comment:20 Changed 10 years ago by Simon King

I think it makes sense to ask sage-devel whether it is better to fix this leak or not.

I believe that keeping A and B alive should prevent C from garbage collection. In contrast, keeping C alive should not prevent A and B from garbage collection. Currently, it is the other way around. Perhaps we should try to think through whether having a weak reference to the domain of coercion or conversion maps would work.

comment:21 Changed 10 years ago by Robert Bradshaw

+1 to taking this discussion to sage-devel.

Patchbot: apply 14058-weak-binop-morphisms.patch trac_14058-doctest.patch

comment:22 Changed 10 years ago by Simon King

The patchbot only reports one error, namely

File "/home/patchbot/sage-5.7.beta3/devel/sage-14058/sage/rings/number_field/number_field_ideal.py", line 118:
    sage: convert_to_idealprimedec_form(K, P)
Exception raised:
    Traceback (most recent call last):
      File "/home/patchbot/sage-5.7.beta3/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/patchbot/sage-5.7.beta3/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/patchbot/sage-5.7.beta3/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_2[5]>", line 1, in <module>
        convert_to_idealprimedec_form(K, P)###line 118:
    sage: convert_to_idealprimedec_form(K, P)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/rings/number_field/number_field_ideal.py", line 124, in convert_to_idealprimedec_form
        K_bnf = gp(field.pari_bnf())
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 201, in __call__
        return self._coerce_from_special_method(x)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 227, in _coerce_from_special_method
        return (x.__getattribute__(s))(self)
      File "sage_object.pyx", line 485, in sage.structure.sage_object.SageObject._gp_ (sage/structure/sage_object.c:4804)
      File "sage_object.pyx", line 450, in sage.structure.sage_object.SageObject._interface_ (sage/structure/sage_object.c:4144)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 199, in __call__
        return cls(self, x, name=name)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/expect.py", line 1280, in __init__
        self._name = parent._create(value, name=name)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/interface.py", line 389, in _create
        self.set(name, value)
      File "/home/patchbot/sage-5.7.beta3/local/lib/python/site-packages/sage/interfaces/gp.py", line 511, in set
        raise TypeError, "Error executing code in GP:\nCODE:\n\t%s\nPARI/GP ERROR:\n%s"%(cmd, out)
    TypeError: Error executing code in GP:
    CODE:
    	sage[14]=[[;], matrix(0,7), [;], Mat([0.E-38 + 3.52184386028273*I, 0.E-37 + 3.70366245659542*I, 0.E-37 + 2.91167081258838*I, 0.E-38 + 3.14159265358979*I, 0.E-38 + 9.04452675407645*I, 0.E-37 + 8.86270815776375*I, 0.E-37 + 9.65469980177079*I]), [[7, [-1, 2]~, 1, 1, [3, -2; 2, 1]], [13, [-5, 2]~, 1, 1, [-6, -2; 2, -8]], [19, [-3, 2]~, 1, 1, [5, -2; 2, 3]], [3, [1, 2]~, 2, 1, [1, 1; -1, 2]], [7, [3, 2]~, 1, 1, [-1, -2; 2, -3]], [13, [7, 2]~, 1, 1, [-5, -2; 2, -7]], [19, [5, 2]~, 1, 1, [-3, -2; 2, -5]]], 0, [y^2 + 3, [0, 1], -3, 2, [Mat([1, -0.500000000000000 - 0.866025403784439*I]), [1, -1.36602540378444; 1, 0.366025403784439], [1, -1; 1, 0], [2, -1; -1, -1], [3, 2; 0, 1], [1, -1; -1, -2], [3, [2, -1; 1, 1]]], [0.E-38 - 1.73205080756888*I], [1, 1/2*y - 1/2], [1, 1; 0, 2], [1, 0, 0, -1; 0, 1, 1, -1]], [[1, [], []], 1, 1, [6, -1/2*y + 1/2], []], [[;], [], []], [0, []]];
    PARI/GP ERROR:
      ***   at top-level: sage[14]=[[;],matrix(0,7
      ***                     ^--------------------
      ***   _[_]: not a vector.
  ***   Warning: precision too low for generators, not given.

That's obscure to me. What does it mean?

comment:23 Changed 10 years ago by Simon King

Unfortunately, I get

sage -t  "devel/sage-main/sage/rings/number_field/number_field_ideal.py"
         [21.9 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 22.0 seconds

So, what happened here?

In any case, I think the example of collectable pushout should be included in the documentation, combined with a warning to keep a pointer to the pushout (or avoid cross-parent arithmetic altogether), if speed matters.

I could write that. But to what location? Logically, it belongs to sage.structure.coerce, but nobody would read that. It might better be in a more general tutorial. What do you suggest?

comment:24 Changed 10 years ago by Simon King

I did make ptest with a highly patched debug version of sage-5.7.beta2. Hence, I have the dependencies of #13864 and additionally #9107, #12951, #13916. #12313, #14058 and #13370 applied.

In the parallel test, I got two timeouts, namely in devel/sage/doc/en/bordeaux_2008/birds_other.rst and devel/sage/sage/interfaces/maxima.py. Running the former individually is long, but fine:

sage -t -force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst"
         [339.8 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 339.9 seconds

However, the maxima test is suspicious. I ran it twice individually. The first time I got an evil error:

sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"   
**********************************************************************
File "/home/simon/SAGE/debug/sage-5.7.beta2/devel/sage/sage/interfaces/maxima.py", line 355:
    sage: T.tlimit('n','inf')
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[77]>", line 1, in <module>
        T.tlimit('n','inf')###line 355:
    sage: T.tlimit('n','inf')
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 588, in __call__
        return self._obj.parent().function_call(self._name, [self._obj] + list(args), kwds)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 489, in function_call
        return self.new(s)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 264, in new
        return self(code)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 199, in __call__
        return cls(self, x, name=name)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 1153, in __init__
        ExpectElement.__init__(self, parent, value, is_name=False, name=None)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/expect.py", line 1280, in __init__
        self._name = parent._create(value, name=name)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/interface.py", line 389, in _create
        self.set(name, value)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 998, in set
        self._eval_line(cmd)
      File "/home/simon/SAGE/debug/sage-5.7.beta2/local/lib/python/site-packages/sage/interfaces/maxima.py", line 754, in _eval_line
        assert line_echo.strip() == line.strip()
    AssertionError
**********************************************************************
1 items had failures:
   1 of  97 in __main__.example_0
***Test Failed*** 1 failures.
For whitespace errors, see the file /home/simon/.sage//tmp/maxima_30383.py
         [13.7 s]
 
----------------------------------------------------------------------
The following tests failed:


        sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"
Total time for all tests: 13.8 seconds

and the second time I got

sage -t -force_lib "devel/sage/sage/interfaces/maxima.py"   
         [12.6 s]
 
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 12.6 seconds

Is the error something we should worry about? Or is that likely to be the effect of a neutrino hitting my laptop?

comment:25 Changed 10 years ago by Simon King

At #14159, I am providing a version of MonoDict and TripleDict that optionally allows to store the values by weak references. Provided that the performance is OK, Nils Bruin suggests to consider to use this here, hence, make #14159 a dependency. What do you think?

Last edited 10 years ago by Simon King (previous) (diff)

comment:26 Changed 9 years ago by Jeroen Demeyer

Milestone: sage-5.11sage-5.12

comment:27 Changed 9 years ago by Simon King

Note to myself: Work on this ticket...

comment:28 Changed 9 years ago by Simon King

Lost track again.

I tried to import the patches and turn them into git branches. It mostly worked, but I had to resolve some conflicts with the second patch.

Would it be OK to change this hg-ticket into a git-ticket?

comment:29 Changed 9 years ago by Simon King

Reviewers: Simon King
Status: needs_reviewneeds_work
Work issues: Fix doctests; decide between hg and git

While I had to resolve conflicts when importing the two patches into the git master branch, the patches do apply to sage-5.12.beta5, and the patchbot reports the same on sage-5.13.beta2. So, perhaps it is OK to keep this a mercurial ticket.

However, with the gitification of the patches, I find (similar to the patchbot)

sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/rings/polynomial/plural.pyx  # 17 doctests failed
sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/libs/singular/function.pyx  # 1 doctest failed

The patchbot additionally finds

sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/algebras/free_algebra.py  # 1 doctest failed
sage -t --long /mnt/storage2TB/patchbot/Sage/sage-5.13.beta2/devel/sage/sage/structure/coerce.pyx  # 1 doctest failed

So, this requires some work.

comment:30 Changed 9 years ago by For batch modifications

Milestone: sage-6.1sage-6.2

comment:31 Changed 9 years ago by For batch modifications

Milestone: sage-6.2sage-6.3

comment:32 Changed 8 years ago by For batch modifications

Milestone: sage-6.3sage-6.4

comment:33 Changed 7 years ago by Simon King

I plan to create a branch for it. Probably it will fix #18905.

comment:34 Changed 7 years ago by Simon King

Branch: u/SimonKing/weakly_reference_binary_operation_codomains

comment:35 Changed 7 years ago by Simon King

Commit: b9141eef1042c8b8fb1179b30844d5bdb737b96e
Work issues: Fix doctests; decide between hg and gitFix doctests

Here is a branch. The merge was not trivial, so, hopefully I did it well.

Next, we have to see if the "old problem" (i.e., attachment:testcoercion.py) is still problematic, and of course there may be doctest failures.

At least, the issue from comment:20 is solved: We do have weak references to the domain of coercion maps now.


New commits:

ac68d9bRemove strong references to parents used in binary operations in the coercion model.
e153d08#14058: Add doctest
b9141eeMerge branch 'ticket/14058' into develop

comment:36 Changed 7 years ago by Simon King

sage -t src/sage/algebras/commutative_dga.py  # 3 doctests failed
sage -t src/sage/algebras/free_algebra.py  # 1 doctest failed
sage -t src/sage/rings/polynomial/plural.pyx  # 17 doctests failed
sage -t src/sage/dev/sagedev.py  # 2 doctests failed
sage -t src/sage/structure/coerce.pyx  # 2 doctests failed
sage -t src/sage/libs/singular/function.pyx  # 2 doctests failed

Not bad (and the sagedev error very likely is because I have rerere enabled, hence, it is unrelated anyway).

Last edited 7 years ago by Simon King (previous) (diff)

comment:37 Changed 7 years ago by Simon King

Details:

sage -t src/sage/libs/singular/function.pyx
**********************************************************************
File "src/sage/libs/singular/function.pyx", line 413, in sage.libs.singular.function.is_singular_poly_wrapper
Failed example:
    H.<x,y,z> = A.g_algebra({z*x:x*z+2*x, z*y:y*z-2*y})
Expected nothing
Got:
    Exception KeyError: (The ring pointer 0x7f7c7470c8d0,) in 'sage.libs.singular.ring.singular_ring_delete' ignored
**********************************************************************
File "src/sage/libs/singular/function.pyx", line 1250, in sage.libs.singular.function.SingularFunction.__call__
Failed example:
    I = Ideal(I.groebner_basis())
Expected nothing
Got:
    Exception KeyError: (The ring pointer 0x7f7c7470ca40,) in 'sage.libs.singular.ring.singular_ring_delete' ignored
**********************************************************************

--> It seems that the cache became too weak. All errors in sage -t src/sage/rings/polynomial/plural.pyx and sage -t src/sage/algebras/commutative_dga.py are of that form.

sage -t src/sage/structure/coerce.pyx
**********************************************************************
File "src/sage/structure/coerce.pyx", line 418, in sage.structure.coerce.CoercionModel_cache_maps.get_cache
Failed example:
    print copy(left_morphism_ref())
Exception raised:
    Traceback (most recent call last):
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[4]>", line 1, in <module>
        print copy(left_morphism_ref())
    NameError: name 'left_morphism_ref' is not defined
**********************************************************************
File "src/sage/structure/coerce.pyx", line 422, in sage.structure.coerce.CoercionModel_cache_maps.get_cache
Failed example:
    print right_morphism_ref
Exception raised:
    Traceback (most recent call last):
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.structure.coerce.CoercionModel_cache_maps.get_cache[5]>", line 1, in <module>
        print right_morphism_ref
    NameError: name 'right_morphism_ref' is not defined
**********************************************************************

--> I got something wrong in my merge.

So, we actually just have two errors.

Last edited 7 years ago by Simon King (previous) (diff)

comment:38 Changed 7 years ago by git

Commit: b9141eef1042c8b8fb1179b30844d5bdb737b96e90ed181ae9df152e99a652636a7e9dc29b466916

Branch pushed to git repo; I updated commit sha1. New commits:

90ed181Trivial fix for a coercion doctest

comment:39 Changed 7 years ago by Simon King

#13447 is probably related. Perhaps (if we can finish the work here) #13447 can eventually be closed as a duplicate.

comment:40 Changed 7 years ago by Simon King

In the following example

            sage: import gc
            sage: from sage.rings.polynomial.plural import NCPolynomialRing_plural
            sage: from sage.algebras.free_algebra import FreeAlgebra
            sage: A1.<x,y,z> = FreeAlgebra(QQ, 3)
            sage: R1 = A1.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A2.<x,y,z> = FreeAlgebra(GF(5), 3)
            sage: R2 = A2.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A3.<x,y,z> = FreeAlgebra(GF(11), 3)
            sage: R3 = A3.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: A4.<x,y,z> = FreeAlgebra(GF(13), 3)
            sage: R4 = A4.g_algebra({y*x:x*y-z, z*x:x*z+2*x, z*y:y*z-2*y}, order=TermOrder('degrevlex', 2))
            sage: _ = gc.collect()
            sage: foo = R1.gen(0)
            sage: del foo
            sage: del R1
            sage: _ = gc.collect()
            sage: del R2
            sage: _ = gc.collect()
            sage: del R3
            sage: _ = gc.collect()

the reference counting goes wrong after each gc.collect().

comment:41 Changed 7 years ago by git

Commit: 90ed181ae9df152e99a652636a7e9dc29b46691664b572ccf6403be0c617d0e39de4b1e6cd5dbc58

Branch pushed to git repo; I updated commit sha1. New commits:

64b572crefcount libsingular rings used in plural

comment:42 Changed 7 years ago by Simon King

Got it! When a non-commutative ring is created, it was forgotten to put the underlying libsingular ring into Sage's refcounting system. Should be fixed now. I'll see if #13447 is fixed now, too.

comment:43 Changed 7 years ago by Simon King

Authors: Robert BradshawRobert Bradshaw, Nils Bruin
Status: needs_workneeds_review
Work issues: Fix doctests

Sigh. #13447 is not fixed, as the following example shows.

sage: import gc
sage: _ = gc.collect()
sage: from sage.rings.polynomial.multi_polynomial_libsingular import MPolynomialRing_libsingular
sage: from sage.libs.singular.groebner_strategy import GroebnerStrategy
sage: n = len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)])
sage: P = MPolynomialRing_libsingular(GF(541), 2, ('x', 'y'), TermOrder('degrevlex', 2))
sage: strat = GroebnerStrategy(Ideal([P.gen(0) + P.gen(1)]))
sage: del strat
sage: del P
sage: _ = gc.collect()
sage: n == len([x for x in gc.get_objects() if isinstance(x,MPolynomialRing_libsingular)])
False

I'd suggest to proceed as follows. Provided that all tests pass, I can review Robert's and Nils' patches, someone else reviews my additions (which are the merging and the latest commit), and then we make this a dependency for #13447.

Last edited 7 years ago by Simon King (previous) (diff)

comment:44 Changed 7 years ago by git

Commit: 64b572ccf6403be0c617d0e39de4b1e6cd5dbc589793dbc9bc64bfdce43ff1e626453816c302696d

Branch pushed to git repo; I updated commit sha1. New commits:

9793dbcMake one test more stable

comment:45 Changed 7 years ago by Simon King

Sometimes, the following test in sage/categories/fields.py has failed:

            sage: import gc
            sage: _ = gc.collect()
            sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
            sage: for i in prime_range(100):
            ....:     R = ZZ.quotient(i)
            ....:     t = R in Fields()
            sage: _ = gc.collect()
            sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
            1

The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected). So, I hope my change is OK.

comment:46 Changed 7 years ago by Simon King

I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.

comment:47 in reply to:  45 ; Changed 7 years ago by Peter Bruin

Replying to SimonKing:

Sometimes, the following test in sage/categories/fields.py has failed:

            sage: import gc
            sage: _ = gc.collect()
            sage: n = len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
            sage: for i in prime_range(100):
            ....:     R = ZZ.quotient(i)
            ....:     t = R in Fields()
            sage: _ = gc.collect()
            sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
            1

The reason for the failure: Sometimes the last line gave 0, not 1. It is not totally clear to me why the different results occurred. But in fact 0 is better than 1 here (one additional ring got collected).

Shouldn't the result always be 1 given that R refers to Zmod(97) after the loop (unless there is a surviving reference from a Zmod(97) created in some earlier test)? What happens if you do del R before the last gc.collect()?

comment:48 in reply to:  47 Changed 7 years ago by Simon King

Replying to pbruin:

Shouldn't the result always be 1 given that R refers to Zmod(97) after the loop (unless there is a surviving reference from a Zmod(97) created in some earlier test)?

Exactly. And what's strange: It isn't reproducible. Sometimes I get 1, sometimes I get 0. And I always get 1 when I run it on the command line.

What happens if you do del R before the last gc.collect()?

No idea.

comment:49 Changed 7 years ago by Simon King

To be precise: I saw this problem precisely once. When I run the previous test about 20 times, it always passes.

comment:50 Changed 7 years ago by git

Commit: 9793dbc9bc64bfdce43ff1e626453816c302696d2c04029f9a194bd87befa0cd99016e2f267a836f

Branch pushed to git repo; I updated commit sha1. New commits:

2c04029make a test against a memory leak clearer/more stable

comment:51 Changed 7 years ago by Simon King

Hopefully the new version of the test is reliable.

comment:52 Changed 7 years ago by Frédéric Chapoton

Status: needs_reviewneeds_work

excerpt from patchbot report:

+inside file:  b/src/sage/structure/coerce.pyx
+@@ -1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel):
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
+Old-style raise statement inserted on 2 non-empty lines

comment:53 in reply to:  52 Changed 7 years ago by Simon King

Replying to chapoton:

excerpt from patchbot report:

+inside file:  b/src/sage/structure/coerce.pyx
+@@ -1265,33 +1287,74 @@ cdef class CoercionModel_cache_maps(CoercionModel):
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
++                    raise RuntimeError, "BUG in coercion model: coerce_map_from must return a Map"
+Old-style raise statement inserted on 2 non-empty lines

Actually this was not inserted, but moved. Would it be possible for you to change this to new-style raise statement by a reviewer commit?

comment:54 Changed 7 years ago by Frédéric Chapoton

Branch: u/SimonKing/weakly_reference_binary_operation_codomainspublic/ticket/14058
Commit: 2c04029f9a194bd87befa0cd99016e2f267a836f8713960717f916d459e0b6a0dcef391dc992d9b2

done. I also did it for some of the other raise in the same file.


New commits:

e976b79Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960trac #14058 convert some raise statements into py3 syntax

comment:55 Changed 7 years ago by Frédéric Chapoton

Status: needs_workneeds_review

comment:56 in reply to:  46 Changed 7 years ago by Simon King

Replying to SimonKing:

I am giving a positive review to Robert's code and Nils' doctest. I consider my contribution too small to make me author, but I think my changes need a review.

Could someone please finish the review? The patchbot thinks the branch is fine.

comment:57 Changed 7 years ago by Jean-Pierre Flori

Description: modified (diff)
Reviewers: Simon KingSimon King, Frédéric Chapoton, Jean-Pierre Flori
Status: needs_reviewpositive_review

Your change is small and makes sense.

comment:58 Changed 7 years ago by Vincent Delecroix

Status: positive_reviewneeds_work

From the patchbot

sage -t --long src/sage/structure/coerce.pyx
**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1

Is it what we should get?

comment:59 in reply to:  58 ; Changed 7 years ago by Vincent Delecroix

Status: needs_workneeds_review

Replying to vdelecroix:

From the patchbot ...

wrong ticket... should be #18905. Sorry for the noise.

comment:60 in reply to:  59 Changed 7 years ago by Simon King

Replying to vdelecroix:

Replying to vdelecroix:

From the patchbot ...

wrong ticket... should be #18905. Sorry for the noise.

The patchbot HERE does report this problem. So, what do you mean that it "should be #18905"?

comment:61 Changed 7 years ago by Vincent Delecroix

Status: needs_reviewneeds_work

Right! I confused myself.

comment:62 Changed 7 years ago by Vincent Delecroix

The failing doctest is reproducible after applying this branch onto 6.9.beta1.

comment:63 Changed 7 years ago by Sébastien Labbé

On sage-6.9.beta1, I get:

$ sage nbruin.sage
Time: CPU 7.95 s, Wall: 7.95 s
Time: CPU 7.68 s, Wall: 7.68 s
Time: CPU 76.76 s, Wall: 76.94 s

On sage-6.9.beta1 + #14058, I get a very good improvement:

$ sage nbruin.sage
Time: CPU 2.48 s, Wall: 2.48 s
Time: CPU 2.07 s, Wall: 2.07 s
Time: CPU 32.38 s, Wall: 32.42 s

Note: I originally thought it was a regression in sage since I was getting this earlier this morning with sage-6.8 + #14058. But thanksfully, it seems to be a improvement obtained from this ticket.

The file nbruin.sage is from comment 4:

 $ cat nbruin.sage
a=ZZ(1)
b=QQ(1)
c=ZZ['x'](1)
d=b+c

def test(x,y):
  for i in xrange(10^6):
    _=x+y

time test(a,b)
time test(a,c)
time test(b,c)

comment:64 Changed 7 years ago by Sébastien Labbé

I also get improvements on tests above of Simon King at comments 18.

With sage-6.9.beta1:

$ sage testcoercion.sage
Time: CPU 1.69 s, Wall: 1.69 s
Time: CPU 4.05 s, Wall: 4.05 s
C
Time: CPU 4.03 s, Wall: 4.03 s

With sage-6.9.beta1 + this ticket:

$ sage testcoercion.sage
Time: CPU 0.34 s, Wall: 0.34 s
Time: CPU 0.85 s, Wall: 0.85 s
C
Time: CPU 0.86 s, Wall: 0.86 s

(My file testcoercion.sage contains testcoercion.py + the lines at comment 18.)

comment:65 Changed 7 years ago by Sébastien Labbé

With sage-6.9.beta1 + this ticket, I get All tests passed when I do

./sage -t --long src/sage/structure/coerce.pyx

but I get the problem when I test after the build

./sage -bt --long src/sage/structure/coerce.pyx

which gives the error:

**********************************************************************
File "src/sage/structure/coerce.pyx", line 1307, in sage.structure.coerce.CoercionModel_cache_maps.coercion_maps
Failed example:
    print N2-N0
Expected:
    0
Got:
    -1
**********************************************************************

I have never seen such a behavior.

comment:66 Changed 7 years ago by git

Commit: 8713960717f916d459e0b6a0dcef391dc992d9b2b73d5372cd889d8b5e72e0ac56a781ba4dde9761

Branch pushed to git repo; I updated commit sha1. New commits:

c136ebdMerge #14058 into 6.9.beta1
b73d537Trac #14058: fix broken doctest

comment:67 Changed 7 years ago by git

Commit: b73d5372cd889d8b5e72e0ac56a781ba4dde976113cb2a78983efb8f4edb00d6d10f8fc7da62e80d

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

e153d08#14058: Add doctest
b9141eeMerge branch 'ticket/14058' into develop
90ed181Trivial fix for a coercion doctest
64b572crefcount libsingular rings used in plural
9793dbcMake one test more stable
2c04029make a test against a memory leak clearer/more stable
e976b79Merge branch 'u/SimonKing/weakly_reference_binary_operation_codomains' into 6.9.b0
8713960trac #14058 convert some raise statements into py3 syntax
c136ebdMerge #14058 into 6.9.beta1
13cb2a7Trac #14058: fix broken doctest

comment:68 Changed 7 years ago by Sébastien Labbé

In other doctests in the same file, GF(3), GF(7) and GF(41) are also created. While there are references to GF(7) and GF(41), no variable references to GF(3) which is available for gargage collection. If GF(3) gets garbage collected before the offending doctest, then everything is fine, the doctest pass. But it can also be garbage collected in the call to gc.collect() in the doctest. If this is the case, then N0 was also counting GF(3) which explains why N2-N0 can be negative.

One way to fix this is to call gc.collect() before creating N0.

(my first fix was using sets of ids... sorry for the noise).

comment:69 in reply to:  68 Changed 7 years ago by Simon King

Replying to slabbe:

One way to fix this is to call gc.collect() before creating N0.

I agree, that's the way to fix it.

comment:70 Changed 7 years ago by Sébastien Labbé

Status: needs_workneeds_review

comment:71 Changed 7 years ago by Jean-Pierre Flori

Status: needs_reviewpositive_review

I assume Simon's comment gives positive review here as this was the only thing to fix.

comment:72 Changed 7 years ago by Jean-Pierre Flori

Reviewers: Simon King, Frédéric Chapoton, Jean-Pierre FloriSimon King, Frédéric Chapoton, Jean-Pierre Flori, Sébastien Labbé

comment:73 Changed 7 years ago by Frédéric Chapoton

Milestone: sage-6.4sage-6.9

comment:74 Changed 7 years ago by Volker Braun

Dependencies: #12313

comment:75 Changed 7 years ago by Volker Braun

Branch: public/ticket/1405813cb2a78983efb8f4edb00d6d10f8fc7da62e80d
Resolution: fixed
Status: positive_reviewclosed

comment:76 Changed 7 years ago by Jeroen Demeyer

Commit: 13cb2a78983efb8f4edb00d6d10f8fc7da62e80d

On arando:

sage -t --long src/sage/categories/fields.py
**********************************************************************
File "src/sage/categories/fields.py", line 104, in sage.categories.fields.Fields.__contains__
Failed example:
    len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]) - n
Expected:
    0
Got:
    -1
**********************************************************************

comment:77 Changed 7 years ago by Jeroen Demeyer

See #19244

Note: See TracTickets for help on using tickets.