Opened 7 years ago

Closed 7 years ago

#15424 closed defect (duplicate)

A coercion-related memory leak

Reported by: SimonKing Owned by:
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: memleak Keywords: coercion model, weak reference
Cc: nbruin Merged in:
Authors: Reviewers: Simon King
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

Yet another leak (this test is with #15303):

sage: import gc
sage: K = IntegerModRing(111115)
sage: C = type(K)
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
1
sage: K.has_coerce_map_from(ZZ)
True
sage: del K
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: a = K.get_action(ZZ, op=operator.add, self_on_left=True)
sage: del K,a
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
0
sage: K = IntegerModRing(111115)
sage: a = K.get_action(ZZ, op=operator.mul, self_on_left=True)
sage: b = ZZ.get_action(K, op=operator.mul, self_on_left=False)
sage: del K,a,b
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: K = IntegerModRing(111115)
sage: x = K.one()
sage: del K,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak
0
sage: K = IntegerModRing(111115)
sage: x = K.one()*2
sage: del K,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # LEAK
1

This is astonishing. Wouldn't one think that getting (thus, caching) a coercion or getting an action should trigger a leak, or creating an element? No, we actually need to multiply two elements to find the leak.

Surprising:

sage: K = IntegerModRing(111115)
sage: K.get_action(ZZ, op=operator.mul, self_on_left=True) is None
True
sage: ZZ.get_action(K, op=operator.mul, self_on_left=True) is None
True

Shouldn't there be an action? OK, perhaps not, since coercion is used for the multiplication:

sage: cm = sage.structure.element.get_coercion_model()
sage: cm.explain(K,ZZ, op=operator.mul)
Coercion on right operand via
    Natural morphism:
      From: Integer Ring
      To:   Ring of integers modulo 111115
Arithmetic performed after coercions.
Result lives in Ring of integers modulo 111115
Ring of integers modulo 111115

But if coercion is used, then why is establishing a coercion not enough to trigger the leak? In a new session:

sage: import gc
sage: K = IntegerModRing(111115)
sage: C = type(K)
sage: phi = K.coerce_map_from(ZZ)
sage: x = phi(2)
sage: del K,phi,x
sage: _ = gc.collect()
sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)])
0

Attachments (1)

chain.png (36.4 KB) - added by SimonKing 7 years ago.
A reference chain preventing an integer mod ring from garbage collection

Download all attachments as: .zip

Change History (22)

comment:1 follow-up: Changed 7 years ago by nbruin

I tried to see where the reference can be coming from:

%cpaste
import gc
from sage.structure.coerce_dict import *
def all_referrers(c,X):
    found_IDs=set()
    R=[]
    new=gc.get_referrers(c)
    found_IDs.add(id(new))
    found_IDs.add(id(R))
    found_IDs.add(id(globals()))    
    while len(new)>0:
        r=new.pop()
        if id(r) in found_IDs or type(r) not in X:
            print "skipping",type(r)
            continue
        R.append(r)
        found_IDs.add(id(r))
        new.extend(gc.get_referrers(r))
        print "type(r)=%s len(R)=%s len(new)=%s"%(type(r),len(R),len(new))
    return R

def getR():
    K = IntegerModRing(111115)
    C = type(K)
    phi = K.coerce_map_from(ZZ)
    del phi
    #vary this line:
    x = K.one()*2
    #del K,x
    _ = gc.collect()
    X=set([list,dict,tuple,
           RingHomset_generic_with_category,
           sage.rings.finite_rings.integer_mod.Integer_to_IntegerMod,
           TripleDict, MonoDict, C])
    R=all_referrers(list(c for c in gc.get_objects() if isinstance(c,C))[0],X)
    return R
--
RingHomset_generic_with_category=type(ZZ.Hom(QQ))
R=getR()

and I found that with x=K.one()*2 there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the form D[domain,codomain,None]=morphism from domain to codomain (or None entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one. I think there is a ticket somewhere that mentions the same phenomenon (also doing arithmetic), in the context of finite fields. Somewhere from the era when we started to work on #715 in earnest.

comment:2 in reply to: ↑ 1 Changed 7 years ago by SimonKing

Replying to nbruin:

and I found that with x=K.one()*2 there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the form D[domain,codomain,None]=morphism from domain to codomain (or None entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one.

Hm. Is there any global TripleDict beside the one in sage.categories.homset? With grep, I found none. But in this case, the values would not be morphisms but homsets.

comment:3 follow-up: Changed 7 years ago by SimonKing

I just notice: #14711 is still in need of review. I am currently building the branch from there, to see if it fixes the problem from here.

comment:4 in reply to: ↑ 3 Changed 7 years ago by SimonKing

Replying to SimonKing:

I just notice: #14711 is still in need of review. I am currently building the branch from there, to see if it fixes the problem from here.

No, it does not.

comment:5 Changed 7 years ago by nbruin

OK, looking at the referrers to the TripleDict? I get a sage.structure.coerce.CoercionModel_cache_maps sage.structure.coerce.CoercionModel_cache_maps, so I guess it's not a "global" cache but one stored on an object that has an awfully long lifespan.

Correction: the values are either None or a pair of maps. So I guess the keys are not domain and codomain but parents of pairs of elements and the pair of maps are the ones that map into the common parent.

So my guess is that this cache is used when determining a common parent and then stores the maps needed to get to that. It seems to me we can mitigate this leak considerably if we DON'T store the maps here if we find a coercion from one into the other would do the job: instead store a symbolic "coerce_to_first" or "coerce_to_second" value in there.

In cases where there is a genuine third parent into which we're mapping, it's a little safer: then the codomain at least doesn't keep alive the keys. It's unclear to me how long we should be keeping the common overparent in that case. This cache will mean its life is bounded below by the shortest life time of the two "covered" parents.

Changed 7 years ago by SimonKing

A reference chain preventing an integer mod ring from garbage collection

comment:6 Changed 7 years ago by SimonKing

The attached picture shows that the offending reference chain starts in the module sage.functions.other (which I have never even heard of) and proceeds (via some __dict__ of 73 items) to CoercionModel_cache_maps, which then references a TripleDict.

comment:7 Changed 7 years ago by SimonKing

Indeed, sage.functions.other contains the line coercion_model = sage.structure.element.get_coercion_model(). But I think keeping a reference to the coercion model must be legal. Hence I need to look further.

comment:8 Changed 7 years ago by SimonKing

Ouch. I thought that in our coercion model, maps and actions are cached on the level of parents. Now I see that they additionally are cached in the coercion model!

Apparently this additional cache is not always used. For example, it is not used if you just do K.coerce_map_from(ZZ). But it seems that a coercion map is stored in the additional cache if you do a multiplication.

And then you have a strongly referenced TripleDict, a key (ZZ,K,None) and a morphism phi:ZZ->K as value. Even with #14711, phi would strongly reference K. ZZ is immortal. Hence, the callback for the item (ZZ,K,None):phi will never be called.

What shall we do about it? Isn't it the case that we are only storing maps that already are cached on the level of parents? Then it would be safe to just store a weak reference to this map, and the problem was solved.

comment:9 Changed 7 years ago by SimonKing

PS: Similarly for CoercionModel_cache_maps._action_maps.

comment:10 Changed 7 years ago by SimonKing

PPS: Which means that we need to make maps weakrefable (actions already are weakrefable).

comment:11 Changed 7 years ago by SimonKing

  • Branch set to u/SimonKing/ticket/15424
  • Created changed from 11/15/13 16:35:24 to 11/15/13 16:35:24
  • Modified changed from 11/16/13 19:29:18 to 11/16/13 19:29:18

comment:12 Changed 7 years ago by SimonKing

  • Authors set to Simon King
  • Commit set to b052e5eef195a755a7eebeac931bd64d936348b8
  • Keywords coercion model weak reference added
  • Status changed from new to needs_review

I did not run the complete doctests, but I have added a new test that shows that I have fixed the problem. It is orthogonal to #14711.

As I said above: The idea is to only keep a weak reference from coercion model to coerce maps, which needs to make morphisms weakrefable, and which should be safe because the maps are cached on their dodomain anyway.


New commits:

b052e5eCoercion model should only store weak references to coerce maps

comment:13 Changed 7 years ago by SimonKing

  • Work issues set to Convince elliptic curves to not fail

Argh. Schemes seem to hate me.

Almost always when I implement something that fixes a problem, it turns out that some examples from sage.schemes will fail, because the elliptic curve code relies on what I intended to fix.

But this time it is extreme:

sage -t src/sage/schemes/elliptic_curves/ell_rational_field.py  # 652 doctests failed

comment:14 Changed 7 years ago by SimonKing

  • Status changed from needs_review to needs_work

comment:15 follow-up: Changed 7 years ago by nbruin

Don't store weakrefs as values in TripleDict and MonoDict. That's what we have weakvalues=True for. Or are you afraid the callback is too expensive to handle?

comment:16 in reply to: ↑ 15 ; follow-up: Changed 7 years ago by SimonKing

Replying to nbruin:

Don't store weakrefs as values in TripleDict and MonoDict. That's what we have weakvalues=True for.

Oops, I forgot about this.

Or are you afraid the callback is too expensive to handle?

No. CoercionModel_cache_maps._coercion_maps stores tuples (namely: pairs of coercion maps, in both directions, hence, often (mor,None) or (None,mor)). We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.

comment:17 in reply to: ↑ 16 Changed 7 years ago by SimonKing

Replying to SimonKing:

We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.

No, this won't work, because there would be no reference back to the pair, and thus it would be immediately garbage collected.

comment:18 follow-up: Changed 7 years ago by nbruin

Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of TripleDict and MonoDict, we may be able to afford some more lookup indirection.

Last edited 7 years ago by nbruin (previous) (diff)

comment:19 in reply to: ↑ 18 Changed 7 years ago by SimonKing

Replying to nbruin:

Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of TripleDict and MonoDict, we may be able to afford some more lookup indirection.

OK, then let us see if it fixes the problem.

comment:20 Changed 7 years ago by SimonKing

  • Authors Simon King deleted
  • Milestone changed from sage-5.13 to sage-duplicate/invalid/wontfix
  • Reviewers set to Simon King
  • Status changed from needs_work to positive_review
  • Work issues Convince elliptic curves to not fail deleted

Yep, #14058 fixes the problem. Then I give a review to this ticket, saying that it is a duplicate and please be resolved as such.

comment:21 Changed 7 years ago by jdemeyer

  • Branch u/SimonKing/ticket/15424 deleted
  • Commit b052e5eef195a755a7eebeac931bd64d936348b8 deleted
  • Resolution set to duplicate
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.