Opened 7 years ago
Closed 7 years ago
#15424 closed defect (duplicate)
A coercion-related memory leak
Reported by: | SimonKing | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | memleak | Keywords: | coercion model, weak reference |
Cc: | nbruin | Merged in: | |
Authors: | Reviewers: | Simon King | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Yet another leak (this test is with #15303):
sage: import gc sage: K = IntegerModRing(111115) sage: C = type(K) sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) 1 sage: K.has_coerce_map_from(ZZ) True sage: del K sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak 0 sage: a = K.get_action(ZZ, op=operator.add, self_on_left=True) sage: del K,a sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) 0 sage: K = IntegerModRing(111115) sage: a = K.get_action(ZZ, op=operator.mul, self_on_left=True) sage: b = ZZ.get_action(K, op=operator.mul, self_on_left=False) sage: del K,a,b sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak 0 sage: K = IntegerModRing(111115) sage: x = K.one() sage: del K,x sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # no leak 0 sage: K = IntegerModRing(111115) sage: x = K.one()*2 sage: del K,x sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) # LEAK 1
This is astonishing. Wouldn't one think that getting (thus, caching) a coercion or getting an action should trigger a leak, or creating an element? No, we actually need to multiply two elements to find the leak.
Surprising:
sage: K = IntegerModRing(111115) sage: K.get_action(ZZ, op=operator.mul, self_on_left=True) is None True sage: ZZ.get_action(K, op=operator.mul, self_on_left=True) is None True
Shouldn't there be an action? OK, perhaps not, since coercion is used for the multiplication:
sage: cm = sage.structure.element.get_coercion_model() sage: cm.explain(K,ZZ, op=operator.mul) Coercion on right operand via Natural morphism: From: Integer Ring To: Ring of integers modulo 111115 Arithmetic performed after coercions. Result lives in Ring of integers modulo 111115 Ring of integers modulo 111115
But if coercion is used, then why is establishing a coercion not enough to trigger the leak? In a new session:
sage: import gc sage: K = IntegerModRing(111115) sage: C = type(K) sage: phi = K.coerce_map_from(ZZ) sage: x = phi(2) sage: del K,phi,x sage: _ = gc.collect() sage: len([1 for bla in gc.get_objects() if isinstance(bla,C)]) 0
Attachments (1)
Change History (22)
comment:1 follow-up: ↓ 2 Changed 7 years ago by
comment:2 in reply to: ↑ 1 Changed 7 years ago by
Replying to nbruin:
and I found that with
x=K.one()*2
there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the formD[domain,codomain,None]=morphism from domain to codomain
(orNone
entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one.
Hm. Is there any global TripleDict
beside the one in sage.categories.homset
? With grep, I found none. But in this case, the values would not be morphisms but homsets.
comment:3 follow-up: ↓ 4 Changed 7 years ago by
I just notice: #14711 is still in need of review. I am currently building the branch from there, to see if it fixes the problem from here.
comment:4 in reply to: ↑ 3 Changed 7 years ago by
comment:5 Changed 7 years ago by
OK, looking at the referrers to the TripleDict? I get a sage.structure.coerce.CoercionModel_cache_maps sage.structure.coerce.CoercionModel_cache_maps
, so I guess it's not a "global" cache but one stored on an object that has an awfully long lifespan.
Correction: the values are either None
or a pair of maps. So I guess the keys are not domain and codomain but parents of pairs of elements and the pair of maps are the ones that map into the common parent.
So my guess is that this cache is used when determining a common parent and then stores the maps needed to get to that. It seems to me we can mitigate this leak considerably if we DON'T store the maps here if we find a coercion from one into the other would do the job: instead store a symbolic "coerce_to_first" or "coerce_to_second" value in there.
In cases where there is a genuine third parent into which we're mapping, it's a little safer: then the codomain at least doesn't keep alive the keys. It's unclear to me how long we should be keeping the common overparent in that case. This cache will mean its life is bounded below by the shortest life time of the two "covered" parents.
comment:6 Changed 7 years ago by
The attached picture shows that the offending reference chain starts in the module sage.functions.other
(which I have never even heard of) and proceeds (via some __dict__
of 73 items) to CoercionModel_cache_maps
, which then references a TripleDict
.
comment:7 Changed 7 years ago by
Indeed, sage.functions.other
contains the line coercion_model = sage.structure.element.get_coercion_model()
. But I think keeping a reference to the coercion model must be legal. Hence I need to look further.
comment:8 Changed 7 years ago by
Ouch. I thought that in our coercion model, maps and actions are cached on the level of parents. Now I see that they additionally are cached in the coercion model!
Apparently this additional cache is not always used. For example, it is not used if you just do K.coerce_map_from(ZZ)
. But it seems that a coercion map is stored in the additional cache if you do a multiplication.
And then you have a strongly referenced TripleDict
, a key (ZZ,K,None)
and a morphism phi:ZZ->K
as value. Even with #14711, phi
would strongly reference K
. ZZ
is immortal. Hence, the callback for the item (ZZ,K,None):phi
will never be called.
What shall we do about it? Isn't it the case that we are only storing maps that already are cached on the level of parents? Then it would be safe to just store a weak reference to this map, and the problem was solved.
comment:9 Changed 7 years ago by
PS: Similarly for CoercionModel_cache_maps._action_maps
.
comment:10 Changed 7 years ago by
PPS: Which means that we need to make maps weakrefable (actions already are weakrefable).
comment:11 Changed 7 years ago by
- Branch set to u/SimonKing/ticket/15424
- Created changed from 11/15/13 16:35:24 to 11/15/13 16:35:24
- Modified changed from 11/16/13 19:29:18 to 11/16/13 19:29:18
comment:12 Changed 7 years ago by
- Commit set to b052e5eef195a755a7eebeac931bd64d936348b8
- Keywords coercion model weak reference added
- Status changed from new to needs_review
I did not run the complete doctests, but I have added a new test that shows that I have fixed the problem. It is orthogonal to #14711.
As I said above: The idea is to only keep a weak reference from coercion model to coerce maps, which needs to make morphisms weakrefable, and which should be safe because the maps are cached on their dodomain anyway.
New commits:
b052e5e | Coercion model should only store weak references to coerce maps |
comment:13 Changed 7 years ago by
- Work issues set to Convince elliptic curves to not fail
Argh. Schemes seem to hate me.
Almost always when I implement something that fixes a problem, it turns out that some examples from sage.schemes will fail, because the elliptic curve code relies on what I intended to fix.
But this time it is extreme:
sage -t src/sage/schemes/elliptic_curves/ell_rational_field.py # 652 doctests failed
comment:14 Changed 7 years ago by
- Status changed from needs_review to needs_work
comment:15 follow-up: ↓ 16 Changed 7 years ago by
Don't store weakrefs as values in TripleDict
and MonoDict
. That's what we have weakvalues=True
for. Or are you afraid the callback is too expensive to handle?
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 7 years ago by
Replying to nbruin:
Don't store weakrefs as values in
TripleDict
andMonoDict
. That's what we haveweakvalues=True
for.
Oops, I forgot about this.
Or are you afraid the callback is too expensive to handle?
No. CoercionModel_cache_maps._coercion_maps
stores tuples (namely: pairs of coercion maps, in both directions, hence, often (mor,None)
or (None,mor)
). We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.
comment:17 in reply to: ↑ 16 Changed 7 years ago by
Replying to SimonKing:
We can't weakref the tuples. However, we could have a weakrefable extension type with two cdef attributes, so that we can store this instead.
No, this won't work, because there would be no reference back to the pair, and thus it would be immediately garbage collected.
comment:18 follow-up: ↓ 19 Changed 7 years ago by
Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of TripleDict
and MonoDict
, we may be able to afford some more lookup indirection.
comment:19 in reply to: ↑ 18 Changed 7 years ago by
Replying to nbruin:
Found it: #14058 seems relevant. Especially read the sage-devel thread referenced there. We've thought quite a bit about this stuff before. Especially with the newly MUCH better lookup performance of
TripleDict
andMonoDict
, we may be able to afford some more lookup indirection.
OK, then let us see if it fixes the problem.
comment:20 Changed 7 years ago by
- Milestone changed from sage-5.13 to sage-duplicate/invalid/wontfix
- Reviewers set to Simon King
- Status changed from needs_work to positive_review
- Work issues Convince elliptic curves to not fail deleted
Yep, #14058 fixes the problem. Then I give a review to this ticket, saying that it is a duplicate and please be resolved as such.
comment:21 Changed 7 years ago by
- Branch u/SimonKing/ticket/15424 deleted
- Commit b052e5eef195a755a7eebeac931bd64d936348b8 deleted
- Resolution set to duplicate
- Status changed from positive_review to closed
I tried to see where the reference can be coming from:
and I found that with
x=K.one()*2
there is also a TripleDict that show up. It's fairly big (44 entries) and the entries all seem to be of the formD[domain,codomain,None]=morphism from domain to codomain
(orNone
entries). Since the morphism has a strong reference to the codomain, this would keep our ring alive [it doesn't seem like the kind of dictionary that can afford to be weak on its values]. Judging from the entries, this dictionary is a global one. I think there is a ticket somewhere that mentions the same phenomenon (also doing arithmetic), in the context of finite fields. Somewhere from the era when we started to work on #715 in earnest.