Opened 7 years ago
Closed 6 years ago
#12357 closed defect (duplicate)
Make groupoids garbage collectable
Reported by: | SimonKing | Owned by: | rlm |
---|---|---|---|
Priority: | major | Milestone: | sage-duplicate/invalid/wontfix |
Component: | memleak | Keywords: | groupoid cache Cernay2012 |
Cc: | nthiery, jpflori | Merged in: | |
Authors: | Reviewers: | Simon King, Jean-Pierre Flori | |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #715, #11521, #12313, #11943 | Stopgaps: |
Description (last modified by )
Currently, the groupoid of an object P can not be garbage collected, even when deleting P.
Even worse: The persistence of Groupoid(P)
also prevents P from being garbage collected.
The attached patch aims at solving it: An external reference to either P or Groupoid(P)
is enough to keep both alive. But without an external reference, both P and Groupoid(P)
become collectable.
Example from the docs:
sage: P = GF(151)['x','y'] sage: n = id(Groupoid(P)) sage: import gc sage: _ = gc.collect() sage: n == id(Groupoid(P)) # indirect doctest True Thus, the groupoid is cached. But when deleting ``P``, its groupoid can be garbage collected as well:: sage: del P sage: _ = gc.collect() sage: P = GF(151)['x','y'] sage: n == id(Groupoid(P)) False TESTS: We test against some corner cases:: sage: Groupoid(None) Traceback (most recent call last): ... TypeError: Groupoid of None is not defined sage: Groupoid(1) Traceback (most recent call last): ... TypeError: 1 must either allow attribute assignment or be instances of <type 'sage.structure.parent.Parent'> sage: class A: pass sage: a = A() sage: Groupoid(a) Groupoid with underlying set <__main__.A instance at ...> sage: Groupoid(a) is Groupoid(a) True
Superseded by #12313.
Attachments (2)
Change History (31)
comment:1 Changed 7 years ago by
- Status changed from new to needs_review
comment:2 Changed 7 years ago by
To my surprise, there were no nasty old doctests. Hence, make ptest
went well, with the patch from here and its dependencies!
comment:3 Changed 7 years ago by
Concerning timings:
Without the patch, the groupoid of P is stored in a dictionary, indexed by P. Hence, access to the cache is slow if hash(P)
is slow.
Some data points:
sage: P = GF(151)['x','y'] sage: %timeit G = Groupoid(P) # slow hash 625 loops, best of 3: 20.9 µs per loop sage: %timeit G = Groupoid(ZZ) # fast hash 625 loops, best of 3: 1.75 µs per loop sage: class Bar(Parent): pass ....: sage: P = Bar() sage: %timeit G = Groupoid(P) # slow hash 625 loops, best of 3: 15.3 µs per loop
But with the patch, it is stored as an attribute of P. Hence, it is slow if attribute access is slow. The point is that even slow attribute access is faster than slow hash, and slow attribute access is only little slower than fast hash:
sage: P = GF(151)['x','y'] sage: %timeit G = Groupoid(P) # slow attribute access 625 loops, best of 3: 3.11 µs per loop sage: %timeit G = Groupoid(ZZ) # slow attribute access 625 loops, best of 3: 3.1 µs per loop sage: class Bar(Parent): pass ....: sage: P = Bar() sage: %timeit G = Groupoid(P) # fast attribute access 625 loops, best of 3: 1.59 µs per loop
Hence, I believe that the patch is not only fixing a memory leak, but has the potential to generate a speed-up.
comment:4 Changed 7 years ago by
- Keywords groupoid cache Cernay2012 added
- Status changed from needs_review to needs_info
- Work issues set to tests for None?
This is the simplest patch of the cache problems suite, and it does not depend on #715, so I begin with this one.
I'm a little bit confused by all those tests for None in your patch.
Isn't the first one sufficient?
More specifically, the second test "if S is not None" isn't superfluous ?
And if it is not None, can G be None ? (I guess there is some cython voodoo involved here)
To answer my own question, I guess it is none if S is not a Parent (as in your example with 1).
Once you answer me back, I'll post a reviewer patch with some additional minor corrections.
comment:5 Changed 7 years ago by
- Status changed from needs_info to needs_review
Yes, the test "if S is not None" in line 103 seems redundant, as S being None would result in an error being raised in line 92.
Concerning Cython voodoo: If S is not None and not of type Parent, then G == S
in line 106 would result in a type error (since G is cdefined) - which is desired behaviour. But if S were None, then G == S
in line 106 would not complain, and then line 109 would segfault. This is why there must be a test for the corner case - of course, one test would be enough.
comment:6 Changed 7 years ago by
- Status changed from needs_review to positive_review
- Work issues tests for None? deleted
comment:7 Changed 7 years ago by
The reviewer patch looks fine to me. Thank you!
comment:8 Changed 7 years ago by
- Dependencies changed from #12313 to #715, #12313
comment:9 Changed 7 years ago by
- Milestone changed from sage-5.0 to sage-pending
Moving this to sage-pending for now since it depends on two non-positively reviewed tickets.
comment:10 Changed 7 years ago by
- Dependencies changed from #715, #12313 to #715, #12313, #11943
- Work issues set to rebase rel #11943
comment:11 Changed 7 years ago by
- Description modified (diff)
Done!
Apply trac12357_internal_strong_groupoid_cache.patch trac12357_reviewer.patch
comment:12 follow-up: ↓ 15 Changed 7 years ago by
- Status changed from positive_review to needs_work
- Work issues rebase rel #11943 deleted
Something went wrong. Namely, originally, Jean-Pierre reviewed this patch, because it seemed independent of the other weak reference stuff. But meanwhile it depends on the other stuff.
Should one perhaps revert the dependencies? I.e., make it so that the patch here only depends on #11943, and then rebase #715 and #12313 relative to it?
Changed 7 years ago by
Make groupoids garbage collectable, but still use a cache by strong references. Relative #11943
comment:13 Changed 7 years ago by
I had a test that works like "Create an object and store its memory address; delete it, do garbage collection, create the object again and show that the memory address is different from the old".
Of course, that is very fragile. By chance, when applying a totally unrelated ticket, the test breaks for me.
Therefore I made a change in the first patch. The test now verifies that the stored memory address does not occur among the memory addresses of uncollectable objects, after garbage collection.
I did not change the dependencies, yet. But perhaps the reviewer has an opinion on that matter?
Apply trac12357_internal_strong_groupoid_cache.patch trac12357_reviewer.patch
comment:14 Changed 7 years ago by
- Status changed from needs_work to needs_review
comment:15 in reply to: ↑ 12 Changed 7 years ago by
Replying to SimonKing:
Something went wrong. Namely, originally, Jean-Pierre reviewed this patch, because it seemed independent of the other weak reference stuff. But meanwhile it depends on the other stuff.
You're right.
Should one perhaps revert the dependencies? I.e., make it so that the patch here only depends on #11943, and then rebase #715 and #12313 relative to it?
The correct behavior here depends on the application of the three previous tickets, am I right ? So I think its ok to have the three of them as dependencies (as there is no conflict to apply all of them).
comment:16 Changed 7 years ago by
- Dependencies changed from #715, #12313, #11943 to #715, #11521, #12313, #11943
- Reviewers set to Jean-Pierre Flori
- Status changed from needs_review to positive_review
Apart from my above remark, everything is fine on my install of beta13 plus #715, #11521, #12313, #11943 and this ticket.
The modified test is ok too.
From my point of view #715 (with #11521) and #12313 are needed to make the ticket here work, but the converse is not true. And from a practical point of view, #11943 plus the ticket here can be applied before or after #715 and #12313 without conflicts.
So the dependencies look fine to me as they are now, I'm just adding #11521 because it now goes with #715. I may also have missed something.
Rebasing everything to make all of that going the other way around seems to me to be a waste of time.
comment:17 Changed 7 years ago by
comment:18 Changed 7 years ago by
- Milestone changed from sage-pending to sage-5.1
comment:19 Changed 7 years ago by
- Milestone changed from sage-5.1 to sage-pending
comment:20 Changed 7 years ago by
- Status changed from positive_review to needs_work
Unfortunalely, this regularly (not always but quite often) gives Segmentation Faults in sage/schemes/elliptic_curves/gal_reps.py
after the file has been tested:
$ ./sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py" [...] 4 items had no tests: __main__ __main__.change_warning_output __main__.check_with_tolerance __main__.warning_function 25 items passed all tests: 26 tests in __main__.example_0 [...] 10 tests in __main__.example_9 263 tests in 29 items. 263 passed and 0 failed. Test passed. Segmentation fault [21.5 s] ---------------------------------------------------------------------- The following tests failed: sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py" Total time for all tests: 21.6 seconds
This is with sage-5.1.beta4 + #12357 (beta4 is not yet released, but essentially finished).
comment:21 Changed 6 years ago by
comment:22 Changed 6 years ago by
- Status changed from needs_work to needs_review
Patchbot, apply trac12357_internal_strong_groupoid_cache.patch attachment:trac12357_reviewer.patch
comment:23 Changed 6 years ago by
- Status changed from needs_review to needs_work
I got one doctest error, which unfortunately shows that the patch does not solve the problem it was created for:
File "/home/simon/SAGE/debug/sage-5.6.rc0/devel/sage/sage/categories/groupoid.pyx", line 64: sage: n in map(id, gc.get_objects()) Expected: False Got: True
Hence, some garbage that was supposed to get collected did not become collected. And there also seems to be a timeout in
sage -t -force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst"
comment:24 Changed 6 years ago by
In fact, the polynomial ring in the failing example does not become collected, even without creating its groupoid. I thought that this would have been fixed in one of the dependencies?
sage: P = GF(151)['x','y'] sage: import gc sage: m = id(P) sage: del P sage: _ = gc.collect() sage: m in map(id, gc.get_objects()) True
comment:25 Changed 6 years ago by
Perhaps I should use a test that avoids polynomial rings, but uses a custom UniqueRepresentation
. Namely, #12215 makes UniqueRepresentation
collectable, and the patch from here makes the groupoid collectable as well, namely:
sage: class MyParent(UniqueRepresentation, Parent): ....: pass ....: sage: P = MyParent() sage: import gc sage: m = id(P) sage: n = id(Groupoid(P)) sage: m in map(id, gc.get_objects()) True sage: n in map(id, gc.get_objects()) True sage: del P sage: _ = gc.collect() sage: m in map(id, gc.get_objects()) False sage: n in map(id, gc.get_objects()) False
What do you think?
By the way, the timeout is no problem: I had closed my laptop during the tests. Hence, the tests were interrupted, but the time was running.
comment:26 Changed 6 years ago by
PS: In unpatched sage-5.6.rc0, one would obtain
sage: P = Parent() sage: import gc sage: m = id(P) sage: n = id(Groupoid(P)) sage: m in map(id, gc.get_objects()) True sage: n in map(id, gc.get_objects()) True sage: del P sage: _ = gc.collect() sage: m in map(id, gc.get_objects()) True sage: n in map(id, gc.get_objects()) True
And this is even without using a cache for P!!
Hence, the tests does demonstrate that some leak was fixed.
comment:27 Changed 6 years ago by
Apart from changing the test, I'd like to change something more. Namely, it is stated that UniqueRepresentation
is using cached_function - but that is wrong by #12215, which is a dependency of this ticket.
The main idea of storing the groupoid of P as an attribute of P is to avoid creating an external strong reference on the groupoid. But that would not be the case, when using a weak_cached_function for caching.
Hence, it could actually be that #12215 already fixes what is attempted here! I'll test that.
comment:28 Changed 6 years ago by
- Milestone changed from sage-pending to sage-duplicate/invalid/wontfix
- Status changed from needs_work to positive_review
Indeed. With just #12215 and #12313 on top of sage-5.6.rc0, one obtains:
sage: P = Parent() sage: m = id(P) sage: n = id(Groupoid(P)) sage: import gc sage: m in map(id, gc.get_objects()) True sage: n in map(id, gc.get_objects()) True sage: del P sage: _ = gc.collect() sage: m in map(id, gc.get_objects()) False sage: n in map(id, gc.get_objects()) False
Hence, I suggest to resolve this as a duplicate. Jean-Pierre, please change if you disagree.
comment:29 Changed 6 years ago by
- Description modified (diff)
- Resolution set to duplicate
- Reviewers changed from Jean-Pierre Flori to Simon King, Jean-Pierre Flori
- Status changed from positive_review to closed
I did not run the full test suite yet. It could be that some silly old doctests use the groupoids in a way they are not intended, such as
Groupoid(1)
. If this is the case, then my patch has to change. Otherwise: Needs review!