Opened 7 years ago

Closed 6 years ago

#12357 closed defect (duplicate)

Make groupoids garbage collectable

Reported by: SimonKing Owned by: rlm
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: memleak Keywords: groupoid cache Cernay2012
Cc: nthiery, jpflori Merged in:
Authors: Reviewers: Simon King, Jean-Pierre Flori
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #715, #11521, #12313, #11943 Stopgaps:

Description (last modified by jdemeyer)

Currently, the groupoid of an object P can not be garbage collected, even when deleting P.

Even worse: The persistence of Groupoid(P) also prevents P from being garbage collected.

The attached patch aims at solving it: An external reference to either P or Groupoid(P) is enough to keep both alive. But without an external reference, both P and Groupoid(P) become collectable.

Example from the docs:

            sage: P = GF(151)['x','y']
            sage: n = id(Groupoid(P))
            sage: import gc
            sage: _ = gc.collect()
            sage: n == id(Groupoid(P))   # indirect doctest
            True

        Thus, the groupoid is cached. But when deleting ``P``, its
        groupoid can be garbage collected as well::

            sage: del P
            sage: _ = gc.collect()
            sage: P = GF(151)['x','y']
            sage: n == id(Groupoid(P))
            False

        TESTS:

        We test against some corner cases::

            sage: Groupoid(None)
            Traceback (most recent call last):
            ...
            TypeError: Groupoid of None is not defined
            sage: Groupoid(1)
            Traceback (most recent call last):
            ...
            TypeError: 1 must either allow attribute assignment or be instances of <type 'sage.structure.parent.Parent'>
            sage: class A: pass
            sage: a = A()
            sage: Groupoid(a)
            Groupoid with underlying set <__main__.A instance at ...>
            sage: Groupoid(a) is Groupoid(a)
            True

Superseded by #12313.

Attachments (2)

trac12357_reviewer.patch (3.7 KB) - added by jpflori 7 years ago.
Reviewer patch
trac12357_internal_strong_groupoid_cache.patch (4.3 KB) - added by SimonKing 7 years ago.
Make groupoids garbage collectable, but still use a cache by strong references. Relative #11943

Download all attachments as: .zip

Change History (31)

comment:1 Changed 7 years ago by SimonKing

  • Status changed from new to needs_review

I did not run the full test suite yet. It could be that some silly old doctests use the groupoids in a way they are not intended, such as Groupoid(1). If this is the case, then my patch has to change. Otherwise: Needs review!

comment:2 Changed 7 years ago by SimonKing

To my surprise, there were no nasty old doctests. Hence, make ptest went well, with the patch from here and its dependencies!

comment:3 Changed 7 years ago by SimonKing

Concerning timings:

Without the patch, the groupoid of P is stored in a dictionary, indexed by P. Hence, access to the cache is slow if hash(P) is slow.

Some data points:

sage: P = GF(151)['x','y']
sage: %timeit G = Groupoid(P)  # slow hash
625 loops, best of 3: 20.9 µs per loop
sage: %timeit G = Groupoid(ZZ) # fast hash
625 loops, best of 3: 1.75 µs per loop
sage: class Bar(Parent): pass
....: 
sage: P = Bar()
sage: %timeit G = Groupoid(P)  # slow hash
625 loops, best of 3: 15.3 µs per loop

But with the patch, it is stored as an attribute of P. Hence, it is slow if attribute access is slow. The point is that even slow attribute access is faster than slow hash, and slow attribute access is only little slower than fast hash:

sage: P = GF(151)['x','y']
sage: %timeit G = Groupoid(P)  # slow attribute access
625 loops, best of 3: 3.11 µs per loop
sage: %timeit G = Groupoid(ZZ) # slow attribute access
625 loops, best of 3: 3.1 µs per loop
sage: class Bar(Parent): pass
....: 
sage: P = Bar()
sage: %timeit G = Groupoid(P)  # fast attribute access
625 loops, best of 3: 1.59 µs per loop

Hence, I believe that the patch is not only fixing a memory leak, but has the potential to generate a speed-up.

comment:4 Changed 7 years ago by jpflori

  • Keywords groupoid cache Cernay2012 added
  • Status changed from needs_review to needs_info
  • Work issues set to tests for None?

This is the simplest patch of the cache problems suite, and it does not depend on #715, so I begin with this one.

I'm a little bit confused by all those tests for None in your patch.

Isn't the first one sufficient?

More specifically, the second test "if S is not None" isn't superfluous ?

And if it is not None, can G be None ? (I guess there is some cython voodoo involved here)

To answer my own question, I guess it is none if S is not a Parent (as in your example with 1).

Once you answer me back, I'll post a reviewer patch with some additional minor corrections.

comment:5 Changed 7 years ago by SimonKing

  • Status changed from needs_info to needs_review

Yes, the test "if S is not None" in line 103 seems redundant, as S being None would result in an error being raised in line 92.

Concerning Cython voodoo: If S is not None and not of type Parent, then G == S in line 106 would result in a type error (since G is cdefined) - which is desired behaviour. But if S were None, then G == S in line 106 would not complain, and then line 109 would segfault. This is why there must be a test for the corner case - of course, one test would be enough.

comment:6 Changed 7 years ago by jpflori

  • Status changed from needs_review to positive_review
  • Work issues tests for None? deleted

Ok, I've posted a reviewer patch removing the second test and adding some comments on the code.

Everything is fine for me (as long as further modification in #715 and #12313 do not affect the ticket here).

So I'll put is as positive review now.

Changed 7 years ago by jpflori

Reviewer patch

comment:7 Changed 7 years ago by SimonKing

The reviewer patch looks fine to me. Thank you!

comment:8 Changed 7 years ago by jdemeyer

  • Dependencies changed from #12313 to #715, #12313

comment:9 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.0 to sage-pending

Moving this to sage-pending for now since it depends on two non-positively reviewed tickets.

comment:10 Changed 7 years ago by SimonKing

  • Dependencies changed from #715, #12313 to #715, #12313, #11943
  • Work issues set to rebase rel #11943

There is a conflict with #11943.

  • The ticket here is pending anyway
  • #11943 has only one dependency, that is already merged
  • #11943 is more fundamental than the patch here.

Therefore, I suggest to rebase my patch with respect to #11943.

comment:11 Changed 7 years ago by SimonKing

  • Description modified (diff)

Done!

Apply trac12357_internal_strong_groupoid_cache.patch trac12357_reviewer.patch

comment:12 follow-up: Changed 7 years ago by SimonKing

  • Status changed from positive_review to needs_work
  • Work issues rebase rel #11943 deleted

Something went wrong. Namely, originally, Jean-Pierre reviewed this patch, because it seemed independent of the other weak reference stuff. But meanwhile it depends on the other stuff.

Should one perhaps revert the dependencies? I.e., make it so that the patch here only depends on #11943, and then rebase #715 and #12313 relative to it?

Changed 7 years ago by SimonKing

Make groupoids garbage collectable, but still use a cache by strong references. Relative #11943

comment:13 Changed 7 years ago by SimonKing

I had a test that works like "Create an object and store its memory address; delete it, do garbage collection, create the object again and show that the memory address is different from the old".

Of course, that is very fragile. By chance, when applying a totally unrelated ticket, the test breaks for me.

Therefore I made a change in the first patch. The test now verifies that the stored memory address does not occur among the memory addresses of uncollectable objects, after garbage collection.

I did not change the dependencies, yet. But perhaps the reviewer has an opinion on that matter?

Apply trac12357_internal_strong_groupoid_cache.patch trac12357_reviewer.patch

comment:14 Changed 7 years ago by SimonKing

  • Status changed from needs_work to needs_review

comment:15 in reply to: ↑ 12 Changed 7 years ago by jpflori

Replying to SimonKing:

Something went wrong. Namely, originally, Jean-Pierre reviewed this patch, because it seemed independent of the other weak reference stuff. But meanwhile it depends on the other stuff.

You're right.

Should one perhaps revert the dependencies? I.e., make it so that the patch here only depends on #11943, and then rebase #715 and #12313 relative to it?

The correct behavior here depends on the application of the three previous tickets, am I right ? So I think its ok to have the three of them as dependencies (as there is no conflict to apply all of them).

comment:16 Changed 7 years ago by jpflori

  • Dependencies changed from #715, #12313, #11943 to #715, #11521, #12313, #11943
  • Reviewers set to Jean-Pierre Flori
  • Status changed from needs_review to positive_review

Apart from my above remark, everything is fine on my install of beta13 plus #715, #11521, #12313, #11943 and this ticket.

The modified test is ok too.

From my point of view #715 (with #11521) and #12313 are needed to make the ticket here work, but the converse is not true. And from a practical point of view, #11943 plus the ticket here can be applied before or after #715 and #12313 without conflicts.

So the dependencies look fine to me as they are now, I'm just adding #11521 because it now goes with #715. I may also have missed something.

Rebasing everything to make all of that going the other way around seems to me to be a waste of time.

comment:17 Changed 7 years ago by jpflori

I've put the ticket back to positive review because I feel that #715 and #12313 will get positive review as well.

comment:18 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-pending to sage-5.1

comment:19 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.1 to sage-pending

comment:20 Changed 7 years ago by jdemeyer

  • Status changed from positive_review to needs_work

Unfortunalely, this regularly (not always but quite often) gives Segmentation Faults in sage/schemes/elliptic_curves/gal_reps.py after the file has been tested:

$ ./sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py"
[...]
4 items had no tests:
    __main__
    __main__.change_warning_output
    __main__.check_with_tolerance
    __main__.warning_function
25 items passed all tests:
  26 tests in __main__.example_0
[...]
  10 tests in __main__.example_9
263 tests in 29 items.
263 passed and 0 failed.
Test passed.
Segmentation fault
         [21.5 s]

----------------------------------------------------------------------
The following tests failed:


        sage -t --verbose "devel/sage/sage/schemes/elliptic_curves/gal_reps.py"
Total time for all tests: 21.6 seconds

This is with sage-5.1.beta4 + #12357 (beta4 is not yet released, but essentially finished).

comment:21 Changed 6 years ago by SimonKing

Jeroen, does the segmentation fault persists, now that #715 seems fine due to Cython upgrade?

Anyway, I am now running make ptest with MALLOCK_CHECK_=3 on sage-5.6.rc0 debug version (#13864) plus #12215, #12313 and #12357.

comment:22 Changed 6 years ago by SimonKing

  • Status changed from needs_work to needs_review

Patchbot, apply trac12357_internal_strong_groupoid_cache.patch attachment:trac12357_reviewer.patch

comment:23 Changed 6 years ago by SimonKing

  • Status changed from needs_review to needs_work

I got one doctest error, which unfortunately shows that the patch does not solve the problem it was created for:

File "/home/simon/SAGE/debug/sage-5.6.rc0/devel/sage/sage/categories/groupoid.pyx", line 64:
    sage: n in map(id, gc.get_objects())
Expected:
    False
Got:
    True

Hence, some garbage that was supposed to get collected did not become collected. And there also seems to be a timeout in

sage -t -force_lib "devel/sage/doc/en/bordeaux_2008/birds_other.rst"

comment:24 Changed 6 years ago by SimonKing

In fact, the polynomial ring in the failing example does not become collected, even without creating its groupoid. I thought that this would have been fixed in one of the dependencies?

sage: P = GF(151)['x','y'] 
sage: import gc
sage: m = id(P)
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
True

comment:25 Changed 6 years ago by SimonKing

Perhaps I should use a test that avoids polynomial rings, but uses a custom UniqueRepresentation. Namely, #12215 makes UniqueRepresentation collectable, and the patch from here makes the groupoid collectable as well, namely:

sage: class MyParent(UniqueRepresentation, Parent):
....:     pass
....: 
sage: P = MyParent()
sage: import gc
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
False
sage: n in map(id, gc.get_objects()) 
False

What do you think?

By the way, the timeout is no problem: I had closed my laptop during the tests. Hence, the tests were interrupted, but the time was running.

comment:26 Changed 6 years ago by SimonKing

PS: In unpatched sage-5.6.rc0, one would obtain

sage: P = Parent()
sage: import gc
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True

And this is even without using a cache for P!!

Hence, the tests does demonstrate that some leak was fixed.

comment:27 Changed 6 years ago by SimonKing

Apart from changing the test, I'd like to change something more. Namely, it is stated that UniqueRepresentation is using cached_function - but that is wrong by #12215, which is a dependency of this ticket.

The main idea of storing the groupoid of P as an attribute of P is to avoid creating an external strong reference on the groupoid. But that would not be the case, when using a weak_cached_function for caching.

Hence, it could actually be that #12215 already fixes what is attempted here! I'll test that.

comment:28 Changed 6 years ago by SimonKing

  • Milestone changed from sage-pending to sage-duplicate/invalid/wontfix
  • Status changed from needs_work to positive_review

Indeed. With just #12215 and #12313 on top of sage-5.6.rc0, one obtains:

sage: P = Parent()
sage: m = id(P)
sage: n = id(Groupoid(P)) 
sage: import gc
sage: m in map(id, gc.get_objects()) 
True
sage: n in map(id, gc.get_objects()) 
True
sage: del P
sage: _ = gc.collect() 
sage: m in map(id, gc.get_objects()) 
False
sage: n in map(id, gc.get_objects()) 
False

Hence, I suggest to resolve this as a duplicate. Jean-Pierre, please change if you disagree.

comment:29 Changed 6 years ago by jdemeyer

  • Authors Simon King deleted
  • Description modified (diff)
  • Resolution set to duplicate
  • Reviewers changed from Jean-Pierre Flori to Simon King, Jean-Pierre Flori
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.