Opened 4 years ago

Closed 4 years ago

#15432 closed enhancement (fixed)

Use a callback with a weak reference to WeakValueDictionary

Reported by: nbruin Owned by:
Priority: major Milestone: sage-6.1
Component: memleak Keywords:
Cc: SimonKing Merged in:
Authors: Nils Bruin Reviewers: Simon King
Report Upstream: N/A Work issues:
Branch: u/SimonKing/ticket/15432 (Commits) Commit: 13112a9a5b664bcbd467cc6c9b197ae809ab44fb
Dependencies: Stopgaps:

Description

Following http://bugs.python.org/issue417795, we should probably only reference the dict weakly from the callback object to avoid unnecessary circular references.

As we found in http://trac.sagemath.org/ticket/15367#comment:39, if an object C involved in a reference cycle becomes unreachable as a side effect of circular GC (e.g., a callback somewhere), the object C will only be collected by GC in the next round. A non-circularly referenced object C would have its reference count hit 0 and would be collected immediately. That is already a good reason to try and avoid circular references, even now that Python does clean them up eventually.

See also http://trac.sagemath.org/ticket/13394#comment:19 (reasoning the validity of using a strongly referencing closure should be fine in theory -- this didn't take into account the practical consideration above).

Change History (28)

comment:1 Changed 4 years ago by nbruin

  • Branch set to u/nbruin/ticket/15432
  • Created changed from 11/17/13 20:07:53 to 11/17/13 20:07:53
  • Modified changed from 11/17/13 20:07:53 to 11/17/13 20:07:53

comment:2 Changed 4 years ago by nbruin

  • Commit set to 9bb9241572468d4bc0563f06fe6fffcc8b12c345
  • Status changed from new to needs_review

New commits:

9bb9241trac #15432: use a callback class object holding a weak reference for sage.misc.weak_dict.WeakValueDictionary?

comment:3 follow-up: Changed 4 years ago by SimonKing

Just to make sure: Is there a dependency for this ticket? It seems that we need #13394 (which is merged). But #15367 is orthogonal, isn't it?

comment:4 in reply to: ↑ 3 Changed 4 years ago by nbruin

Replying to SimonKing:

Just to make sure: Is there a dependency for this ticket? It seems that we need #13394 (which is merged). But #15367 is orthogonal, isn't it?

The branch here descends directly from master, which recently had 5.13beta3 merged. Therefore, as posted, this ticket has no dependencies. Since it only touches one file, that isn't even present in the branch for #15367 (its branch split off before 5.13beta3 merger), I expect no merge problems with that ticket.

I actually have a bit of trouble coming up with an example that shows a difference in behaviour for WeakValueDictionary (before this ticket it's not weakly refererrable, making examples particularly tricky). With the following code:

%cpaste
import gc
from sage.misc.weak_dict import WeakValueDictionary
from sage.structure.coerce_dict import MonoDict
def Nobj(t):
    return len([1 for o in gc.get_objects() if isinstance(o,t)])
def TestTcollect(T,args=()):
    N=Nobj(T)
    print "Number of T-objects before we start:",N
    L=[T(*args) for i in range(100)]
    M=MonoDict(11)
    for i in range(len(L)-1):
        M[L[i]]=L[i+1]
    Nnew=Nobj(T)
    print "Number of T-objects after list construction:",Nnew
    print "Throwing away references:"
    del L
    Nnew=Nobj(T)
    if Nnew == N:
        print "no GC required to get rid of objects";
    j=0
    while Nnew>N:
        j+=1
        _=gc.collect()
        Ntemp=Nobj(T)
        if Ntemp >= Nnew:
            raise RuntimeError("we're not collecting at all!")
        Nnew=Ntemp
    print "number of collects required:",j
class dd(dict):
    "a weakly reffable and easily recognized dict class"
    pass
class selfref(object):
    def __init__(self):
        self.D=self
--

I'm getting with a self-referencing class:

sage: TestTcollect(selfref)
Number of T-objects before we start: 0
Number of T-objects after list construction: 100
Throwing away references:
number of collects required: 100

The test doesn't work for dict itself (they have their own freelist etc. so the object count you get back can't be trusted), but we can do it on a subclass:

sage: TestTcollect(dd)
Number of T-objects before we start: 0
Number of T-objects after list construction: 100
Throwing away references:
no GC required to get rid of objects
number of collects required: 0

For this patched WeakValueDictionary we're still getting problematic behaviour:

sage: TestTcollect(sage.misc.weak_dict.WeakValueDictionary)
Number of T-objects before we start: 29
Number of T-objects after list construction: 129
Throwing away references:
number of collects required: 100

Whereas python's own is clean:

sage: TestTcollect(weakref.WeakValueDictionary)
Number of T-objects before we start: 3
Number of T-objects after list construction: 103
Throwing away references:
no GC required to get rid of objects
number of collects required: 0

As is (at least our old) TripleDict:

sage: TestTcollect(sage.structure.coerce_dict.TripleDict,(11,))
Number of T-objects before we start: 180
Number of T-objects after list construction: 280
Throwing away references:
no GC required to get rid of objects
number of collects required: 0

So my guess is that we've overlooked another self reference in WeakValueDictionary.

Note that our previous sage.misc.weak_dict.WeakValueDictionary wasn't weakly referrable so this test doesn't apply in that case.

comment:5 Changed 4 years ago by git

  • Commit changed from 9bb9241572468d4bc0563f06fe6fffcc8b12c345 to 8b70d1d8f68fdfe6ae020c37a43e91031c6fe952

Branch pushed to git repo; I updated commit sha1. New commits:

8b70d1dtrac #15432: also equip iteration context with a weak reference

comment:6 Changed 4 years ago by nbruin

And of course there is another circular reference from the _IterationContext. Making that weak as well makes the thing collectable, so with the new branch we get that TestTcollect works as desired.

Note that if you have objects chained by keys in a WeakValueDictionary, immediate collectibility of the chain could depend on the WeakValueDictionary? not keeping circular references to itself.

So, I think this patch could have real effect of memory footprint in sage code.

comment:7 Changed 4 years ago by git

  • Commit changed from 8b70d1d8f68fdfe6ae020c37a43e91031c6fe952 to 4ac686619284500adaf201c397a82a9b7409e41f

Branch pushed to git repo; I updated commit sha1. New commits:

4ac6866trac #15432: Code context manager in iterator via try/finally. This saves an entire class and some reference management.Z

comment:8 Changed 4 years ago by nbruin

Incidentally, I think we can do away with the _IterationContext by using a try/finally:

cython("""
def gen():
    try:
        while True:
            yield 1
    finally:
        print "finally clause executed"
""")
sage: G=gen()
sage: G.next()
1
sage: del G
finally clause executed
sage: G=gen()
sage: del G  #code is never entered, so `try/finally` clause is also not encountered

So instead of using a with context manager, I think we can get away with writing our iterators as

        cdef PyObject *key, *wr
        cdef Py_ssize_t pos = 0
        try:
            self._enter_iter()
            while PyDict_Next(self, &pos, &key, &wr):
                #this check doesn't really say anything: by the time
                #the key makes it to the customer, it may have already turned
                #invalid. It's a cheap check, though.
                if PyWeakref_GetObject(wr)!=Py_None:
                    yield <object>key
        finally:
            self._exit_iter()

where _enter_iter and _exit_iter can be the __enter__ and __exit__ methods on _IterationContext, adapted to be methods of WeakValueDictionary immediately. That saves us a weakref and an extra class (see pushed commit).


New commits:

4ac6866trac #15432: Code context manager in iterator via try/finally. This saves an entire class and some reference management.Z

comment:9 Changed 4 years ago by vbraun_spam

  • Milestone changed from sage-6.0 to sage-6.1

comment:10 Changed 4 years ago by vbraun

  • Authors set to Nils Bruin

comment:11 Changed 4 years ago by SimonKing

The code looks good, and using a cdef method instead of a Python context is very likely to be faster (I am not going to benchmark, though). I am running the tests now and will give a positive review if stuff passes.

comment:12 Changed 4 years ago by SimonKing

Did I do something wrong? With the branch, I still get

sage: TestTcollect(sage.misc.weak_dict.WeakValueDictionary)
Number of T-objects before we start: 29
Number of T-objects after list construction: 129
Throwing away references:
number of collects required: 100

comment:13 follow-up: Changed 4 years ago by SimonKing

Question: What exactly is the purpose of this ticket? Is the TestTcollect example supposed to be fixed (in the sense of "number of collects required: 0")?

comment:14 in reply to: ↑ 13 ; follow-up: Changed 4 years ago by nbruin

Replying to SimonKing:

Question: What exactly is the purpose of this ticket? Is the TestTcollect example supposed to be fixed (in the sense of "number of collects required: 0")?

Correct.

comment:15 in reply to: ↑ 14 Changed 4 years ago by SimonKing

Replying to nbruin:

Replying to SimonKing:

Question: What exactly is the purpose of this ticket? Is the TestTcollect example supposed to be fixed (in the sense of "number of collects required: 0")?

Correct.

Really? But it doesn't, see comment:12.

comment:16 Changed 4 years ago by SimonKing

Strange. For whatever reason, my local version of your branch has been commit 9bb9241, not 4ac686. So, I am trying sage --dev pull --ticket=15432 now, in the hope that it works.

comment:17 Changed 4 years ago by SimonKing

sage --dev checkout --ticket=15432 keeps giving me the wrong commit, by the way. What shall I do? Is pull the right thing to do here?

comment:18 Changed 4 years ago by SimonKing

Anyway, the sage --dev pull was automatically merging your branch into my local version of the "develop" branch, and I think this is what I wanted to end up with. Now I can continue with the review.

comment:19 Changed 4 years ago by vbraun

Pull pulls (=fetches & merges) into the current branch, whatever that currently is.

comment:20 Changed 4 years ago by vbraun

sage --dev checkout --ticket=15432 works for me, for the record.

comment:21 Changed 4 years ago by SimonKing

  • Reviewers set to Simon King

Hooray! After pulling the correct commit, I get

sage: TestTcollect(weakref.WeakValueDictionary)
Number of T-objects before we start: 3
Number of T-objects after list construction: 103
Throwing away references:
no GC required to get rid of objects
number of collects required: 0

and the answer came a lot faster than before. All tests pass, and the patch (i.e., the diff between the develop branch and this branch after merging with the develop branch) looks reasonable to me.

Hence, I can almost give it a positive review. How can I do a review patch? Is it possible to make some changes and then do sage --dev push, or what is to do for the equivalent of a review patch?

comment:22 Changed 4 years ago by SimonKing

PS: Shall I revert the merge with the develop branch before creating the review "patch"?

comment:23 Changed 4 years ago by vbraun

Yes to both.

  • If you made an unnecessary merge, revert it first (git reset --hard HEAD~)
  • for your review patch you just add another commit and then replace the branch (either manually with git or with sage -dev push)

comment:24 Changed 4 years ago by SimonKing

How can I do the job without the dev script? I.e., how to do it manually with git?

comment:25 follow-up: Changed 4 years ago by vbraun

  • edit
  • git add <filename>
  • git commit -m 'commit message'
  • git push trac HEAD:u/SimonKing/weak_callback

See also the developer guide (in Sage, not the old one that is still on www.sagemath.org)

comment:26 in reply to: ↑ 25 Changed 4 years ago by SimonKing

Replying to vbraun:

  • edit
  • git add <filename>
  • git commit -m 'commit message'
  • git push trac HEAD:u/SimonKing/weak_callback

See also the developer guide (in Sage, not the old one that is still on www.sagemath.org)

Thank you! I already found the guide (but I did git push --set-upstream trac HEAD:u/SimonKing/ticket/15432, hope this is fine, too). It takes rather long to complete, though.

comment:27 Changed 4 years ago by SimonKing

  • Branch changed from u/nbruin/ticket/15432 to u/SimonKing/ticket/15432
  • Commit changed from 4ac686619284500adaf201c397a82a9b7409e41f to 13112a9a5b664bcbd467cc6c9b197ae809ab44fb
  • Status changed from needs_review to positive_review

comment:28 Changed 4 years ago by vbraun

  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.