Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#18157 closed defect (fixed)

Random failure in coerce_action.pyx

Reported by: vbraun Owned by:
Priority: major Milestone: sage-6.6
Component: coercion Keywords: random_fail
Cc: SimonKing Merged in:
Authors: Vincent Delecroix, Simon King Reviewers: Simon King
Report Upstream: N/A Work issues:
Branch: 60fcedc (Commits) Commit:
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

Zmod(6) can be collected at the wrong time

sage -t --long --warn-long 47.9 src/sage/structure/coerce_actions.pyx
**********************************************************************
File "src/sage/structure/coerce_actions.pyx", line 78, in sage.structure.coerce_actions.LAction
Failed example:
    sage.structure.coerce_actions.GenericAction(QQ, Zmod(6), True, check=False)
Expected:
    Left action by Rational Field on Ring of integers modulo 6
Got:
    <repr(<sage.structure.coerce_actions.GenericAction at 0x7f8650aca938>) failed: RuntimeError: This action acted on a set that became garbage collected>
**********************************************************************
1 item had failures:
   1 of   4 in sage.structure.coerce_actions.LAction
    [100 tests, 1 failure, 0.82 s]

Idem with

sage -t --long src/sage/structure/coerce.pyx
**********************************************************************
File "src/sage/structure/coerce.pyx", line 1452, in sage.structure.coerce.CoercionModel_cache_maps.discover_action
Failed example:
    cm.discover_action(GF(5)['x'], ZZ, operator.div)
Expected:
    Right inverse action by Finite Field of size 5 on Univariate Polynomial Ring in x over Finite Field of size 5
    with precomposition on right by Natural morphism:
      From: Integer Ring
      To:   Finite Field of size 5
Got:
    <repr(<sage.categories.action.PrecomposedAction at 0x2b41eb2a2050>) failed: RuntimeError: This action acted on a set that became garbage collected>
**********************************************************************

Change History (24)

comment:1 Changed 3 years ago by nbruin

Reliable reproduction of the error condition:

import gc
A=sage.structure.coerce_actions.GenericAction(QQ, Zmod(6), True, check=False)
gc.collect()
C=repr(A)

The reason is simple: sage.categories.action.Action.__init__ only stores a weakref to the set acted on, and nothing else. If this "action" object is mainly meant to operate inside the coercion framework then this is probably the appropriate behaviour. Note that if the action ever needs to get used, then there is a group element to act and a set element to be acted upon, so the space has to be alive. Whoever is storing this action object should have a callback on the weakref, though, to throw away this action object to avoid a memory leak.

It may just be that this action object does get used properly and that this doctest is broken. In that case the fix is:

sage: Z6 = Zmod(6)
sage: sage.structure.coerce_actions.GenericAction(QQ, Z6, True, check=False)
Last edited 3 years ago by nbruin (previous) (diff)

comment:2 Changed 3 years ago by vdelecroix

  • Description modified (diff)

comment:3 follow-up: Changed 3 years ago by nbruin

cm.discover_action(GF(5)['x'], ZZ, operator.div)

Also for this one: the coercion framework will never be asked to discover an action on an otherwise unreferenced parent. At the same time, the action object does get cached (globally!) somewhere in the coercion framework and hence should probably not hold a strong reference (at least to one of the sets it acts on). The container it gets stored in is probably a TripleDict, keyed on the two parents involved. It takes care of having a callback to delete the action object when one of the domains gets deleted, so the action objects will normally not outlive the shortest lifetime of the parents involved.

If we have an absolute need of handling actions outside the coercion framework, we should probably make other action objects that do keep strong references. We do this for maps that get used for conversion and coercion. But these actions are showing desirable behaviour for use in the coercion framework. I think the bug is in the doctests.

comment:4 in reply to: ↑ 3 Changed 3 years ago by vdelecroix

Replying to nbruin:

I think the bug is in the doctests.

Me too!

comment:5 Changed 3 years ago by vbraun

Somebody please write a patch instead of me-too-ing ;-)

comment:6 Changed 3 years ago by vdelecroix

  • Branch set to public/18157
  • Commit set to ced7bfcc760e2fb362b2a382a22a94e02f996f9b
  • Status changed from new to needs_review

Might be other wrong doctests... not sure how to find them.


New commits:

ced7bfcTrac 18157: fix garbage collected objects in doctest

comment:7 Changed 3 years ago by pbruin

There is another one in src/sage/matrix/action.pyx; it is already fixed in #10513, so no need to treat that one here.

comment:8 follow-up: Changed 3 years ago by vdelecroix

Would be nice if people would run the testsuite with this patch (and #10513)

for i in $(seq -w 1 100)
do
   sage -t --long structure/ rings/ modular/ matrix/| tee ~/testlong${i}.log
done

Or even better, does somebody know how to call gc.collect() after each command in doctests?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 3 years ago by jdemeyer

Replying to vdelecroix:

Or even better, does somebody know how to call gc.collect() after each command in doctests?

That wouldn't help, right?

comment:10 in reply to: ↑ 9 Changed 3 years ago by vdelecroix

Replying to jdemeyer:

Replying to vdelecroix:

Or even better, does somebody know how to call gc.collect() after each command in doctests?

That wouldn't help, right?

Right! Not for the two doctests mentioned in the description...

comment:11 Changed 3 years ago by SimonKing

  • Cc SimonKing added

comment:12 follow-up: Changed 3 years ago by SimonKing

  • Reviewers set to Simon King
  • Status changed from needs_review to needs_work
  • Work issues set to Find underlying problem for elliptic curve problem, or open a new ticket for it

With one exception, the flaky tests constitute a misuse: Objects that are normally used internally are suddenly put out into the wild. It is clear that that can lead to disaster unless precautions are taken. Hence, I support the general approach in the attached branch: In these tests, one should work with permanent rather than temporary objects, to prevent premature garbage collection.

However, one should also explicitly say in these tests that we are talking about things that normally happen in the innards of Sage, and that the tests do not show how to work with them!

One test is different:

@@ -606,7 +624,8 @@ cdef class IntegerMulAction(Action):
 
         This used to hang before :trac:`17844`::
 
-            sage: E = EllipticCurve(GF(5), [4,0])
+            sage: GF5 = GF(5)
+            sage: E = EllipticCurve(GF5, [4,0])
             sage: P = E.random_element()
             sage: (-2^63)*P
             (0 : 1 : 0)

Defining an elliptic curve over a finite field should certainly be enough to prevent the finite field from garbage collection! So, here I do not agree with the fix. Apparently the test fails because of an internal bug, not because of misuse.

So, I put it to "needs work", because we should either find and fix the underlying bug here, or open a new ticket for the elliptic curve issue.

comment:13 Changed 3 years ago by SimonKing

I don't see a problem with the elliptic curve example:

sage: import gc
sage: E = EllipticCurve(GF(5), [4,0])
sage: gc.collect()
156
sage: P = E.random_element()
sage: gc.collect()
0
sage: (-2^63)*P
(0 : 1 : 0)

So, why should it be changed?

comment:14 follow-up: Changed 3 years ago by vbraun

Even if right now E is kept alive elsewhere (presumably another doctest leaked it) you shouldn't assume that this will not change in the future.

comment:15 in reply to: ↑ 12 Changed 3 years ago by vdelecroix

Replying to SimonKing:

One test is different:

Agreed. You can revert it.

comment:16 in reply to: ↑ 14 Changed 3 years ago by SimonKing

Replying to vbraun:

Even if right now E is kept alive elsewhere (presumably another doctest leaked it) you shouldn't assume that this will not change in the future.

What are you talking about? In the test in question, the elliptic curve E certainly is kept alive in the test (not elsewhere), as it is strongly referenced. As I don't see this test fail, I agree with Vincent that the test should be reverted to its original state.

comment:17 Changed 3 years ago by git

  • Commit changed from ced7bfcc760e2fb362b2a382a22a94e02f996f9b to ef3bb5d985ee65e46d89277a3833af40953bc59c

Branch pushed to git repo; I updated commit sha1. New commits:

ef3bb5dAdd a warning not to use coerce actions outside of the coercion model

comment:18 Changed 3 years ago by SimonKing

  • Authors set to Vincent Delecroix, SImon King
  • Status changed from needs_work to needs_review

I added comments in appropriate places, telling to be careful when using coerce actions outside of coercion models, reverting the elliptic curve test, and fixing some more tests in which a premature garbage collection was possible.

From my point of view, it is a positive review, but of course someone should have a look at my commit.


New commits:

ef3bb5dAdd a warning not to use coerce actions outside of the coercion model

comment:19 Changed 3 years ago by git

  • Commit changed from ef3bb5d985ee65e46d89277a3833af40953bc59c to 60fcedce48b8237353e72315976ed24ab7c236ca

Branch pushed to git repo; I updated commit sha1. New commits:

a48cd06Merge branch 'public/18157' of trac.sagemath.org:sage into develop
60fcedcTrac 18157: adding reference to the trac ticket

comment:20 follow-up: Changed 3 years ago by vdelecroix

Hi Simon,

I merge on sage-6.7.beta1 (sorry if I did wrong). If you prefer you can only cherry-pick 60fcedc.

I agree with your commits and just added references to your ticket. I am ok for the positive review.

Vincent

comment:21 in reply to: ↑ 20 Changed 3 years ago by SimonKing

Replying to vdelecroix:

I merge on sage-6.7.beta1 (sorry if I did wrong). If you prefer you can only cherry-pick 60fcedc.

I suppose that Volker will tell you that (1) one should only merge if there is a good reason, and (2) one must not change the history, even if one has merged without a good reason.

So, just leave it as it is. I had a look at your changes, and they look fine to me. Provided that the patchbot gives a green blob, it should be positive review.

comment:22 Changed 3 years ago by SimonKing

  • Status changed from needs_review to positive_review
  • Work issues Find underlying problem for elliptic curve problem, or open a new ticket for it deleted

The patchbot tells us that all tests pass, and based on the discussion above we have a positive review.

comment:23 Changed 3 years ago by vbraun

  • Branch changed from public/18157 to 60fcedce48b8237353e72315976ed24ab7c236ca
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:24 Changed 2 years ago by kcrisman

  • Authors changed from Vincent Delecroix, SImon King to Vincent Delecroix, Simon King
  • Commit 60fcedce48b8237353e72315976ed24ab7c236ca deleted
Note: See TracTickets for help on using tickets.