Sage: Ticket #13370: Do not cache the result of is_Field externally
https://trac.sagemath.org/ticket/13370
<p>
In <a class="closed ticket" href="https://trac.sagemath.org/ticket/11900" title="defect: Serious regression caused by #9138 (closed: fixed)">#11900</a>, a cache has been introduced to sage.rings.ring.is_Field, in order to speed things up. Unfortunately, that caused the following leak:
</p>
<pre class="wiki">sage: p = 1
sage: import gc
sage: gc.collect()
662
sage: for i in range(100):
....: p = next_prime(p)
....: R = ZZ.quotient(p)
....: t = R in Fields()
....: _ = gc.collect()
....: print len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)]),
....:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101
</pre><p>
I think the cleanest solution is:
</p>
<ul><li>deprecate is_Field
</li><li>instead, use an uncached function _is_Field
</li><li>in the sage tree, replace <code>is_Field(R)</code> by <code>R in Fields()</code>
</li></ul><p>
Rationale:
</p>
<p>
is_Field(R) (or _is_Field(R)) will change R.category() into a sub-category of Fields(). But then, the next call to "R in Fields()" will be very fast. Hence, there is no reason (speed-wise) to cache is_Field. Actually, <code>R in Fields()</code> should be faster than <code>is_Field(R)</code> anyway, because of the cythonisation.
</p>
en-usSagehttps://trac.sagemath.org/chrome/site/logo_sagemath_trac.png
https://trac.sagemath.org/ticket/13370
Trac 1.1.6SimonKingWed, 15 Aug 2012 12:37:40 GMT
https://trac.sagemath.org/ticket/13370#comment:1
https://trac.sagemath.org/ticket/13370#comment:1
<p>
<strong><span class="underline">Timings</span></strong>
</p>
<p>
I made some benchmarks, based on most tests from <a class="closed ticket" href="https://trac.sagemath.org/ticket/11900" title="defect: Serious regression caused by #9138 (closed: fixed)">#11900</a> and one new test. This is on my laptop, hence the timings differ from the ones that I stated on <a class="closed ticket" href="https://trac.sagemath.org/ticket/11900" title="defect: Serious regression caused by #9138 (closed: fixed)">#11900</a>.
</p>
<p>
sage-5.2.rc0 plus the memleak patches <a class="closed ticket" href="https://trac.sagemath.org/ticket/12969" title="defect: Coercion failures in symmetric functions (closed: fixed)">#12969</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12215" title="defect: Memleak in UniqueRepresentation, @cached_method (closed: fixed)">#12215</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>
</p>
<pre class="wiki">sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 7.60 s, sys: 0.24 s, total: 7.85 s
Wall time: 8.00 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 6.03 s, sys: 0.21 s, total: 6.24 s
Wall time: 7.83 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 31.59 s, sys: 0.21 s, total: 31.79 s
Wall time: 31.85 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 19.15 s, sys: 0.43 s, total: 19.58 s
Wall time: 20.13 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 3.81 s, sys: 0.01 s, total: 3.82 s
Wall time: 3.82 s
</pre><p>
Here is a new test. First, with "cold" cache:
</p>
<pre class="wiki">sage: L = [ZZ.quotient(p) for p in prime_range(10000)]
sage: %time X = [R in Fields() for R in L]
CPU times: user 0.42 s, sys: 0.01 s, total: 0.43 s
Wall time: 0.44 s
</pre><p>
And now with a "warm" cache:
</p>
<pre class="wiki">sage: %timeit X = [R in Fields() for R in L]
625 loops, best of 3: 1.37 ms per loop
</pre><p>
If one adds the tests from here, one gets:
</p>
<pre class="wiki">age: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 7.46 s, sys: 0.33 s, total: 7.79 s
Wall time: 7.80 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 5.82 s, sys: 0.15 s, total: 5.97 s
Wall time: 6.12 s
sage: def test(E):
....: for p in prime_range(10000):
....: if p != 389:
....: G = E.change_ring(GF(p)).abelian_group()
....:
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 31.43 s, sys: 0.10 s, total: 31.52 s
Wall time: 31.58 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 19.04 s, sys: 0.17 s, total: 19.22 s
Wall time: 19.26 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 2.88 s, sys: 0.00 s, total: 2.89 s
Wall time: 2.90 s
sage: L = [ZZ.quotient(p) for p in prime_range(10000)]
sage: %time X = [R in Fields() for R in L]
CPU times: user 0.40 s, sys: 0.00 s, total: 0.41 s
Wall time: 0.41 s
sage: %timeit X = [R in Fields() for R in L]
625 loops, best of 3: 1.29 ms per loop
</pre><p>
Hence, no test became slower, but some became clearly faster - in particular the one with echelon form computation. I think an echelon form computation was were Nils noticed the memory leak.
</p>
<p>
<strong><span class="underline">Fixing a memory leak</span></strong>
</p>
<p>
I just noticed that I forgot to add the following test to my patch. But anyway, a memory leak has been fixed that hadn't, before:
</p>
<pre class="wiki">sage: import gc
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
1
sage: for i in prime_range(100):
....: R = ZZ.quotient(i)
....: t = R in Fields()
....:
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
2
</pre><p>
However, not all is good. There is still another leak left, quite likely due to coercion.
</p>
<pre class="wiki">sage: for i in prime_range(100):
....: R = ZZ.quotient(i)
....: t = 1 + R.one()
....:
sage: _ = gc.collect()
sage: len([X for X in gc.get_objects() if isinstance(X, sage.rings.finite_rings.integer_mod_ring.IntegerModRing_generic)])
26
</pre><p>
My suggestion is to deal with the remaining leak on a different ticket. But first I need to add one test to my patch...
</p>
TicketSimonKingWed, 15 Aug 2012 12:42:36 GMTstatus changed; author set
https://trac.sagemath.org/ticket/13370#comment:2
https://trac.sagemath.org/ticket/13370#comment:2
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>needs_review</em>
</li>
<li><strong>author</strong>
set to <em>Simon King</em>
</li>
</ul>
<p>
The doc test is added.
</p>
<p>
I did not run the full test suite yet. But I think I'll already change it into "needs review".
</p>
TicketSimonKingWed, 15 Aug 2012 12:43:59 GMT
https://trac.sagemath.org/ticket/13370#comment:3
https://trac.sagemath.org/ticket/13370#comment:3
<p>
Oops, I think that the new doc test will only work with the other memleak fixes. If this turns out to be correct, I will need to update the dependencies.
</p>
TicketSimonKingWed, 15 Aug 2012 15:09:55 GMT
https://trac.sagemath.org/ticket/13370#comment:4
https://trac.sagemath.org/ticket/13370#comment:4
<p>
I don't know what the patchbot is trying to do. According to the logs, it looks like sage won't even start. But it does for me! Admittedly, I work with a prerelease of 5.2 (not 5.3).
</p>
TicketSimonKingWed, 15 Aug 2012 15:45:37 GMTdependencies set
https://trac.sagemath.org/ticket/13370#comment:5
https://trac.sagemath.org/ticket/13370#comment:5
<ul>
<li><strong>dependencies</strong>
set to <em>#11310, #715, #12215, #11521, #12313</em>
</li>
</ul>
<p>
The patch would not apply to sage-5.3.beta2, because of <a class="closed ticket" href="https://trac.sagemath.org/ticket/11310" title="enhancement: Monkey-patch catchall `except:` statements so they at least don't ... (closed: fixed)">#11310</a>. Since I worked on top of the other memleak fixes, I name them as dependencies as well.
</p>
TicketSimonKingWed, 15 Aug 2012 15:58:57 GMTstatus changed; work_issues set
https://trac.sagemath.org/ticket/13370#comment:6
https://trac.sagemath.org/ticket/13370#comment:6
<ul>
<li><strong>status</strong>
changed from <em>needs_review</em> to <em>needs_work</em>
</li>
<li><strong>work_issues</strong>
set to <em>Cope with import changes in sage/schemes</em>
</li>
</ul>
<p>
Too bad. The patch bot is right: The current patch won't work with sage-5.3.beta2, there is an import error at startup. It is in sage/schemes --- <em>again</em>.
</p>
TicketSimonKingWed, 15 Aug 2012 16:00:33 GMTdependencies changed
https://trac.sagemath.org/ticket/13370#comment:7
https://trac.sagemath.org/ticket/13370#comment:7
<ul>
<li><strong>dependencies</strong>
changed from <em>#11310, #715, #12215, #11521, #12313</em> to <em>#11310, #715, #12215, #11521, #12313, #13089</em>
</li>
</ul>
<p>
The reason is <a class="closed ticket" href="https://trac.sagemath.org/ticket/13089" title="enhancement: Implement weighted projective spaces. (closed: fixed)">#13089</a>.
</p>
TicketSimonKingWed, 15 Aug 2012 16:05:18 GMTstatus changed; work_issues deleted
https://trac.sagemath.org/ticket/13370#comment:8
https://trac.sagemath.org/ticket/13370#comment:8
<ul>
<li><strong>status</strong>
changed from <em>needs_work</em> to <em>needs_review</em>
</li>
<li><strong>work_issues</strong>
<em>Cope with import changes in sage/schemes</em> deleted
</li>
</ul>
<p>
The patch is rebased with respect to <a class="closed ticket" href="https://trac.sagemath.org/ticket/13089" title="enhancement: Implement weighted projective spaces. (closed: fixed)">#13089</a>. With the stated dependencies on top of sage-5.3.beta2, sage does start. So, I hope that the patchbot can work fine...
</p>
TicketSimonKingSun, 19 Aug 2012 13:30:18 GMTdependencies changed
https://trac.sagemath.org/ticket/13370#comment:9
https://trac.sagemath.org/ticket/13370#comment:9
<ul>
<li><strong>dependencies</strong>
changed from <em>#11310, #715, #12215, #11521, #12313, #13089</em> to <em>#11310, #12215, #11521, #12313, #13089</em>
</li>
</ul>
<p>
<a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a> are mutually dependent, but <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a> needs to be applied first. By consequence, the patchbot got confused when I stated <em>both</em> as dependency here. Hope it works now...
</p>
TicketSimonKingMon, 20 Aug 2012 06:26:52 GMT
https://trac.sagemath.org/ticket/13370#comment:10
https://trac.sagemath.org/ticket/13370#comment:10
<p>
The patchbot reports
</p>
<pre class="wiki">sage -t -force_lib devel/sage-13370/sage/rings/padics/padic_base_leaves.py # 1 doctests failed
sage -t -force_lib devel/sage-13370/sage/rings/polynomial/infinite_polynomial_element.py # Killed/crashed
</pre><p>
I do not get that crash, but I do get the error in padic_base_leaves.py. The strange thing is:
</p>
<pre class="wiki">sage: K = Qp(17)
sage: S = Zp(17,40)
sage: K.has_coerce_map_from(S)
True
</pre><p>
but in the doctest the last line returns "False".
</p>
<p>
In other words, there is yet again a case where the absence of a coerce map is cached when this is wrong. I really thought I had fixed that problem in <a class="closed ticket" href="https://trac.sagemath.org/ticket/12969" title="defect: Coercion failures in symmetric functions (closed: fixed)">#12969</a>!!
</p>
TicketSimonKingMon, 20 Aug 2012 06:53:31 GMT
https://trac.sagemath.org/ticket/13370#comment:11
https://trac.sagemath.org/ticket/13370#comment:11
<p>
Arrgh, it is even worse!
</p>
<p>
When I take the complete failing test, namely
</p>
<pre class="wiki"> sage: K = Qp(17)
sage: K(1) + 1 #indirect doctest
2 + O(17^20)
sage: K.has_coerce_map_from(ZZ)
True
sage: K.has_coerce_map_from(int)
True
sage: K.has_coerce_map_from(QQ)
True
sage: K.has_coerce_map_from(RR)
False
sage: K.has_coerce_map_from(Qp(7))
False
sage: K.has_coerce_map_from(Qp(17,40))
False
sage: K.has_coerce_map_from(Qp(17,10))
True
sage: K.has_coerce_map_from(Zp(17,40))
True
</pre><p>
then every answer is as expected. But when I run the same thing as part of the test suit eof sage.rings.padic_base_leaves, then the last line returns "False".
</p>
<p>
Hence, there is a side effect from another test. And keep in mind that this side-effect results in wrongly caching the absence of a coercion, even though the patch only changes the syntax for detecting fields.
</p>
TicketSimonKingMon, 20 Aug 2012 07:11:45 GMT
https://trac.sagemath.org/ticket/13370#comment:12
https://trac.sagemath.org/ticket/13370#comment:12
<p>
It is worse than worse: According to the patchbot, the problem in padic_base_leaves also occurs with <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, i.e., before applying <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a>. But I do not get that error!
</p>
<p>
Hence: The patchbot finds a wrong coercion cache with <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, which I only find with <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>. And the patchbot finds a segfault with <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, that I do not find.
</p>
TicketSimonKingMon, 20 Aug 2012 09:19:27 GMT
https://trac.sagemath.org/ticket/13370#comment:13
https://trac.sagemath.org/ticket/13370#comment:13
<p>
I inserted some lines into sage.structure.parent that print type(S) and id(S) whenever the absence of a coercion gets cached. And I inserted a line into the failing test that prints type and id of the object that fails.
</p>
<p>
It turns out that by the test
</p>
<pre class="wiki">sage: K.has_coerce_map_from(Qp(17,40))
</pre><p>
in line 478 of sage/rings/padics/padic_base_leaves.py, the absence of a coercion gets cached for an object of type <code>sage.rings.padics.padic_base_leaves.pAdicFieldCappedRelative_with_category</code> under the address 79378272. However, a few lines below, when the test fails, the object at address 79378272 is of type <code>sage.rings.padics.padic_base_leaves.pAdicRingCappedRelative_with_category</code> (Ring, not Field!!)
</p>
<p>
Hence, it seems that the following happens: The absence of a coercion from <code>Qp(17,40)</code> to K is <em>correctly</em> cached. But because of all my weakref patches, <code>Qp(17,40)</code> gets garbage collected. For an unknown reason, the corresponding item in the coercion cache does <em>not</em> get removed, and by coincidence the old address of <code>Qp(17,40)</code> is now used for <code>Zp(17,40)</code>. But since the coercion cache is now operating with the addresses of keys, it gets the wrong item. That would explain why the error is so difficult to reproduce.
</p>
<p>
If this is true, then we have to understand why the callback function of the weak reference to <code>Qp(17,40)</code> is not called as soon as <code>Qp(17,40)</code> gets garbage collected.
</p>
TicketSimonKingMon, 20 Aug 2012 13:27:30 GMT
https://trac.sagemath.org/ticket/13370#comment:14
https://trac.sagemath.org/ticket/13370#comment:14
<p>
My hope was that the new fixes in <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a> ("let the <code>__delitem__</code> methods not only remove stuff from the buckets but also from the reference cache") would also fix the problem here. But unfortunately that isn't the case. And since the patch bot is currently not usable, I can not tell whether the meta-problem ("The patchbot finds the error already with <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a> but I don't, and the patchbot finds an additional error with <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a> but I don't.") persists.
</p>
TicketnbruinTue, 21 Aug 2012 02:35:03 GMT
https://trac.sagemath.org/ticket/13370#comment:15
https://trac.sagemath.org/ticket/13370#comment:15
<p>
I'm not getting any problems on 5.3b2 with the patches suggested here applied. See also <a class="ext-link" href="http://trac.sagemath.org/sage_trac/ticket/12313#comment:145"><span class="icon"></span>http://trac.sagemath.org/sage_trac/ticket/12313#comment:145</a> for a test where I try to recreate the conditions suggested above. I'm not observing the erroneous behaviour. So the problem must be a little more subtle. The proper removal from the coercion cache does occur sometimes (always as far as I can see).
</p>
<p>
Hopefully this data point helps.
</p>
TicketSimonKingTue, 21 Aug 2012 07:57:19 GMT
https://trac.sagemath.org/ticket/13370#comment:16
https://trac.sagemath.org/ticket/13370#comment:16
<p>
I was modifying the failing test by inserting some lines that show at what memory location stuff is stored, and what is cached when. So, the test is:
</p>
<pre class="wiki"> sage: K = Qp(17)
sage: K(1) + 1 #indirect doctest
2 + O(17^20)
sage: K.has_coerce_map_from(ZZ)
True
sage: K.has_coerce_map_from(int)
True
sage: K.has_coerce_map_from(QQ)
True
sage: K.has_coerce_map_from(RR)
False
sage: K.has_coerce_map_from(Qp(7))
False
sage: id(Qp(17,40))
sage: K.has_coerce_map_from(Qp(17,40))
False
sage: K.has_coerce_map_from(Qp(17,10))
True
sage: S = Zp(17,40); id(S)
sage: K.is_coercion_cached(S)
sage: K.has_coerce_map_from(S)
True
sage: import gc
sage: _ = gc.collect()
sage: K.is_coercion_cached(S)
sage: K.has_coerce_map_from(S)
</pre><p>
and this results in
</p>
<pre class="wiki">sage -t "devel/sage/sage/rings/padics/padic_base_leaves.py"
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 478:
sage: id(Qp(17,40))
Expected nothing
Got:
79371136
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 483:
sage: S = Zp(17,40); id(S)
Expected nothing
Got:
79371136
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 484:
sage: K.is_coercion_cached(S)
Expected nothing
Got:
True
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 485:
sage: K.has_coerce_map_from(S)
Expected:
True
Got:
False
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 489:
sage: K.is_coercion_cached(S)
Expected nothing
Got:
True
**********************************************************************
File "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage/sage/rings/padics/padic_base_leaves.py", line 490:
sage: K.has_coerce_map_from(S)
Expected nothing
Got:
False
</pre><p>
Hence:
</p>
<ul><li>In line 483, apparently the garbage collection of <code>Qp(17,40)</code> has already occurred, for otherwise <code>Zp(17,40)</code> could not have been created under the same address.
</li><li>In line 484, apparently the entry to <code>Qp(17,40)</code> is still not removed. So, for whatever reason, the callback function for the weak reference to <code>Qp(17,40)</code> is not called. Why?
</li></ul><p>
My hope was that by the garbage collection, the callback function of the weak reference to <code>Qp(17,40)</code> would finally be called. But apparently that is not the case. It turns out that doing <code>gc.collect()</code> before defining S does not change the outcome.
</p>
TicketSimonKingTue, 21 Aug 2012 08:11:32 GMT
https://trac.sagemath.org/ticket/13370#comment:17
https://trac.sagemath.org/ticket/13370#comment:17
<p>
A potential solution comes to mind: In <code>__contains__</code> and also in <code>__getitem__</code>, one could first test whether the memory location of the keys is right, and then test whether the weak references in store are still valid. If they aren't, then the key is a new item under an old address. But that would belong to <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>.
</p>
TicketnbruinTue, 21 Aug 2012 08:55:50 GMT
https://trac.sagemath.org/ticket/13370#comment:18
https://trac.sagemath.org/ticket/13370#comment:18
<p>
Of potential relevance is:
</p>
<p>
<a class="ext-link" href="http://bugs.python.org/issue1055820"><span class="icon"></span>http://bugs.python.org/issue1055820</a>
</p>
<p>
Python takes great care in dealing with weakrefs and cyclic garbage collection. See in particular the <a class="ext-link" href="http://hg.python.org/cpython/file/default/Modules/gc_weakref.txt"><span class="icon"></span>explanation</a> and the <a class="ext-link" href="http://hg.python.org/cpython/file/default/Modules/gcmodule.c#l578"><span class="icon"></span>code</a>. From what I understand, python GC does not always call callbacks on weakrefs that are in the cyclic trash themselves but does promise to do callbacks on weakrefs to objects that are collected from cyclic trash. It does so by *first* clearing all the trash, saving the callbacks, and then making the callbacks.
</p>
<p>
I guess that means that callbacks might see a slightly warped world, where other weakreffed locations may already have been vacated but the callbacks on those haven't run yet. Our callbacks are so simple that I don't see how that could affect them, though.
</p>
<p>
We're storing our callbacks in a <code>dict</code> right on the <code>MonoDict</code>, so the weakrefs are never garbage if we still care about the <code>MonoDict</code>. If you're not getting callbacks then it seems that python is breaking its promise. It might be worth investigating, because if python is really at fault, we don't need a workaround but a fix to python!
</p>
TicketSimonKingTue, 21 Aug 2012 09:09:54 GMT
https://trac.sagemath.org/ticket/13370#comment:19
https://trac.sagemath.org/ticket/13370#comment:19
<p>
Hi Nils,
</p>
<p>
Thank you for the pointer!
</p>
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:18" title="Comment 18">nbruin</a>:
</p>
<blockquote class="citation">
<p>
It does so by *first* clearing all the trash, saving the callbacks, and then making the callbacks.
</p>
</blockquote>
<p>
I don't know if waiting for a fix to Python is a good idea. But what you say confirms that the fix that I suggested makes sense: Python first clears all trash; in that moment, the weak references are invalid (i.e., if you call them, None is returned), but the callback function has not necessarily been called yet.
</p>
<p>
Hence, the existence of the callback does not guarantee that we have no dead weak references in store. That's to say: We must always validate the weak references. I just hope that the slow-down is not too much.
</p>
<p>
Anyway. I have already verified that the error in padic_base_leaves.py vanishes with the new patches from <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, and am now running make ptest for <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a> (and will also kick the patchbot).
</p>
TicketjpfloriTue, 21 Aug 2012 09:29:22 GMT
https://trac.sagemath.org/ticket/13370#comment:20
https://trac.sagemath.org/ticket/13370#comment:20
<p>
Nice find, this may indeed and hopefully be the root of all the random failures we got previously (which quite surprisingly also involved Zp and Qp, maybe there is something else fishy with these sets).
</p>
<p>
I'll try testing and timings the patches asap.
Are any other major modifications planned? e.g. some more intrusive changes to fix the memleak pointed out by Nils in <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>?
Or do you plan on "saving" that for later?
</p>
TicketSimonKingTue, 21 Aug 2012 09:46:34 GMT
https://trac.sagemath.org/ticket/13370#comment:21
https://trac.sagemath.org/ticket/13370#comment:21
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:20" title="Comment 20">jpflori</a>:
</p>
<blockquote class="citation">
<p>
Are any other major modifications planned? e.g. some more intrusive changes to fix the memleak pointed out by Nils in <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>?
Or do you plan on "saving" that for later?
</p>
</blockquote>
<p>
I think that progress is only possible if at some point one calls it a day. That's to say, if the patches fix some memory leaks and create no evident problem, then the existence of other not-yet-fixed memory leaks should not prevent them from being merged.
</p>
TicketjpfloriTue, 21 Aug 2012 09:47:51 GMT
https://trac.sagemath.org/ticket/13370#comment:22
https://trac.sagemath.org/ticket/13370#comment:22
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:21" title="Comment 21">SimonKing</a>:
</p>
<blockquote class="citation">
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:20" title="Comment 20">jpflori</a>:
</p>
<blockquote class="citation">
<p>
Are any other major modifications planned? e.g. some more intrusive changes to fix the memleak pointed out by Nils in <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>?
Or do you plan on "saving" that for later?
</p>
</blockquote>
<p>
I think that progress is only possible if at some point one calls it a day. That's to say, if the patches fix some memory leaks and create no evident problem, then the existence of other not-yet-fixed memory leaks should not prevent them from being merged.
</p>
</blockquote>
<p>
Agreed, I'll begin putting everything in place for testing with the current patches.
</p>
TicketSimonKingTue, 21 Aug 2012 13:51:19 GMT
https://trac.sagemath.org/ticket/13370#comment:23
https://trac.sagemath.org/ticket/13370#comment:23
<p>
Hooray, finally the patchbot and I agree! I get the same segfault in infinite_polynomial_element (embarrassingly, that module is my own code...). And if I only add patches out to <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, the test works fine.
</p>
<p>
Hence, I have to understand how the new syntax and/or avoiding an additional strong reference to a field after testing that it is a field can result in a segfault.
</p>
TicketSimonKingTue, 21 Aug 2012 13:55:30 GMT
https://trac.sagemath.org/ticket/13370#comment:24
https://trac.sagemath.org/ticket/13370#comment:24
<p>
But alas, the segfault is not reproducible. I got it with make ptest. But when I now run the test individually, for about 30 times, everything went fine (and is 0.3 seconds faster than without the patch). That's frustrating.
</p>
TicketSimonKingTue, 21 Aug 2012 14:03:39 GMT
https://trac.sagemath.org/ticket/13370#comment:25
https://trac.sagemath.org/ticket/13370#comment:25
<p>
According to the log, the mysterious segfault has something to do with actions:
</p>
<pre class="wiki">sage -t -force_lib devel/sage-13370/sage/rings/polynomial/infinite_polynomial_element.py
/opt/patchbot-5.3.beta2/local/lib/libcsage.so(print_backtrace+0x3b)[0xb6e80c49]
/opt/patchbot-5.3.beta2/local/lib/libcsage.so(sigdie+0x17)[0xb6e80c89]
/opt/patchbot-5.3.beta2/local/lib/libcsage.so(sage_signal_handler+0x212)[0xb6e8078c]
[0xb7704400]
/opt/patchbot-5.3.beta2/local/lib/libpython2.7.so.1.0(+0x29836)[0xb757e836]
/opt/patchbot-5.3.beta2/local/lib/libpython2.7.so.1.0(+0x4685a)[0xb759b85a]
/opt/patchbot-5.3.beta2/local/lib/libpython2.7.so.1.0(PyObject_Call+0x64)[0xb7583094]
/opt/patchbot-5.3.beta2/local/lib/python/site-packages/sage/categories/action.so(+0x9ef8)[0xb6125ef8]
</pre><p>
Aha! Right now I got a bit of insight! Namely, I attempted parallel tests of sage/rings/polynomial/, and find a warning printed:
</p>
<pre class="wiki">Exception NotImplementedError: NotImplementedError() in 'sage.rings.ring._is_Field' ignored
</pre><p>
So, that's a problem. Since <code>_is_Field</code> is a cdef function, it won't propagate a <code>NotImplementedError</code>. I will now test whether catching the error and returning "False" when it occurs helps to solve the problem.
</p>
TicketSimonKingTue, 21 Aug 2012 14:10:37 GMT
https://trac.sagemath.org/ticket/13370#comment:26
https://trac.sagemath.org/ticket/13370#comment:26
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:25" title="Comment 25">SimonKing</a>:
</p>
<blockquote class="citation">
<p>
So, that's a problem. Since <code>_is_Field</code> is a cdef function, it won't propagate a <code>NotImplementedError</code>. I will now test whether catching the error and returning "False" when it occurs helps to solve the problem.
</p>
</blockquote>
<p>
It doesn't solve the problem. Instead, when running the tests of sage/rings/polynomial parallely in four threads, I again get the segfault in infinite_polynomial_element.
</p>
<p>
So, the segfault depends on whether or not we have one or four doctests at a time. Nasty.
</p>
TicketSimonKingTue, 28 Aug 2012 07:16:51 GMT
https://trac.sagemath.org/ticket/13370#comment:27
https://trac.sagemath.org/ticket/13370#comment:27
<p>
With the new patch versions at <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a> and friends, the patchbot still reports
</p>
<pre class="wiki">sage -t -force_lib devel/sage-13370/sage/rings/polynomial/polynomial_real_mpfr_dense.pyx
The doctested process was killed by signal 11
</pre><p>
Signal 11 is a segmentation fault.
</p>
<p>
However, I do <em>not</em> get that error. Both when I run it individually, or as part of "sage -tp 4", it works fine! Or at least it does, together with the patches from <a class="new ticket" href="https://trac.sagemath.org/ticket/13400" title="enhancement: Use strong caches diligently (new)">#13400</a>.
</p>
TicketSimonKingTue, 28 Aug 2012 09:21:55 GMT
https://trac.sagemath.org/ticket/13370#comment:28
https://trac.sagemath.org/ticket/13370#comment:28
<p>
With sage-5.3.beta2 plus <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12215" title="defect: Memleak in UniqueRepresentation, @cached_method (closed: fixed)">#12215</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a>, make ptest works fine for me. So, once again, we have the problem that the patchbot reports a segfault that I can't reproduce.
</p>
TicketnbruinThu, 30 Aug 2012 19:15:11 GMT
https://trac.sagemath.org/ticket/13370#comment:29
https://trac.sagemath.org/ticket/13370#comment:29
<p>
This is a pretty desperate suggestion, but given our earlier experience, the segfault reported by the bot is probably due to an actual bug that's hard to trigger. Getting the verbose traceback on the computer that runs the test would probably give the best lead, but if that's not possible:
</p>
<p>
<strong>Run the doctest under valgrind.</strong> I've tried that myself, but valgrind is giving me a SIGILL:
</p>
<pre class="wiki">/usr/local/sage/5.3b2/local/lib/libcsage.so(print_backtrace+0x31)[0x1220a0a7]
/usr/local/sage/5.3b2/local/lib/libcsage.so(sigdie+0x14)[0x1220a0d9]
/usr/local/sage/5.3b2/local/lib/libcsage.so(sage_signal_handler+0x1ca)[0x12209c6a]
/lib64/libpthread.so.0(+0xf500)[0x5218500]
/usr/local/sage/5.3b2/local/lib/libmpfr.so.4(mpfr_set_d+0x14)[0x1b9d37c4]
/usr/local/sage/5.3b2/local/lib/python2.7/site-packages/sage/rings/real_mpfr.so(+0x24aa2)[0x1b76daa2]
/usr/local/sage/5.3b2/local/lib/libpython2.7.so.1.0(+0xa7d88)[0x4ed8d88]
/usr/local/sage/5.3b2/local/lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x4e7c3c3]
/usr/local/sage/5.3b2/local/lib/python2.7/site-packages/sage/rings/real_mpfr.so(+0x13a3a)[0x1b75ca3a]
/usr/local/sage/5.3b2/local/lib/python2.7/site-packages/sage/rings/real_mpfr.so(initreal_mpfr+0x4057)[0x1b788667]
...
------------------------------------------------------------------------
Unhandled SIGILL: An illegal instruction occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
/usr/local/sage/5.3b2/local/bin/sage-valgrind: line 24: 10490 Illegal instruction (core dumped) valgrind --tool=memcheck $MEMCHECK_FLAGS python -i
</pre><p>
Valgrind is reporting in its log file:
</p>
<pre class="wiki">vex amd64->IR: unhandled instruction bytes: 0xC5 0xF9 0x2E 0xC0 0xC5 0xFB
==10490== valgrind: Unrecognised instruction at address 0x1b9d37c4.
==10490== Your program just tried to execute an instruction that Valgrind
==10490== did not recognise. There are two possible reasons for this.
==10490== 1. Your program has a bug and erroneously jumped to a non-code
==10490== location. If you are running Memcheck and you just saw a
==10490== warning about a bad jump, it's probably your program's fault.
==10490== 2. The instruction is legitimate but Valgrind doesn't handle it,
==10490== i.e. it's Valgrind's fault. If you think this is the case or
==10490== you are not sure, please let us know and we'll try to fix it.
==10490== Either way, Valgrind will now raise a SIGILL signal which will
==10490== probably kill your program.
</pre><p>
That might just be that there's something funny with my valgrind. I'm not going to fix that, because valgrind output doesn't mean anything to me anyway. However, I do think it's peculiar that the traceback leads so directly to <code>mpfr</code>! That's just running <code>sage -valgrind</code>, by the way.
</p>
<p>
The main idea is that our first guess would be that the spurious segfaults are due to memory corruptions and that the cause of this corruption is probably a systematic misuse of memory somewhere. So even on a system where the segfault doesn't happen, the memory misuse should still happen and hopefully valgrind would give an indication where.
</p>
TicketSimonKingThu, 30 Aug 2012 20:52:03 GMT
https://trac.sagemath.org/ticket/13370#comment:30
https://trac.sagemath.org/ticket/13370#comment:30
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:29" title="Comment 29">nbruin</a>:
</p>
<blockquote class="citation">
<p>
This is a pretty desperate suggestion, but given our earlier experience, the segfault reported by the bot is probably due to an actual bug that's hard to trigger. Getting the verbose traceback on the computer that runs the test would probably give the best lead, but if that's not possible:
</p>
<p>
<strong>Run the doctest under valgrind.</strong>
</p>
</blockquote>
<p>
How to do so? When I was young, one needed to install a valgrind spkg. Is that still needed?
</p>
<blockquote class="citation">
<p>
I've tried that myself, but valgrind is giving me a SIGILL:
</p>
</blockquote>
<p>
Are you saying: Even if I do not get a segfault (these sporadic segfaults are hunting me here and on different other tickets), valgrind might still point me to the problem?
</p>
<p>
I'd certainly try! But I need a pointer. <code>sage -valgrind</code> doesn't work for me:
</p>
<pre class="wiki">king@mpc622:/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage$ ../../sage -valgrind
----------------------------------------------------------------------
| Sage Version 5.3.beta2, Release Date: 2012-08-14 |
| Type "notebook()" for the browser-based notebook interface. |
| Type "help()" for help. |
----------------------------------------------------------------------
**********************************************************************
* *
* Warning: this is a prerelease version, and it may be unstable. *
* *
**********************************************************************
/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/local/bin/sage-ipython
Log file is /mnt/local/king/.sage/valgrind/sage-memcheck.%p
Using default flags:
--leak-resolution=high --log-file=/mnt/local/king/.sage/valgrind/sage-memcheck.%p --leak-check=full --num-callers=25 --suppressions=/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/local/lib/valgrind/sage.supp
king@mpc622:/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage$
</pre><p>
Hence, I don't come to the Sage prompt, it quits before starting.
</p>
<p>
Aha! I just found that valgrind wrote into some file:
</p>
<pre class="wiki">==23108== Memcheck, a memory error detector
==23108== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==23108== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==23108== Command: python -i
==23108== Parent PID: 22997
==23108==
==23108== FATAL: can't open suppressions file "/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/local/lib/valgrind/sage.supp"
</pre><p>
So, some file is missing.
</p>
TicketSimonKingThu, 30 Aug 2012 20:53:52 GMT
https://trac.sagemath.org/ticket/13370#comment:31
https://trac.sagemath.org/ticket/13370#comment:31
<p>
See <a class="closed ticket" href="https://trac.sagemath.org/ticket/11918" title="defect: Sage should ship its Valgrind suppressions file, or not insist on its ... (closed: fixed)">#11918</a> for the valgrind issue with sage.supp
</p>
TicketjpfloriFri, 31 Aug 2012 06:30:04 GMT
https://trac.sagemath.org/ticket/13370#comment:32
https://trac.sagemath.org/ticket/13370#comment:32
<p>
There still is some valgrind spkg.
But it won't compile on any reasonably recent distro.
There is some update in <a class="closed ticket" href="https://trac.sagemath.org/ticket/7766" title="enhancement: Upgrade optional spkg valgrind to valgrind-3.7.0 (closed: fixed)">#7766</a>, which will compile, but will not support AVX iinstruction yet.
I guess the AVX instruction problem might be the reason of the SIGILL Nils get (do you have a Core i7?).
For that, you'll need valgrind 3.8.0 which was released a few weeks ago (and is not ditributed in most distro yet I think).
You can just replace the src dir in the spkg with the new tarball.
</p>
<p>
If you used your system valgrind, it could be a reason why the suppression file is missing as well.
<a class="closed ticket" href="https://trac.sagemath.org/ticket/11918" title="defect: Sage should ship its Valgrind suppressions file, or not insist on its ... (closed: fixed)">#11918</a> is surely about that.
</p>
TicketjpfloriFri, 31 Aug 2012 06:31:44 GMT
https://trac.sagemath.org/ticket/13370#comment:33
https://trac.sagemath.org/ticket/13370#comment:33
<p>
Hum, I did not mean <a class="closed ticket" href="https://trac.sagemath.org/ticket/7766" title="enhancement: Upgrade optional spkg valgrind to valgrind-3.7.0 (closed: fixed)">#7766</a>, which points to 3.7.0, but <a class="closed ticket" href="https://trac.sagemath.org/ticket/13060" title="defect: Update Valgrind spkg to version 3.8.1 (closed: fixed)">#13060</a> which points to something in between 3.7.0 and 3.8.0.
</p>
TicketSimonKingFri, 31 Aug 2012 07:37:43 GMT
https://trac.sagemath.org/ticket/13370#comment:34
https://trac.sagemath.org/ticket/13370#comment:34
<p>
Thanks for the pointer, Jean-Pierre!
</p>
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:33" title="Comment 33">jpflori</a>:
</p>
<blockquote class="citation">
<p>
Hum, I did not mean <a class="closed ticket" href="https://trac.sagemath.org/ticket/7766" title="enhancement: Upgrade optional spkg valgrind to valgrind-3.7.0 (closed: fixed)">#7766</a>, which points to 3.7.0, but <a class="closed ticket" href="https://trac.sagemath.org/ticket/13060" title="defect: Update Valgrind spkg to version 3.8.1 (closed: fixed)">#13060</a> which points to something in between 3.7.0 and 3.8.0.
</p>
</blockquote>
<p>
OK, I took the 3.7.0 spkg from <a class="closed ticket" href="https://trac.sagemath.org/ticket/13060" title="defect: Update Valgrind spkg to version 3.8.1 (closed: fixed)">#13060</a>, replaced the src/ folder by the 3.8.0 tarball, and did just install it. If I recall correctly, one needs sage -ba in order to make valgrind work right. Is that correct? Well, I just do, even if it is not correct...
</p>
TicketSimonKingFri, 31 Aug 2012 08:01:14 GMT
https://trac.sagemath.org/ticket/13370#comment:35
https://trac.sagemath.org/ticket/13370#comment:35
<p>
I valgrind-tested sage/rings/polynomial/infinite_polynomial_ring, which uses to segfault pm Volker's patchbot with <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12215" title="defect: Memleak in UniqueRepresentation, @cached_method (closed: fixed)">#12215</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/13378" title="defect: Do not cache the non-existence of coerce/convert map too often, and do ... (closed: fixed)">#13378</a>, 13412 and <a class="closed ticket" href="https://trac.sagemath.org/ticket/12876" title="enhancement: Fix element and parent classes of Hom categories to be abstract, and ... (closed: fixed)">#12876</a>. However, everything seems to work fine. Next, I'll test the real_mpfr problem
</p>
TicketSimonKingFri, 31 Aug 2012 08:48:51 GMT
https://trac.sagemath.org/ticket/13370#comment:36
https://trac.sagemath.org/ticket/13370#comment:36
<p>
With the patches mentioned in my previous post and with a valgrind-3.8.0 spkg (replacing the src of Jean-Pierre's 3.7.0 spkg with the new sources), valgrinding polynomial_real_mpfr_dense.pyx seems to work fine:
</p>
<pre class="wiki">...
==9230== LEAK SUMMARY:
==9230== definitely lost: 441 bytes in 8 blocks
==9230== indirectly lost: 320 bytes in 15 blocks
==9230== possibly lost: 4,950,895 bytes in 5,597 blocks
==9230== still reachable: 80,189,206 bytes in 41,378 blocks
==9230== suppressed: 0 bytes in 0 blocks
==9230== Reachable blocks (those to which a pointer was found) are not shown.
==9230== To see them, rerun with: --leak-check=full --show-reachable=yes
==9230==
==9230== For counts of detected and suppressed errors, rerun with: -v
==9230== Use --track-origins=yes to see where uninitialised values come from
==9230== ERROR SUMMARY: 12833 errors from 1495 contexts (suppressed: 1008 from 9)
</pre><p>
In particular, there is no SIGILL.
</p>
TicketjpfloriFri, 31 Aug 2012 09:22:50 GMT
https://trac.sagemath.org/ticket/13370#comment:37
https://trac.sagemath.org/ticket/13370#comment:37
<p>
I don't think the sage -ba or even sage -b is needed, but it does not hurt for sure.
</p>
TicketjpfloriFri, 31 Aug 2012 09:25:15 GMT
https://trac.sagemath.org/ticket/13370#comment:38
https://trac.sagemath.org/ticket/13370#comment:38
<p>
Did you look for "Invalid free()" in the file ?
</p>
TicketSimonKingFri, 31 Aug 2012 09:29:55 GMT
https://trac.sagemath.org/ticket/13370#comment:39
https://trac.sagemath.org/ticket/13370#comment:39
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:38" title="Comment 38">jpflori</a>:
</p>
<blockquote class="citation">
<p>
Did you look for "Invalid free()" in the file ?
</p>
</blockquote>
<p>
I just did. Even "invalid " is not in the file.
</p>
TicketjpfloriFri, 31 Aug 2012 09:31:48 GMT
https://trac.sagemath.org/ticket/13370#comment:40
https://trac.sagemath.org/ticket/13370#comment:40
<p>
Mmmmm, not having any "Invalid" is strange.
IIRC I got a bunch of invalid reads or writes even on a vanilla Sage.
And anyway GMP/MPIR usually generates some which are perfectly legal but trouble Valgrind.
</p>
TicketSimonKingFri, 31 Aug 2012 09:58:58 GMT
https://trac.sagemath.org/ticket/13370#comment:41
https://trac.sagemath.org/ticket/13370#comment:41
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:40" title="Comment 40">jpflori</a>:
</p>
<blockquote class="citation">
<p>
Mmmmm, not having any "Invalid" is strange.
IIRC I got a bunch of invalid reads or writes even on a vanilla Sage.
And anyway GMP/MPIR usually generates some which are perfectly legal but trouble Valgrind.
</p>
</blockquote>
<p>
OK, you are right. I repeated the whole thing and tried to take care that everything is as in the patchbot.
</p>
<p>
I have sage-5.3.beta2 and
</p>
<pre class="wiki">king@mpc622:/mnt/local/king/SAGE/prereleases/sage-5.3.beta2/devel/sage$ hg qa
trac_715_combined.patch
trac_715_local_refcache.patch
trac_715_safer.patch
trac_715_specification.patch
trac_11521_homset_weakcache_combined.patch
trac_11521_callback.patch
trac12215_weak_cached_function-sk.patch
trac12215_segfault_fixes.patch
trac_12313-mono_dict-combined-random-sk.patch
trac_12313_quit_sage.patch
trac13370_deprecate_is_field.patch
</pre><p>
(the same as the patchbot).
</p>
<p>
Then I did
</p>
<pre class="wiki">king@mpc622:/mnt/local/king/SAGE/prereleases/sage-5.3.beta2$ ./sage -t -valgrind devel/sage/sage/rings/polynomial/polynomial_real_mpfr_dense.pyx
sage -t -valgrind "devel/sage/sage/rings/polynomial/polynomial_real_mpfr_dense.pyx"
[53.0 s]
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 53.1 seconds
king@mpc622:/mnt/local/king/SAGE/prereleases/sage-5.3.beta2$ ls -ltr /mnt/local/king/.sage/valgrind/
insgesamt 50760
-rw------- 1 king malg 386 3. Mai 15:58 sage-memcheck.5510
-rw------- 1 king malg 3788215 4. Mai 23:44 sage-memcheck.1426
-rw------- 1 king malg 398 30. Aug 22:45 sage-memcheck.23108
-rw------- 1 king malg 8770111 30. Aug 22:57 sage-memcheck.30735
-rw------- 1 king malg 10453868 30. Aug 23:00 sage-memcheck.18736
-rw------- 1 king malg 8924220 30. Aug 23:13 sage-memcheck.trac12876
-rw------- 1 king malg 3102893 31. Aug 09:56 sage-memcheck.11856
-rw------- 1 king malg 7761738 31. Aug 10:42 sage-memcheck.9230
-rw------- 1 king malg 9069029 31. Aug 11:49 sage-memcheck.12255
</pre><p>
And I posted <a class="ext-link" href="http://sage.math.washington.edu/home/SimonKing/logs/sage-memcheck.12255"><span class="icon"></span>sage-memcheck.12255</a>. It does contain invalid read, but no invalid free.
</p>
TicketjpfloriFri, 31 Aug 2012 12:08:46 GMTcc changed
https://trac.sagemath.org/ticket/13370#comment:42
https://trac.sagemath.org/ticket/13370#comment:42
<ul>
<li><strong>cc</strong>
<em>jpflori</em> added
</li>
</ul>
TicketSimonKingSat, 01 Sep 2012 22:06:12 GMTdependencies changed
https://trac.sagemath.org/ticket/13370#comment:43
https://trac.sagemath.org/ticket/13370#comment:43
<ul>
<li><strong>dependencies</strong>
changed from <em>#11310, #12215, #11521, #12313, #13089</em> to <em>#11310, #12215, #11521, #12313, #13089, #13145</em>
</li>
</ul>
<p>
At <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a>, refcounting for certain rings is fixed. I wonder whether this is related with the segfault that can be observed on some computers.
</p>
<p>
I can not reproduce the segfault on Volker's patchbot. So, I'd like to see whether adding <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a> suffices to make the patchbot happy.
</p>
TicketSimonKingSat, 01 Sep 2012 23:13:25 GMT
https://trac.sagemath.org/ticket/13370#comment:44
https://trac.sagemath.org/ticket/13370#comment:44
<p>
Hooray!! with <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a>, the segfault is gone on the two patchbots that had it reported previously! The remaining doctest failure on one of the two machines seems unrelated (it can not find some html doc file for symbolic expressions). Hence, I am confident that <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a> was at least a step to the right direction.
</p>
TicketSimonKingMon, 03 Sep 2012 09:34:12 GMT
https://trac.sagemath.org/ticket/13370#comment:45
https://trac.sagemath.org/ticket/13370#comment:45
<p>
I am not sure if <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a> really fixed the problem, or it was just accidentally (these segfaults seem awfully flaky).
</p>
<p>
Anyway, I have updated the main patch at <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>, avoiding a potential name conflict of attributes of Action and <code>RingHomomorphism</code> - just in case...
</p>
<p>
Kicking the patchbot now...
</p>
TicketSimonKingMon, 03 Sep 2012 12:12:30 GMTdependencies changed
https://trac.sagemath.org/ticket/13370#comment:46
https://trac.sagemath.org/ticket/13370#comment:46
<ul>
<li><strong>dependencies</strong>
changed from <em>#11310, #12215, #11521, #12313, #13089, #13145</em> to <em>#11310, #12215, #11521, #12313, #13089</em>
</li>
</ul>
<p>
No segfault this time. I am retrying without <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a>.
</p>
TicketSimonKingWed, 06 Feb 2013 12:28:39 GMT
https://trac.sagemath.org/ticket/13370#comment:47
https://trac.sagemath.org/ticket/13370#comment:47
<p>
Is there someone who'd review this?
</p>
TicketSimonKingWed, 06 Feb 2013 12:31:53 GMTstatus, dependencies changed
https://trac.sagemath.org/ticket/13370#comment:48
https://trac.sagemath.org/ticket/13370#comment:48
<ul>
<li><strong>status</strong>
changed from <em>needs_review</em> to <em>needs_work</em>
</li>
<li><strong>dependencies</strong>
changed from <em>#11310, #12215, #11521, #12313, #13089</em> to <em>#11310, #12215, #11521, #12313, #13089, #12995</em>
</li>
</ul>
<p>
Needs to be rebased on top of sage-5.7.beta2
</p>
TicketSimonKingWed, 06 Feb 2013 12:34:45 GMTstatus changed
https://trac.sagemath.org/ticket/13370#comment:49
https://trac.sagemath.org/ticket/13370#comment:49
<ul>
<li><strong>status</strong>
changed from <em>needs_work</em> to <em>needs_review</em>
</li>
</ul>
<p>
Patch is updated!
</p>
TicketnbruinWed, 06 Feb 2013 16:14:03 GMT
https://trac.sagemath.org/ticket/13370#comment:50
https://trac.sagemath.org/ticket/13370#comment:50
<p>
I'm all in favour of the idea of the patch. However, the bot seems to indicate there's still some work to do.
</p>
TicketSimonKingWed, 06 Feb 2013 16:16:12 GMT
https://trac.sagemath.org/ticket/13370#comment:51
https://trac.sagemath.org/ticket/13370#comment:51
<p>
Right. Polyhedron seems to be new, so that this rather old patch did not take care of it.
</p>
TicketSimonKingThu, 07 Feb 2013 13:47:45 GMTattachment set
https://trac.sagemath.org/ticket/13370
https://trac.sagemath.org/ticket/13370
<ul>
<li><strong>attachment</strong>
set to <em>trac13370_deprecate_is_field.patch</em>
</li>
</ul>
<p>
Deprecate <code>is_Field(R)</code> in favour of <code>R in Fields()</code>.
</p>
TicketSimonKingThu, 07 Feb 2013 13:48:47 GMT
https://trac.sagemath.org/ticket/13370#comment:52
https://trac.sagemath.org/ticket/13370#comment:52
<p>
I have updated the patch, and I hope it'll make the patchbot happy. At least, all tests in sage/geometry pass. Needs review!
</p>
TicketnbruinThu, 07 Feb 2013 19:30:35 GMTstatus changed
https://trac.sagemath.org/ticket/13370#comment:53
https://trac.sagemath.org/ticket/13370#comment:53
<ul>
<li><strong>status</strong>
changed from <em>needs_review</em> to <em>positive_review</em>
</li>
</ul>
<p>
Cool stuff! Works for me, so positive review. Note that in <a class="ticket" href="https://trac.sagemath.org/ticket/13370#comment:1" title="Comment 1">1</a> you also noted the issue that is now the topic of <a class="closed ticket" href="https://trac.sagemath.org/ticket/14058" title="enhancement: Weakly reference binary operation codomains (closed: fixed)">#14058</a>.
</p>
TicketjdemeyerFri, 08 Feb 2013 08:24:34 GMTmilestone changed; reviewer set
https://trac.sagemath.org/ticket/13370#comment:54
https://trac.sagemath.org/ticket/13370#comment:54
<ul>
<li><strong>reviewer</strong>
set to <em>Nils Bruin</em>
</li>
<li><strong>milestone</strong>
changed from <em>sage-5.7</em> to <em>sage-5.8</em>
</li>
</ul>
TicketnbruinSun, 10 Feb 2013 21:08:49 GMT
https://trac.sagemath.org/ticket/13370#comment:55
https://trac.sagemath.org/ticket/13370#comment:55
<p>
Something that struck me. With the current trick,
</p>
<pre class="wiki">sage: R=ZZ.quo(7)
sage: %timeit R in Fields()
</pre><p>
is going to be pretty efficient, because the category of R reflects it's a field after the first test. However,
</p>
<pre class="wiki">sage: R=ZZ.quo(15)
sage: %timeit R in Fields()
</pre><p>
probably isn't efficient (but it probably wasn't before either), because there's no way to register it's NOT a field in its category.
</p>
<p>
I recognize that frequently asking whether a field is indeed a field will happen a lot, whereas frequently confirming something is NOT a field will probably be considerably less common, so perhaps it's not a problem. It does reflect that this solution is accomplishing something quite different from just making <code>is_Field</code> efficient without external cache.
</p>
TicketjdemeyerSun, 17 Feb 2013 22:42:33 GMTstatus changed; resolution, merged set
https://trac.sagemath.org/ticket/13370#comment:56
https://trac.sagemath.org/ticket/13370#comment:56
<ul>
<li><strong>status</strong>
changed from <em>positive_review</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
<li><strong>merged</strong>
set to <em>sage-5.8.beta0</em>
</li>
</ul>
TicketjdemeyerThu, 21 Feb 2013 20:18:18 GMT
https://trac.sagemath.org/ticket/13370#comment:57
https://trac.sagemath.org/ticket/13370#comment:57
<p>
Please see <a class="closed ticket" href="https://trac.sagemath.org/ticket/14158" title="defect: _is_Field() ignores exceptions (closed: fixed)">#14158</a> for a <strong>blocker</strong> follow-up.
</p>
Ticket