Sage: Ticket #13450: Fix refcounting of libsingular rings
https://trac.sagemath.org/ticket/13450
<p>
While working on <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>/<a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>, a sporadic error has occurred:
</p>
<p>
On bsd.math (hence, OS X on Intel chips), <code>sage -t sage/misc/cachefunc.pyx</code> fails with signal 11. Apparently it is fine on different machines, even though with additional patches (such as <a class="closed ticket" href="https://trac.sagemath.org/ticket/13370" title="defect: Do not cache the result of is_Field externally (closed: fixed)">#13370</a> or <a class="closed ticket" href="https://trac.sagemath.org/ticket/12876" title="enhancement: Fix element and parent classes of Hom categories to be abstract, and ... (closed: fixed)">#12876</a>) there are signal 11 in other tests and on other machines as well.
</p>
<p>
The segfault disappears when running the tests in verbose mode. It also disappears if the tests are run under gdb (but then a different error occurs, that is unrelated with the problem we are dealing with here).
</p>
<p>
Nils (see <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>) was able to track the problem down to libsingular. He found (correct me if I misunderstood):
</p>
<p>
A segfault occurs in <code>MPolynomialRing_libsingular.__init__</code> in the line:
</p>
<div class="wiki-code"><div class="code"><pre> <span class="bp">self</span><span class="o">.</span>_ring <span class="o">=</span> singular_ring_new<span class="p">(</span>base_ring<span class="p">,</span> n<span class="p">,</span> <span class="bp">self</span><span class="o">.</span>_names<span class="p">,</span> order<span class="p">)</span>
</pre></div></div><p>
Digging deeper, he came to <code>sage/libs/singular/ring.pyx</code> in the lines:
</p>
<div class="wiki-code"><div class="code"><pre> _names <span class="o">=</span> <span class="o"><</span>char<span class="o">**></span>omAlloc0<span class="p">(</span>sizeof<span class="p">(</span>char<span class="o">*</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span>names<span class="p">)))</span>
<span class="k">for</span> i <span class="kn">from</span> <span class="mi">0</span> <span class="o"><=</span> i <span class="o"><</span> n<span class="p">:</span>
_name <span class="o">=</span> names<span class="p">[</span>i<span class="p">]</span>
_names<span class="p">[</span>i<span class="p">]</span> <span class="o">=</span> omStrDup<span class="p">(</span>_name<span class="p">)</span>
</pre></div></div><p>
Strange enough, the segfault occurs in <code>omStrDup</code>.
</p>
<p>
He found several ways to work around:
</p>
<ol><li>In <code>sage/libs/singular/ring.pyx</code>, copy the strings manually, such as
<div class="wiki-code"><div class="code"><pre> <span class="k">for</span> i <span class="kn">from</span> <span class="mi">0</span> <span class="o"><=</span> i <span class="o"><</span> n<span class="p">:</span>
_name <span class="o">=</span> names<span class="p">[</span>i<span class="p">]</span>
j <span class="o">=</span> <span class="mi">0</span>
<span class="k">while</span> <span class="o"><</span>bint<span class="o">></span> _name<span class="p">[</span>j<span class="p">]:</span>
j<span class="o">+=</span><span class="mi">1</span>
j<span class="o">+=</span><span class="mi">1</span> <span class="c">#increment to include the 0</span>
copiedname <span class="o">=</span> <span class="o"><</span>char<span class="o">*></span>omAlloc<span class="p">(</span>sizeof<span class="p">(</span>char<span class="p">)</span><span class="o">*</span><span class="p">(</span>j<span class="o">+</span>perturb<span class="p">))</span>
<span class="k">for</span> <span class="mi">0</span> <span class="o"><=</span> offset <span class="o"><</span> j<span class="p">:</span>
copiedname<span class="p">[</span>offset<span class="p">]</span> <span class="o">=</span> _name<span class="p">[</span>offset<span class="p">]</span>
_names<span class="p">[</span>i<span class="p">]</span> <span class="o">=</span> copiedname
</pre></div></div>provided that the parameter <code>perturb</code> is at least 7.
</li><li>Modify manual refcounting in <code>sage/libs/singular/ring.pyx</code> to prevent deallocation:
<div class="wiki-code"><div class="code"><pre> wrapped_ring = wrap_ring(_ring)
if wrapped_ring in ring_refcount_dict:
raise ValueError('newly created ring already in dictionary??')
<span class="gd">- ring_refcount_dict[wrapped_ring] = 1
</span><span class="gi">+ ring_refcount_dict[wrapped_ring] = 2
</span> return _ring
</pre></div></div></li><li>Use a strong cache for multivariate polynomial rings.
</li></ol><p>
Options 2. and 3. would not be nice: The aim of <a class="closed ticket" href="https://trac.sagemath.org/ticket/715" title="defect: Parents probably not reclaimed due to too much caching (closed: fixed)">#715</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/11521" title="defect: Use weak references to cache homsets (closed: fixed)">#11521</a>, <a class="closed ticket" href="https://trac.sagemath.org/ticket/12215" title="defect: Memleak in UniqueRepresentation, @cached_method (closed: fixed)">#12215</a> and <a class="closed ticket" href="https://trac.sagemath.org/ticket/12313" title="defect: Fix yet another memory leak caused by caching of coercion data (closed: fixed)">#12313</a> was to make caches for parents weak in order to prevent memory leaks.
</p>
<p>
Since the problem disappears when deallocation of libsingular rings is prevented, it seems that the bug is in the deallocation of libsingular rings. Note that the problem is <em>not</em> fixed by <a class="closed ticket" href="https://trac.sagemath.org/ticket/13145" title="defect: Sage's noncommutative rings don't always increment a refcount (closed: fixed)">#13145</a>.
</p>
<p>
I don't know if it is an upstream bug or a bug in the Sage library. Hence, no idea what to report upstream.
</p>
en-usSagehttps://trac.sagemath.org/chrome/site/logo_sagemath_trac.png
https://trac.sagemath.org/ticket/13450
Trac 1.1.6nbruinWed, 12 Sep 2012 14:59:27 GMT
https://trac.sagemath.org/ticket/13450#comment:1
https://trac.sagemath.org/ticket/13450#comment:1
<p>
duplicate of <a class="needs_work ticket" href="https://trac.sagemath.org/ticket/13447" title="defect: Make libsingular multivariate polynomial rings collectable (needs_work)">#13447</a>?
</p>
TicketSimonKingWed, 12 Sep 2012 15:45:27 GMTstatus changed
https://trac.sagemath.org/ticket/13450#comment:2
https://trac.sagemath.org/ticket/13450#comment:2
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>needs_review</em>
</li>
</ul>
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13450#comment:1" title="Comment 1">nbruin</a>:
</p>
<blockquote class="citation">
<p>
duplicate of <a class="needs_work ticket" href="https://trac.sagemath.org/ticket/13447" title="defect: Make libsingular multivariate polynomial rings collectable (needs_work)">#13447</a>?
</p>
</blockquote>
<p>
Sure.
</p>
TicketSimonKingWed, 12 Sep 2012 15:46:27 GMTstatus changed; reviewer set
https://trac.sagemath.org/ticket/13450#comment:3
https://trac.sagemath.org/ticket/13450#comment:3
<ul>
<li><strong>status</strong>
changed from <em>needs_review</em> to <em>positive_review</em>
</li>
<li><strong>reviewer</strong>
set to <em>Nils Bruin</em>
</li>
</ul>
<p>
I guess that makes it a positive review with you as reviewer, with the suggested resolution "duplicate of <a class="needs_work ticket" href="https://trac.sagemath.org/ticket/13447" title="defect: Make libsingular multivariate polynomial rings collectable (needs_work)">#13447</a> ".
</p>
TicketjdemeyerWed, 12 Sep 2012 21:19:41 GMTmilestone changed
https://trac.sagemath.org/ticket/13450#comment:4
https://trac.sagemath.org/ticket/13450#comment:4
<ul>
<li><strong>milestone</strong>
changed from <em>sage-5.4</em> to <em>sage-pending</em>
</li>
</ul>
TicketSimonKingThu, 13 Sep 2012 05:39:08 GMT
https://trac.sagemath.org/ticket/13450#comment:5
https://trac.sagemath.org/ticket/13450#comment:5
<p>
Jeroen, why "sage-pending"? It is clear that this ticket is a duplicate of <a class="needs_work ticket" href="https://trac.sagemath.org/ticket/13447" title="defect: Make libsingular multivariate polynomial rings collectable (needs_work)">#13447</a>.
</p>
TicketjdemeyerThu, 13 Sep 2012 08:58:06 GMTmilestone changed
https://trac.sagemath.org/ticket/13450#comment:6
https://trac.sagemath.org/ticket/13450#comment:6
<ul>
<li><strong>milestone</strong>
changed from <em>sage-pending</em> to <em>sage-duplicate/invalid/wontfix</em>
</li>
</ul>
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/13450#comment:5" title="Comment 5">SimonKing</a>:
</p>
<blockquote class="citation">
<p>
It is clear that this ticket is a duplicate of <a class="needs_work ticket" href="https://trac.sagemath.org/ticket/13447" title="defect: Make libsingular multivariate polynomial rings collectable (needs_work)">#13447</a>.
</p>
</blockquote>
<p>
Then the milestone should be set to sage-duplicate/invalid/wontfix.
</p>
TicketjdemeyerMon, 24 Sep 2012 19:13:07 GMTstatus changed; resolution set
https://trac.sagemath.org/ticket/13450#comment:7
https://trac.sagemath.org/ticket/13450#comment:7
<ul>
<li><strong>status</strong>
changed from <em>positive_review</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>duplicate</em>
</li>
</ul>
Ticket