Sage: Ticket #15978: Waste of time in g.edges() (acually in iterator_edges)
https://trac.sagemath.org/ticket/15978
<p>
Jernej reported that Sage took <a lifetime> to enumerate the edges of an important graph of his. There was actually a comment in the source code which I left then forgot, saying that this particular block of code was totally awful.
</p>
<p>
Was.
</p>
<p>
With the latest release :
</p>
<pre class="wiki">sage: g = graphs.CompleteGraph(3000)
sage: %time _ = g.to_dictionary()
CPU times: user 1.38 s, sys: 0 ns, total: 1.38 s
Wall time: 1.37 s
sage: %time _ = g.edges()
CPU times: user 1min 24s, sys: 8 ms, total: 1min 24s
Wall time: 1min 24s
</pre><p>
With this branch :
</p>
<pre class="wiki">sage: g = graphs.CompleteGraph(3000)
sage: %time _ = g.to_dictionary()
CPU times: user 1.38 s, sys: 0 ns, total: 1.38 s
Wall time: 1.37 s
sage: %time _ = g.edges()
CPU times: user 6.88 s, sys: 0 ns, total: 6.88 s
Wall time: 6.88 s
</pre><p>
Which still does not make any sense.
</p>
<p>
Nathann
</p>
en-usSagehttps://trac.sagemath.org/chrome/site/logo_sagemath_trac.png
https://trac.sagemath.org/ticket/15978
Trac 1.1.6ncohenThu, 20 Mar 2014 10:53:36 GMTstatus, description changed; commit, branch set
https://trac.sagemath.org/ticket/15978#comment:1
https://trac.sagemath.org/ticket/15978#comment:1
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>needs_review</em>
</li>
<li><strong>commit</strong>
set to <em>768a63ea1d24e569a254f99b01b1f4c568198bc9</em>
</li>
<li><strong>description</strong>
modified (<a href="/ticket/15978?action=diff&version=1">diff</a>)
</li>
<li><strong>branch</strong>
set to <em>u/ncohen/15978</em>
</li>
</ul>
<p>
New commits:
</p>
<table class="wiki">
<tr><td><a class="ext-link" href="http://git.sagemath.org/sage.git/commit/?id=768a63ea1d24e569a254f99b01b1f4c568198bc9"><span class="icon"></span>768a63e</a></td><td><code>trac #15978: Waste of time in g.edges()</code>
</td></tr></table>
TicketncohenThu, 20 Mar 2014 12:04:50 GMTdescription changed
https://trac.sagemath.org/ticket/15978#comment:2
https://trac.sagemath.org/ticket/15978#comment:2
<ul>
<li><strong>description</strong>
modified (<a href="/ticket/15978?action=diff&version=2">diff</a>)
</li>
</ul>
TicketaziThu, 20 Mar 2014 15:33:21 GMT
https://trac.sagemath.org/ticket/15978#comment:3
https://trac.sagemath.org/ticket/15978#comment:3
<p>
Nathan,
</p>
<p>
the patch seems to implement the fix in the way you described yesterday in the email right?
</p>
TicketncohenThu, 20 Mar 2014 15:39:08 GMT
https://trac.sagemath.org/ticket/15978#comment:4
https://trac.sagemath.org/ticket/15978#comment:4
<p>
Yepyepyepyepyep. Except I replaced the calls to "sorted" by plain python.
</p>
<pre class="wiki">sage: def a(u,v):
....: return tuple(sorted((u,v)))
sage: %timeit a(5,6)
1000000 loops, best of 3: 1.58 µs per loop
sage: def a(u,v):
....: return (u,v) if u<=v else (v,u)
sage: %timeit a(5,6)
1000000 loops, best of 3: 527 ns per loop
</pre><p>
Nathann
</p>
TicketmmezzarobbaSat, 22 Mar 2014 09:48:42 GMT
https://trac.sagemath.org/ticket/15978#comment:5
https://trac.sagemath.org/ticket/15978#comment:5
<p>
Probably a stupid question, but: Why is <code>CompleteGraph(3000)</code> a <code>SparseGraph</code>?
</p>
<p>
I don't know anything about the graph code, but I did a bit of profiling out of curiosity. And it looks like nested loops such as
</p>
<pre class="wiki"># SparseGraph.iterator_edges
for u_int in self._cg.out_neighbors(v_int):
if u_int >= v_int:
for l_int in self._cg.all_arcs(v_int, u_int):
...
</pre><p>
are a pretty redundant and not too cache-friendly way of enumerating the edges. In particular:
</p>
<ul><li><code>all_arcs_unsafe(u, v, ...)</code> seems to spend LOTS of time looking for <code>temp</code> such that <code>temp.vertex == v</code>, while (if I understand correctly) <code>out_neighbors_unsafe</code> already had to visit the corresponding <code>SparseGraphBTNode</code>. But perhaps the graph is just too dense for the hash table to do its job, and it would work just fine with a sparse graph?
</li><li>The test <code>u_int <= v_int</code> can also be a bit expensive, perhaps due to branch mispredictions or something like that.
</li></ul>
TicketncohenSat, 22 Mar 2014 10:13:36 GMT
https://trac.sagemath.org/ticket/15978#comment:6
https://trac.sagemath.org/ticket/15978#comment:6
<p>
Yoooooo !
</p>
<blockquote class="citation">
<p>
Probably a stupid question, but: Why is <code>CompleteGraph(3000)</code> a <code>SparseGraph</code>?
</p>
</blockquote>
<p>
No good answer to that. There is a Dense Graph backend which is never used unless you ask for it, and as it is never tested I expect it to be quite buggy.
</p>
<blockquote class="citation">
<p>
I don't know anything about the graph code, but I did a bit of profiling out of curiosity. And it looks like nested loops such as
</p>
<pre class="wiki"># SparseGraph.iterator_edges
for u_int in self._cg.out_neighbors(v_int):
if u_int >= v_int:
for l_int in self._cg.all_arcs(v_int, u_int):
...
</pre><p>
are a pretty redundant and not too cache-friendly way of enumerating the edges. In particular:
</p>
<ul><li><code>all_arcs_unsafe(u, v, ...)</code> seems to spend LOTS of time looking for <code>temp</code> such that <code>temp.vertex == v</code>, while (if I understand correctly) <code>out_neighbors_unsafe</code> already had to visit the corresponding <code>SparseGraphBTNode</code>. But perhaps the graph is just too dense for the hash table to do its job, and it would work just fine with a sparse graph?
</li></ul></blockquote>
<p>
There is a waste of time there indeed. I just never re-wrote these parts... The backend code is Robert Miller's, I never touched the data structures. I wrote some backends I needed for efficient implementations, and in those implementations you will not find such waste...
</p>
<p>
I should either rewrite the default backend for graphs, or write lower-level code for functions like that.
</p>
<p>
I just hate labels and multiple edges.....
</p>
<blockquote class="citation">
<ul><li>The test <code>u_int <= v_int</code> can also be a bit expensive, perhaps due to branch mispredictions or something like that.
</li></ul></blockquote>
<p>
Hmmmm... I understand, but how can you do without ?
</p>
<p>
Nathann
</p>
TicketmmezzarobbaSat, 22 Mar 2014 12:37:41 GMT
https://trac.sagemath.org/ticket/15978#comment:7
https://trac.sagemath.org/ticket/15978#comment:7
<p>
Replying to <a class="ticket" href="https://trac.sagemath.org/ticket/15978#comment:6" title="Comment 6">ncohen</a>:
</p>
<blockquote class="citation">
<blockquote class="citation">
<ul><li>The test <code>u_int <= v_int</code> can also be a bit expensive, perhaps due to branch mispredictions or something like that.
</li></ul></blockquote>
<p>
Hmmmm... I understand, but how can you do without ?
</p>
</blockquote>
<p>
I was thinking of having <code>out_neighbors</code> (or a new variant of it that would return <code>SparseGraphBTNode</code>s) filter out the vertices with <code>u_int < v_int</code> itslef, or seeing if by chance it happens to return a sorted list... But I have no idea (i) if it is feasible, (ii) if would be worth the pain!
</p>
TicketncohenSat, 22 Mar 2014 22:49:55 GMT
https://trac.sagemath.org/ticket/15978#comment:8
https://trac.sagemath.org/ticket/15978#comment:8
<p>
Yoooo !!
</p>
<p>
Well. What I should do is try again to read and document, and rewrite this at lower level. Definitely.
</p>
<p>
I mean... You are right in what you say, it is just that I have to read and understand very undocumented code, and that to be honest I don't have performance problems myself. But I will do that next week, definitely.
</p>
<p>
Thanks.
</p>
<p>
Sorry, I'm a bit exhausted right now. Anyway have fun, good evening to you ;-)
</p>
<p>
Nathann
</p>
TicketncohenSun, 23 Mar 2014 19:09:55 GMT
https://trac.sagemath.org/ticket/15978#comment:9
https://trac.sagemath.org/ticket/15978#comment:9
<p>
It took me a while, but I understood that what Marc reported above is actually responsible for 99% of our worries. I will create a ticket and write a patch for this tomorrow asap. It will probably depend on this very ticket.
</p>
<p>
Nathann
</p>
TicketncohenMon, 24 Mar 2014 14:20:42 GMT
https://trac.sagemath.org/ticket/15978#comment:10
https://trac.sagemath.org/ticket/15978#comment:10
<p>
Done in <a class="closed ticket" href="https://trac.sagemath.org/ticket/16005" title="enhancement: Waste of time in iterator_edges 2 (closed: fixed)">#16005</a>.
</p>
<p>
Nathann
</p>
TicketaziMon, 24 Mar 2014 18:34:51 GMT
https://trac.sagemath.org/ticket/15978#comment:11
https://trac.sagemath.org/ticket/15978#comment:11
<p>
I've looked at the patch and tested it and it looks good (the doctest somehow fails but I believe its not related to this change but something unrelated). Hence I give this patch a go.
</p>
<p>
Thanks Nathann for fixing this.
</p>
TicketaziMon, 24 Mar 2014 18:35:07 GMTstatus changed
https://trac.sagemath.org/ticket/15978#comment:12
https://trac.sagemath.org/ticket/15978#comment:12
<ul>
<li><strong>status</strong>
changed from <em>needs_review</em> to <em>positive_review</em>
</li>
</ul>
TicketvbraunMon, 31 Mar 2014 13:20:49 GMT
https://trac.sagemath.org/ticket/15978#comment:13
https://trac.sagemath.org/ticket/15978#comment:13
<p>
Please fill in reviewer name
</p>
TicketncohenMon, 31 Mar 2014 13:21:47 GMTreviewer set
https://trac.sagemath.org/ticket/15978#comment:14
https://trac.sagemath.org/ticket/15978#comment:14
<ul>
<li><strong>reviewer</strong>
set to <em>Jernej Azarija</em>
</li>
</ul>
TicketvbraunMon, 31 Mar 2014 16:26:25 GMTstatus changed
https://trac.sagemath.org/ticket/15978#comment:15
https://trac.sagemath.org/ticket/15978#comment:15
<ul>
<li><strong>status</strong>
changed from <em>positive_review</em> to <em>needs_work</em>
</li>
</ul>
<pre class="wiki">Traceback (most recent call last):
File "/home/release/Sage/src/doc/common/builder.py", line 1474, in <module>
getattr(get_builder(name), type)()
File "/home/release/Sage/src/doc/common/builder.py", line 269, in _wrapper
getattr(get_builder(document), 'inventory')(*args, **kwds)
File "/home/release/Sage/src/doc/common/builder.py", line 485, in _wrapper
x.get(99999)
File "/home/release/Sage/local/lib/python/multiprocessing/pool.py", line 554, in get
raise self._value
OSError: [graphs ] docstring of sage.graphs.base.sparse_graph.SparseGraphBackend.iterator_edges:20: WARNING: Literal block expected; none found.
</pre>
TicketncohenTue, 01 Apr 2014 08:23:17 GMT
https://trac.sagemath.org/ticket/15978#comment:16
https://trac.sagemath.org/ticket/15978#comment:16
<p>
NOOOOooooooooooooooooooooooooooooooo... It missed the release <code>:-P</code>
</p>
<p>
Okay, fixed <code>T_T</code>
</p>
<p>
Nathann
</p>
TicketgitTue, 01 Apr 2014 08:23:30 GMTcommit changed
https://trac.sagemath.org/ticket/15978#comment:17
https://trac.sagemath.org/ticket/15978#comment:17
<ul>
<li><strong>commit</strong>
changed from <em>768a63ea1d24e569a254f99b01b1f4c568198bc9</em> to <em>83f22e77e04b47fba65ae55adb4d80ec1c7d0a5e</em>
</li>
</ul>
<p>
Branch pushed to git repo; I updated commit sha1. New commits:
</p>
<table class="wiki">
<tr><td><a class="ext-link" href="http://git.sagemath.org/sage.git/commit/?id=c8c46ca854fcd989d601c4c496130a83ade64dc0"><span class="icon"></span>c8c46ca</a></td><td><code>trac #15978: Rebase on 6.2.beta5</code>
</td></tr><tr><td><a class="ext-link" href="http://git.sagemath.org/sage.git/commit/?id=83f22e77e04b47fba65ae55adb4d80ec1c7d0a5e"><span class="icon"></span>83f22e7</a></td><td><code>trac #15978: Broken doc</code>
</td></tr></table>
TicketncohenTue, 01 Apr 2014 08:29:00 GMTstatus changed
https://trac.sagemath.org/ticket/15978#comment:18
https://trac.sagemath.org/ticket/15978#comment:18
<ul>
<li><strong>status</strong>
changed from <em>needs_work</em> to <em>positive_review</em>
</li>
</ul>
TicketvbraunTue, 01 Apr 2014 13:45:39 GMTstatus changed
https://trac.sagemath.org/ticket/15978#comment:19
https://trac.sagemath.org/ticket/15978#comment:19
<ul>
<li><strong>status</strong>
changed from <em>positive_review</em> to <em>needs_work</em>
</li>
</ul>
<p>
Doctests fail in some crystal stuff
</p>
TicketncohenTue, 01 Apr 2014 14:28:49 GMT
https://trac.sagemath.org/ticket/15978#comment:20
https://trac.sagemath.org/ticket/15978#comment:20
<p>
I hate parents/elements and categories. I am not even allowed to test that two elements are equal, for if I do I get inside of their f... recursive loops.
</p>
TicketncohenTue, 01 Apr 2014 14:32:34 GMT
https://trac.sagemath.org/ticket/15978#comment:21
https://trac.sagemath.org/ticket/15978#comment:21
<p>
not to test equality but COMPARE.
</p>
TicketncohenTue, 01 Apr 2014 14:35:29 GMT
https://trac.sagemath.org/ticket/15978#comment:22
https://trac.sagemath.org/ticket/15978#comment:22
<pre class="wiki">sage: Tab = CrystalOfTableaux(['A', 3], shape = [2,1,1])
sage: g=Tab.digraph()
sage: u,v = g.vertices()[:2]
sage: u <= v
...
RuntimeError: maximum recursion depth exceeded in cmp
</pre><p>
What the hell can I do with that ?
</p>
<p>
Before, the following code was called :
</p>
<pre class="wiki">sage: cmp(u,v)
-1
</pre>
TicketncohenTue, 01 Apr 2014 14:57:17 GMT
https://trac.sagemath.org/ticket/15978#comment:23
https://trac.sagemath.org/ticket/15978#comment:23
<p>
Okay, here it is. I replaced the <code><=</code> with <code><</code> which changes nothing for us and does not change any doctest either, except that the crystal doctests works now.
</p>
<p>
And I wrote to sage-combinat-devel about that.
</p>
<p>
I set it back to <code>positive_review</code> after having checked that all long doctests in graph/ and combinat/crystal pass.
</p>
<p>
By the way the doctests in combinat/crystals/ take a lifetime.
</p>
<p>
Nathann
</p>
TicketgitTue, 01 Apr 2014 14:57:29 GMTcommit changed
https://trac.sagemath.org/ticket/15978#comment:24
https://trac.sagemath.org/ticket/15978#comment:24
<ul>
<li><strong>commit</strong>
changed from <em>83f22e77e04b47fba65ae55adb4d80ec1c7d0a5e</em> to <em>77c4df2fe5eda40cd86c083d8d2e3e510efbb667</em>
</li>
</ul>
<p>
Branch pushed to git repo; I updated commit sha1. New commits:
</p>
<table class="wiki">
<tr><td><a class="ext-link" href="http://git.sagemath.org/sage.git/commit/?id=77c4df2fe5eda40cd86c083d8d2e3e510efbb667"><span class="icon"></span>77c4df2</a></td><td><code>trac #15978: Broken doctest in combinat/crystals/</code>
</td></tr></table>
TicketncohenTue, 01 Apr 2014 14:58:26 GMTstatus changed
https://trac.sagemath.org/ticket/15978#comment:25
https://trac.sagemath.org/ticket/15978#comment:25
<ul>
<li><strong>status</strong>
changed from <em>needs_work</em> to <em>positive_review</em>
</li>
</ul>
TicketvbraunFri, 04 Apr 2014 15:55:46 GMTstatus, branch changed; resolution set
https://trac.sagemath.org/ticket/15978#comment:26
https://trac.sagemath.org/ticket/15978#comment:26
<ul>
<li><strong>status</strong>
changed from <em>positive_review</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
<li><strong>branch</strong>
changed from <em>u/ncohen/15978</em> to <em>77c4df2fe5eda40cd86c083d8d2e3e510efbb667</em>
</li>
</ul>
Ticket