Opened 9 years ago
Closed 8 years ago
#15978 closed enhancement (fixed)
Waste of time in g.edges() (acually in iterator_edges)
Reported by:  Nathann Cohen  Owned by:  

Priority:  major  Milestone:  sage6.2 
Component:  graph theory  Keywords:  
Cc:  Jernej Azarija  Merged in:  
Authors:  Nathann Cohen  Reviewers:  Jernej Azarija 
Report Upstream:  N/A  Work issues:  
Branch:  77c4df2 (Commits, GitHub, GitLab)  Commit:  77c4df2fe5eda40cd86c083d8d2e3e510efbb667 
Dependencies:  Stopgaps: 
Description (last modified by )
Jernej reported that Sage took <a lifetime> to enumerate the edges of an important graph of his. There was actually a comment in the source code which I left then forgot, saying that this particular block of code was totally awful.
Was.
With the latest release :
sage: g = graphs.CompleteGraph(3000) sage: %time _ = g.to_dictionary() CPU times: user 1.38 s, sys: 0 ns, total: 1.38 s Wall time: 1.37 s sage: %time _ = g.edges() CPU times: user 1min 24s, sys: 8 ms, total: 1min 24s Wall time: 1min 24s
With this branch :
sage: g = graphs.CompleteGraph(3000) sage: %time _ = g.to_dictionary() CPU times: user 1.38 s, sys: 0 ns, total: 1.38 s Wall time: 1.37 s sage: %time _ = g.edges() CPU times: user 6.88 s, sys: 0 ns, total: 6.88 s Wall time: 6.88 s
Which still does not make any sense.
Nathann
Change History (26)
comment:1 Changed 9 years ago by
Branch:  → u/ncohen/15978 

Commit:  → 768a63ea1d24e569a254f99b01b1f4c568198bc9 
Description:  modified (diff) 
Status:  new → needs_review 
comment:2 Changed 9 years ago by
Description:  modified (diff) 

comment:3 Changed 9 years ago by
Nathan,
the patch seems to implement the fix in the way you described yesterday in the email right?
comment:4 Changed 9 years ago by
Yepyepyepyepyep. Except I replaced the calls to "sorted" by plain python.
sage: def a(u,v): ....: return tuple(sorted((u,v))) sage: %timeit a(5,6) 1000000 loops, best of 3: 1.58 µs per loop sage: def a(u,v): ....: return (u,v) if u<=v else (v,u) sage: %timeit a(5,6) 1000000 loops, best of 3: 527 ns per loop
Nathann
comment:5 followup: 6 Changed 9 years ago by
Probably a stupid question, but: Why is CompleteGraph(3000)
a SparseGraph
?
I don't know anything about the graph code, but I did a bit of profiling out of curiosity. And it looks like nested loops such as
# SparseGraph.iterator_edges for u_int in self._cg.out_neighbors(v_int): if u_int >= v_int: for l_int in self._cg.all_arcs(v_int, u_int): ...
are a pretty redundant and not too cachefriendly way of enumerating the edges. In particular:
all_arcs_unsafe(u, v, ...)
seems to spend LOTS of time looking fortemp
such thattemp.vertex == v
, while (if I understand correctly)out_neighbors_unsafe
already had to visit the correspondingSparseGraphBTNode
. But perhaps the graph is just too dense for the hash table to do its job, and it would work just fine with a sparse graph? The test
u_int <= v_int
can also be a bit expensive, perhaps due to branch mispredictions or something like that.
comment:6 followup: 7 Changed 9 years ago by
Yoooooo !
Probably a stupid question, but: Why is
CompleteGraph(3000)
aSparseGraph
?
No good answer to that. There is a Dense Graph backend which is never used unless you ask for it, and as it is never tested I expect it to be quite buggy.
I don't know anything about the graph code, but I did a bit of profiling out of curiosity. And it looks like nested loops such as
# SparseGraph.iterator_edges for u_int in self._cg.out_neighbors(v_int): if u_int >= v_int: for l_int in self._cg.all_arcs(v_int, u_int): ...are a pretty redundant and not too cachefriendly way of enumerating the edges. In particular:
all_arcs_unsafe(u, v, ...)
seems to spend LOTS of time looking fortemp
such thattemp.vertex == v
, while (if I understand correctly)out_neighbors_unsafe
already had to visit the correspondingSparseGraphBTNode
. But perhaps the graph is just too dense for the hash table to do its job, and it would work just fine with a sparse graph?
There is a waste of time there indeed. I just never rewrote these parts... The backend code is Robert Miller's, I never touched the data structures. I wrote some backends I needed for efficient implementations, and in those implementations you will not find such waste...
I should either rewrite the default backend for graphs, or write lowerlevel code for functions like that.
I just hate labels and multiple edges.....
 The test
u_int <= v_int
can also be a bit expensive, perhaps due to branch mispredictions or something like that.
Hmmmm... I understand, but how can you do without ?
Nathann
comment:7 Changed 9 years ago by
Replying to ncohen:
 The test
u_int <= v_int
can also be a bit expensive, perhaps due to branch mispredictions or something like that.Hmmmm... I understand, but how can you do without ?
I was thinking of having out_neighbors
(or a new variant of it that would return SparseGraphBTNode
s) filter out the vertices with u_int < v_int
itslef, or seeing if by chance it happens to return a sorted list... But I have no idea (i) if it is feasible, (ii) if would be worth the pain!
comment:8 Changed 9 years ago by
Yoooo !!
Well. What I should do is try again to read and document, and rewrite this at lower level. Definitely.
I mean... You are right in what you say, it is just that I have to read and understand very undocumented code, and that to be honest I don't have performance problems myself. But I will do that next week, definitely.
Thanks.
Sorry, I'm a bit exhausted right now. Anyway have fun, good evening to you ;)
Nathann
comment:9 Changed 9 years ago by
It took me a while, but I understood that what Marc reported above is actually responsible for 99% of our worries. I will create a ticket and write a patch for this tomorrow asap. It will probably depend on this very ticket.
Nathann
comment:11 Changed 9 years ago by
I've looked at the patch and tested it and it looks good (the doctest somehow fails but I believe its not related to this change but something unrelated). Hence I give this patch a go.
Thanks Nathann for fixing this.
comment:12 Changed 9 years ago by
Status:  needs_review → positive_review 

comment:14 Changed 8 years ago by
Reviewers:  → Jernej Azarija 

comment:15 Changed 8 years ago by
Status:  positive_review → needs_work 

Traceback (most recent call last): File "/home/release/Sage/src/doc/common/builder.py", line 1474, in <module> getattr(get_builder(name), type)() File "/home/release/Sage/src/doc/common/builder.py", line 269, in _wrapper getattr(get_builder(document), 'inventory')(*args, **kwds) File "/home/release/Sage/src/doc/common/builder.py", line 485, in _wrapper x.get(99999) File "/home/release/Sage/local/lib/python/multiprocessing/pool.py", line 554, in get raise self._value OSError: [graphs ] docstring of sage.graphs.base.sparse_graph.SparseGraphBackend.iterator_edges:20: WARNING: Literal block expected; none found.
comment:16 Changed 8 years ago by
NOOOOooooooooooooooooooooooooooooooo... It missed the release :P
Okay, fixed T_T
Nathann
comment:17 Changed 8 years ago by
Commit:  768a63ea1d24e569a254f99b01b1f4c568198bc9 → 83f22e77e04b47fba65ae55adb4d80ec1c7d0a5e 

comment:18 Changed 8 years ago by
Status:  needs_work → positive_review 

comment:19 Changed 8 years ago by
Status:  positive_review → needs_work 

Doctests fail in some crystal stuff
comment:20 Changed 8 years ago by
I hate parents/elements and categories. I am not even allowed to test that two elements are equal, for if I do I get inside of their f... recursive loops.
comment:22 Changed 8 years ago by
sage: Tab = CrystalOfTableaux(['A', 3], shape = [2,1,1]) sage: g=Tab.digraph() sage: u,v = g.vertices()[:2] sage: u <= v ... RuntimeError: maximum recursion depth exceeded in cmp
What the hell can I do with that ?
Before, the following code was called :
sage: cmp(u,v) 1
comment:23 Changed 8 years ago by
Okay, here it is. I replaced the <=
with <
which changes nothing for us and does not change any doctest either, except that the crystal doctests works now.
And I wrote to sagecombinatdevel about that.
I set it back to positive_review
after having checked that all long doctests in graph/ and combinat/crystal pass.
By the way the doctests in combinat/crystals/ take a lifetime.
Nathann
comment:24 Changed 8 years ago by
Commit:  83f22e77e04b47fba65ae55adb4d80ec1c7d0a5e → 77c4df2fe5eda40cd86c083d8d2e3e510efbb667 

Branch pushed to git repo; I updated commit sha1. New commits:
77c4df2  trac #15978: Broken doctest in combinat/crystals/

comment:25 Changed 8 years ago by
Status:  needs_work → positive_review 

comment:26 Changed 8 years ago by
Branch:  u/ncohen/15978 → 77c4df2fe5eda40cd86c083d8d2e3e510efbb667 

Resolution:  → fixed 
Status:  positive_review → closed 
New commits:
trac #15978: Waste of time in g.edges()