Opened 13 years ago
Closed 13 years ago
#6578 closed defect (fixed)
[with patch, positive review] fast subgraphs by building the graph instead of deleting things
Reported by: | Jason Grout | Owned by: | Robert Miller |
---|---|---|---|
Priority: | major | Milestone: | sage-4.1.1 |
Component: | graph theory | Keywords: | |
Cc: | Robert Miller | Merged in: | Sage 4.1.1.alpha1 |
Authors: | Jason Grout | Reviewers: | Robert Miller, Minh Van Nguyen |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Currently, to create a subgraph, Sage copies the graph and then deletes everything not specified. This is very slow if you just want a small part of a large graph.
Attachments (2)
Change History (10)
comment:1 Changed 13 years ago by
Changed 13 years ago by
Attachment: | trac-6578-subgraph-refactoring.patch added |
---|
comment:2 Changed 13 years ago by
Cc: | Robert Miller added |
---|---|
Summary: | fast subgraphs by building the graph instead of deleting things → [with patch, needs review] fast subgraphs by building the graph instead of deleting things |
Timing comparison:
sage: g=graphs.PathGraph(100000) sage: %time g.subgraph(range(20), algorithm='add') CPU times: user 0.61 s, sys: 0.01 s, total: 0.62 s Wall time: 0.68 s Subgraph of (Path Graph): Graph on 20 vertices sage: %time g.subgraph(range(20), algorithm='delete') # the old algorithm CPU times: user 3.96 s, sys: 0.04 s, total: 4.00 s Wall time: 4.15 s Subgraph of (Path Graph): Graph on 20 vertices
comment:3 Changed 13 years ago by
Jason,
Do you have any examples where delete
is actually faster than add
? I ask because what this does is add every edge, and then delete a bunch of them. I doubt it's ever faster.
comment:4 follow-up: 5 Changed 13 years ago by
Most of the time delete is faster than add; that's why I set the factor so that if we want more than 5% of the vertices, it does delete.
I think delete copies the graph, which doesn't *add* every edge, but probably just copies the edge dictionary, which should be *way* faster.
The tests I was taking my data from was doing
sage: g=graphs.PathGraph(1000)
and then doing g.subgraph(range(i)
using each of the algorithms. The breakeven point seemed to be between 50 and 100. I also did this with the complete graph, and got similar results.
comment:5 Changed 13 years ago by
Reviewers: | → Robert Miller |
---|---|
Summary: | [with patch, needs review] fast subgraphs by building the graph instead of deleting things → [with patch, positive review] fast subgraphs by building the graph instead of deleting things |
Replying to jason:
I think delete copies the graph, which doesn't *add* every edge, but probably just copies the edge dictionary, which should be *way* faster.
What is this opinion based on? If it's not an inplace subgraph then it creates a copy, which calls networkx_graph, which calls NX's copy, which has the following lines:
for e in self.edges_iter(): H.add_edge(e)
So we're definitely not using the adjacency dictionary there. The crossover between add and delete you observed above is most likely due to overhead from the fact that calling Sage's add_edge is using several layers of Python functions.
I expect this to change drastically once the graph backends are optimized. This may be a while in the future, but you should keep this in mind for when that does eventually happen. In general, this patch is an improvement, since it provides both options, but thinking in terms of optimization at this stage is going to result in wasted effort, since the floor is about to drop out from underneath any current timings.
Anyway, everything looks good here (especially, applies to 4.1.1.alpha0 and passes tests), so let's go ahead and merge it!
comment:6 Changed 13 years ago by
The opinion was based on thinking of the most logical way to implement it (hence the "I think") :). Thanks for looking into the code.
I'm dreaming of the day that this optimization and choice is obsolete! It would be *so* nice!
comment:7 Changed 13 years ago by
Reviewers: | Robert Miller → Robert Miller, Minh Van Nguyen |
---|
The patch trac_6578-referee.patch
adds a missing double colon "::". It also deletes two sets of redundant double colons.
comment:8 Changed 13 years ago by
Merged in: | → Sage 4.1.1.alpha1 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
This patch refactors the subgraph code into two functions, and then tries to intelligently choose between building the graph from scratch and deleting vertices and edges from a copy of the graph.