Opened 3 years ago
Closed 3 years ago
#28335 closed enhancement (fixed)
Cythonize Yen_k_shortest_simple_paths and feng_k_shortest_simple_paths
Reported by:  Rajat Mittal  Owned by:  

Priority:  major  Milestone:  sage8.9 
Component:  graph theory  Keywords:  gsoc2019 
Cc:  David Coudert  Merged in:  
Authors:  Rajat Mittal  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  aac00c3 (Commits, GitHub, GitLab)  Commit:  aac00c3e0005643c0771d85d8decb431e7f13f3c 
Dependencies:  #27859  Stopgaps: 
Description
This ticket aims to implement these methods in Cython for speed up and performance gains.
Change History (28)
comment:1 Changed 3 years ago by
Authors:  → Rajat Mittal 

Branch:  → public/graphs/28335_Cythonize_Yen_Feng 
comment:2 Changed 3 years ago by
Commit:  → 12ebc7080e850f0dbe71a305448e5f4dca9c062a 

comment:3 Changed 3 years ago by
I have cythonized the Yen and Feng's methods here as can be seen from the last commit in the list of above commits.
In short I have used cython variables and cython data structures where ever possible in these methods.
Also I have named the cython methods as yen_k_shortest_simple_paths_cython and feng_k_shortest_simple_paths_cython to separate them from yen_k_shortest_simple_paths and feng_k_shortest_simple_paths for consistency and speed checks. feng_k_shortest_simple_paths Changes will be more clear once #27859 gets included.
comment:4 followup: 6 Changed 3 years ago by
The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.
The list idx_to_path
stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combine idx_to_path
and heap_paths
in a unique dictionary, no ?
comment:5 Changed 3 years ago by
Commit:  12ebc7080e850f0dbe71a305448e5f4dca9c062a → c93798417e0f296a13eac1539094b12f2fa75ea0 

Branch pushed to git repo; I updated commit sha1. New commits:
c937984  Improved with using Dictionary and DOUBLE

comment:6 Changed 3 years ago by
Replying to dcoudert:
The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.
The list
idx_to_path
stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combineidx_to_path
andheap_paths
in a unique dictionary, no ?
I agree with both the statements and made the changes appropriately
comment:7 Changed 3 years ago by
Seems correct.
The next step is to speed up the test hash_path not in idx_to_path.values()
, but I don't have good ideas yet on how to do that.
comment:8 Changed 3 years ago by
Commit:  c93798417e0f296a13eac1539094b12f2fa75ea0 → a72071b9fcbbeb96091a43f5ade1596acfa4b2da 

Branch pushed to git repo; I updated commit sha1. New commits:
a72071b  removal of unncessary tests

comment:9 Changed 3 years ago by
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth ....: on sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho ....: n sage: sage: G = digraphs.RandomDirectedGNP(30, .05) ....: sage: while not G.is_strongly_connected(): ....: ....: G = digraphs.RandomDirectedGNP(30, .1) ....: sage: for u, v in list(G.edges(labels=False, sort=False)): ....: ....: G.set_edge_label(u, v, randint(1, 10)) ....: sage: V = G.vertices() ....: sage: shuffle(V) ....: sage: u, v = V[:2] sage: it_Y = feng_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report ....: _weight=True) sage: it_F = feng_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight ....: =True) sage: for i in range(205): ....: if(it_F.next()[0]!=it_Y.next()[0]): ....: print("wrong") ....: ....: ....: sage: it_Y = yen_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report_ ....: weight=True) sage: it_F = yen_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight= ....: True) sage: for i in range(205): ....: if(it_F.next()[0]!=it_Y.next()[0]): ....: print("wrong") ....: ....: ....:
comment:10 Changed 3 years ago by
After thinking about the utility of the hash_path not in idx_to_path.values() test , it seems like we don't really require it as per our algorithm as we are working on graphs without multiedges hence the path will no be repeated into the heap. So this test is unnecessary here.
Also the test supporting it is shown in the above comment.
comment:11 Changed 3 years ago by
Commit:  a72071b9fcbbeb96091a43f5ade1596acfa4b2da → 8c022dd0dee9e6a0c55206d54599dda3a27611de 

Branch pushed to git repo; I updated commit sha1. New commits:
8c022dd  removing hash_paths

comment:12 Changed 3 years ago by
Commit:  8c022dd0dee9e6a0c55206d54599dda3a27611de → 8085e8c4cc407011796c9ca74250c995b772aa93 

Branch pushed to git repo; I updated commit sha1. New commits:
8085e8c  merged with sage 8.9beta6

comment:13 followup: 14 Changed 3 years ago by
I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?
comment:14 Changed 3 years ago by
Replying to dcoudert:
I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?
Actual algorithm never specify to do this test. I added this test thinking that a path may be repeated in the heap, but the way Yen's algorithm unfolds its not possible to add the same path twice.
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cython sage: g = graphs.CompleteGraph(5) sage: it_Y = yen_k_shortest_simple_paths_cython(g, 0, 4) sage: list(it_Y) [[0, 4], [0, 1, 4], [0, 2, 4], [0, 3, 4], [0, 3, 1, 4], [0, 3, 2, 4], [0, 2, 1, 4], [0, 2, 3, 4], [0, 1, 2, 4], [0, 1, 3, 4], [0, 1, 3, 2, 4], [0, 1, 2, 3, 4], [0, 2, 3, 1, 4], [0, 2, 1, 3, 4], [0, 3, 2, 1, 4], [0, 3, 1, 2, 4]]
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths sage: g = graphs.CompleteGraph(5) sage: it_Y = yen_k_shortest_simple_paths(g, 0, 4) sage: list(it_Y) [[0, 4], [0, 1, 4], [0, 2, 4], [0, 3, 4], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 1, 4], [0, 2, 3, 4], [0, 3, 1, 4], [0, 3, 2, 4], [0, 1, 2, 3, 4], [0, 1, 3, 2, 4], [0, 2, 1, 3, 4], [0, 2, 3, 1, 4], [0, 3, 1, 2, 4], [0, 3, 2, 1, 4]]
comment:15 Changed 3 years ago by
sage: g = g.to_directed() sage: show(g) Launched png viewer for Graphics object consisting of 30 graphics primitives sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cython sage: it_F = feng_k_shortest_simple_paths_cython(g, 0, 4) sage: list(it_F) [[0, 4], [0, 3, 4], [0, 2, 4], [0, 1, 4], [0, 1, 3, 4], [0, 1, 2, 4], [0, 2, 3, 4], [0, 2, 1, 4], [0, 3, 2, 4], [0, 3, 1, 4], [0, 3, 1, 2, 4], [0, 3, 2, 1, 4], [0, 2, 1, 3, 4], [0, 2, 3, 1, 4], [0, 1, 2, 3, 4], [0, 1, 3, 2, 4]]
comment:16 Changed 3 years ago by
Status:  new → needs_review 

comment:17 Changed 3 years ago by
Commit:  8085e8c4cc407011796c9ca74250c995b772aa93 → 5e460d7a61c9087084c8dd1c411ac81b659d4492 

Branch pushed to git repo; I updated commit sha1. New commits:
5e460d7  trac #28335: review edit Yen

comment:18 Changed 3 years ago by
I did some changes in Yen. I think it is cleaner this way. I have not touched Feng. You can certainly try to follow what I did to start improving it.
comment:19 Changed 3 years ago by
Commit:  5e460d7a61c9087084c8dd1c411ac81b659d4492 → 2f41b460be24ebeff833ff05131374c9e1027c5c 

Branch pushed to git repo; I updated commit sha1. New commits:
2f41b46  Cythonization and edits in Feng's

comment:20 Changed 3 years ago by
Following are the comparisons results between: There's a slight improvement observed.
sage: g = DiGraph(graphs.Grid2dGraph(10, 10)) sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth ....: on sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho ....: n sage: p = yen_k_shortest_simple_paths(g, (0, 0), (9, 9)) sage: %timeit [p.next() for i in range(1000)] 1 loop, best of 3: 1.25 s per loop sage: p = yen_k_shortest_simple_paths_cython(g, (0, 0), (9, 9)) sage: %timeit [p.next() for i in range(1000)] 1 loop, best of 3: 1.24 s per loop sage: p = feng_k_shortest_simple_paths(g, (0, 0), (9, 9)) sage: %timeit [p.next() for i in range(5000)] 1 loop, best of 3: 4.62 s per loop sage: p = feng_k_shortest_simple_paths_cython(g, (0, 0), (9, 9)) sage: %timeit [p.next() for i in range(5000)] 1 loop, best of 3: 4.72 s per loop sage: g = DiGraph() sage: for i in range(120): ....: ....: g.add_edge(i,i+1,i%5) ....: sage: for i in range(30, 40, 1): ....: ....: g.add_edge(i,120,i%2) ....: sage: %timeit list(yen_k_shortest_simple_paths_directed(g, 0, 120, by_weight=Tr ....: ue)) sage: %timeit list(yen_k_shortest_simple_paths(g, 0, 120, by_weight=True)) 10 loops, best of 3: 46.6 ms per loop sage: %timeit list(yen_k_shortest_simple_paths_cython(g, 0, 120, by_weight=True ....: )) 10 loops, best of 3: 45.6 ms per loop sage: %timeit list(feng_k_shortest_simple_paths_cython(g, 0, 120, by_weight=Tru ....: e)) 100 loops, best of 3: 7.62 ms per loop sage: %timeit list(feng_k_shortest_simple_paths(g, 0, 120, by_weight=True)) 100 loops, best of 3: 7.74 ms per loop
comment:21 Changed 3 years ago by
Commit:  2f41b460be24ebeff833ff05131374c9e1027c5c → 4f4c610da59fb1f19aaee48154e5f9f0efd45368 

Branch pushed to git repo; I updated commit sha1. New commits:
4f4c610  minor edit

comment:22 Changed 3 years ago by
Now that #27859 has been merged, it's much easier to check the code ;)
Instead of creating new methods, I propose to replace the code of the previous Yen's method by the code of the Cython version, and to do the same for Feng.
comment:23 Changed 3 years ago by
Commit:  4f4c610da59fb1f19aaee48154e5f9f0efd45368 → a9344e892c288b3b1ad22c67aa6d8725353f1500 

Branch pushed to git repo; I updated commit sha1. New commits:
a9344e8  upgraded to Cython methods for Yen and Feng

comment:24 Changed 3 years ago by
Status:  needs_review → positive_review 

LGTM. We have a better code for Yen, and I propose to postpone to another ticket possible improvements for Feng (it should be much faster, but it's not).
comment:25 Changed 3 years ago by
Commit:  a9344e892c288b3b1ad22c67aa6d8725353f1500 → aac00c3e0005643c0771d85d8decb431e7f13f3c 

Status:  positive_review → needs_review 
Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:
aac00c3  merged with sage 8.9beta7

comment:26 Changed 3 years ago by
I think it was necessary to rebase this tkt on 8.9beta7 otherwise some merge conflict might happen. Can you please set it to positive review again.
comment:27 Changed 3 years ago by
Status:  needs_review → positive_review 

It was not necessary ! Firstly, I tested the ticket over beta 7, and secondly this does not prevent possible conflicts with other tickets (depends on the order in which the tickets are merged for next release).
comment:28 Changed 3 years ago by
Branch:  public/graphs/28335_Cythonize_Yen_Feng → aac00c3e0005643c0771d85d8decb431e7f13f3c 

Resolution:  → fixed 
Status:  positive_review → closed 
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
adding comments
trac #27859: fix merge conflict with 8.9.beta5
trac #27859: add test on random digraphs
made feng faster
some improvement
trac #27859: review edit in Yen
trac #27859: review edit in Feng
Merge branch 'public/graphs/27859_enumerate_paths' of git://trac.sagemath.org/sage into public/graphs/27859_enumerate_paths
removing extra newline
cythonizing Yen and Feng