Opened 3 years ago

Closed 3 years ago

#28335 closed enhancement (fixed)

Cythonize Yen_k_shortest_simple_paths and feng_k_shortest_simple_paths

Reported by: Rajat Mittal Owned by:
Priority: major Milestone: sage-8.9
Component: graph theory Keywords: gsoc2019
Cc: David Coudert Merged in:
Authors: Rajat Mittal Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: aac00c3 (Commits, GitHub, GitLab) Commit: aac00c3e0005643c0771d85d8decb431e7f13f3c
Dependencies: #27859 Stopgaps:

Status badges

Description

This ticket aims to implement these methods in Cython for speed up and performance gains.

Change History (28)

comment:1 Changed 3 years ago by Rajat Mittal

Authors: Rajat Mittal
Branch: public/graphs/28335_Cythonize_Yen_Feng

comment:2 Changed 3 years ago by git

Commit: 12ebc7080e850f0dbe71a305448e5f4dca9c062a

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

7871701adding comments
7c9bc30trac #27859: fix merge conflict with 8.9.beta5
3ae912dtrac #27859: add test on random digraphs
82015e2made feng faster
4bdbe63some improvement
fabe87btrac #27859: review edit in Yen
3a2aca3trac #27859: review edit in Feng
034cb7fMerge branch 'public/graphs/27859_enumerate_paths' of git://trac.sagemath.org/sage into public/graphs/27859_enumerate_paths
fcfa3e6removing extra newline
12ebc70cythonizing Yen and Feng

comment:3 Changed 3 years ago by Rajat Mittal

I have cythonized the Yen and Feng's methods here as can be seen from the last commit in the list of above commits.

In short I have used cython variables and cython data structures where ever possible in these methods.

Also I have named the cython methods as yen_k_shortest_simple_paths_cython and feng_k_shortest_simple_paths_cython to separate them from yen_k_shortest_simple_paths and feng_k_shortest_simple_paths for consistency and speed checks. feng_k_shortest_simple_paths Changes will be more clear once #27859 gets included.

Last edited 3 years ago by Rajat Mittal (previous) (diff)

comment:4 Changed 3 years ago by David Coudert

The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.

The list idx_to_path stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combine idx_to_path and heap_paths in a unique dictionary, no ?

comment:5 Changed 3 years ago by git

Commit: 12ebc7080e850f0dbe71a305448e5f4dca9c062ac93798417e0f296a13eac1539094b12f2fa75ea0

Branch pushed to git repo; I updated commit sha1. New commits:

c937984Improved with using Dictionary and DOUBLE

comment:6 in reply to:  4 Changed 3 years ago by Rajat Mittal

Replying to dcoudert:

The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.

The list idx_to_path stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combine idx_to_path and heap_paths in a unique dictionary, no ?

I agree with both the statements and made the changes appropriately

comment:7 Changed 3 years ago by David Coudert

Seems correct. The next step is to speed up the test hash_path not in idx_to_path.values(), but I don't have good ideas yet on how to do that.

comment:8 Changed 3 years ago by git

Commit: c93798417e0f296a13eac1539094b12f2fa75ea0a72071b9fcbbeb96091a43f5ade1596acfa4b2da

Branch pushed to git repo; I updated commit sha1. New commits:

a72071bremoval of unncessary tests

comment:9 Changed 3 years ago by Rajat Mittal

sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth
....: on
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho
....: n
sage: sage: G = digraphs.RandomDirectedGNP(30, .05)
....:         sage: while not G.is_strongly_connected():
....:         ....:     G = digraphs.RandomDirectedGNP(30, .1)
....:         sage: for u, v in list(G.edges(labels=False, sort=False)):
....:         ....:     G.set_edge_label(u, v, randint(1, 10))
....:         sage: V = G.vertices()
....:         sage: shuffle(V)
....:         sage: u, v = V[:2]
sage: it_Y = feng_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report
....: _weight=True)
sage: it_F = feng_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight
....: =True)
sage: for i in range(205):
....:     if(it_F.next()[0]!=it_Y.next()[0]):
....:         print("wrong")
....:         
....:     
....: 
sage: it_Y = yen_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report_
....: weight=True)
sage: it_F = yen_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight=
....: True)
sage: for i in range(205):
....:     if(it_F.next()[0]!=it_Y.next()[0]):
....:         print("wrong")
....:         
....:     
....:

comment:10 Changed 3 years ago by Rajat Mittal

After thinking about the utility of the hash_path not in idx_to_path.values() test , it seems like we don't really require it as per our algorithm as we are working on graphs without multiedges hence the path will no be repeated into the heap. So this test is unnecessary here.

Also the test supporting it is shown in the above comment.

comment:11 Changed 3 years ago by git

Commit: a72071b9fcbbeb96091a43f5ade1596acfa4b2da8c022dd0dee9e6a0c55206d54599dda3a27611de

Branch pushed to git repo; I updated commit sha1. New commits:

8c022ddremoving hash_paths

comment:12 Changed 3 years ago by git

Commit: 8c022dd0dee9e6a0c55206d54599dda3a27611de8085e8c4cc407011796c9ca74250c995b772aa93

Branch pushed to git repo; I updated commit sha1. New commits:

8085e8cmerged with sage 8.9beta6

comment:13 Changed 3 years ago by David Coudert

I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?

comment:14 in reply to:  13 Changed 3 years ago by Rajat Mittal

Replying to dcoudert:

I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?

Actual algorithm never specify to do this test. I added this test thinking that a path may be repeated in the heap, but the way Yen's algorithm unfolds its not possible to add the same path twice.

sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cython
sage: g = graphs.CompleteGraph(5)
sage: it_Y = yen_k_shortest_simple_paths_cython(g, 0, 4)
sage: list(it_Y)
[[0, 4],
 [0, 1, 4],
 [0, 2, 4],
 [0, 3, 4],
 [0, 3, 1, 4],
 [0, 3, 2, 4],
 [0, 2, 1, 4],
 [0, 2, 3, 4],
 [0, 1, 2, 4],
 [0, 1, 3, 4],
 [0, 1, 3, 2, 4],
 [0, 1, 2, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 2, 1, 3, 4],
 [0, 3, 2, 1, 4],
 [0, 3, 1, 2, 4]]

sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: g = graphs.CompleteGraph(5)
sage: it_Y = yen_k_shortest_simple_paths(g, 0, 4)
sage: list(it_Y)
[[0, 4],
 [0, 1, 4],
 [0, 2, 4],
 [0, 3, 4],
 [0, 1, 2, 4],
 [0, 1, 3, 4],
 [0, 2, 1, 4],
 [0, 2, 3, 4],
 [0, 3, 1, 4],
 [0, 3, 2, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 3, 2, 4],
 [0, 2, 1, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 3, 1, 2, 4],
 [0, 3, 2, 1, 4]]

comment:15 Changed 3 years ago by Rajat Mittal

sage: g = g.to_directed()
sage: show(g)
Launched png viewer for Graphics object consisting of 30 graphics primitives
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cython
sage: it_F = feng_k_shortest_simple_paths_cython(g, 0, 4)
sage: list(it_F)
[[0, 4],
 [0, 3, 4],
 [0, 2, 4],
 [0, 1, 4],
 [0, 1, 3, 4],
 [0, 1, 2, 4],
 [0, 2, 3, 4],
 [0, 2, 1, 4],
 [0, 3, 2, 4],
 [0, 3, 1, 4],
 [0, 3, 1, 2, 4],
 [0, 3, 2, 1, 4],
 [0, 2, 1, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 3, 2, 4]]

comment:16 Changed 3 years ago by Rajat Mittal

Status: newneeds_review

comment:17 Changed 3 years ago by git

Commit: 8085e8c4cc407011796c9ca74250c995b772aa935e460d7a61c9087084c8dd1c411ac81b659d4492

Branch pushed to git repo; I updated commit sha1. New commits:

5e460d7trac #28335: review edit Yen

comment:18 Changed 3 years ago by David Coudert

I did some changes in Yen. I think it is cleaner this way. I have not touched Feng. You can certainly try to follow what I did to start improving it.

comment:19 Changed 3 years ago by git

Commit: 5e460d7a61c9087084c8dd1c411ac81b659d44922f41b460be24ebeff833ff05131374c9e1027c5c

Branch pushed to git repo; I updated commit sha1. New commits:

2f41b46Cythonization and edits in Feng's

comment:20 Changed 3 years ago by Rajat Mittal

Following are the comparisons results between: There's a slight improvement observed.

sage: g = DiGraph(graphs.Grid2dGraph(10, 10))
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth
....: on
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho
....: n
sage:  p = yen_k_shortest_simple_paths(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(1000)]
1 loop, best of 3: 1.25 s per loop
sage:  p = yen_k_shortest_simple_paths_cython(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(1000)]
1 loop, best of 3: 1.24 s per loop
sage:  p = feng_k_shortest_simple_paths(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(5000)]
1 loop, best of 3: 4.62 s per loop
sage:  p = feng_k_shortest_simple_paths_cython(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(5000)]
1 loop, best of 3: 4.72 s per loop

sage: g = DiGraph()
sage: for i in range(120):
....: ....:     g.add_edge(i,i+1,i%5)
....:     
sage: for i in range(30, 40, 1):
....: ....:     g.add_edge(i,120,i%2)
....:     
sage:  %timeit list(yen_k_shortest_simple_paths_directed(g, 0, 120, by_weight=Tr
....: ue))
sage:  %timeit list(yen_k_shortest_simple_paths(g, 0, 120, by_weight=True))
10 loops, best of 3: 46.6 ms per loop
sage:  %timeit list(yen_k_shortest_simple_paths_cython(g, 0, 120, by_weight=True
....: ))
10 loops, best of 3: 45.6 ms per loop
sage:  %timeit list(feng_k_shortest_simple_paths_cython(g, 0, 120, by_weight=Tru
....: e))
100 loops, best of 3: 7.62 ms per loop
sage:  %timeit list(feng_k_shortest_simple_paths(g, 0, 120, by_weight=True))
100 loops, best of 3: 7.74 ms per loop

comment:21 Changed 3 years ago by git

Commit: 2f41b460be24ebeff833ff05131374c9e1027c5c4f4c610da59fb1f19aaee48154e5f9f0efd45368

Branch pushed to git repo; I updated commit sha1. New commits:

4f4c610minor edit

comment:22 Changed 3 years ago by David Coudert

Now that #27859 has been merged, it's much easier to check the code ;)

Instead of creating new methods, I propose to replace the code of the previous Yen's method by the code of the Cython version, and to do the same for Feng.

comment:23 Changed 3 years ago by git

Commit: 4f4c610da59fb1f19aaee48154e5f9f0efd45368a9344e892c288b3b1ad22c67aa6d8725353f1500

Branch pushed to git repo; I updated commit sha1. New commits:

a9344e8upgraded to Cython methods for Yen and Feng

comment:24 Changed 3 years ago by David Coudert

Status: needs_reviewpositive_review

LGTM. We have a better code for Yen, and I propose to postpone to another ticket possible improvements for Feng (it should be much faster, but it's not).

comment:25 Changed 3 years ago by git

Commit: a9344e892c288b3b1ad22c67aa6d8725353f1500aac00c3e0005643c0771d85d8decb431e7f13f3c
Status: positive_reviewneeds_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

aac00c3merged with sage 8.9beta7

comment:26 Changed 3 years ago by Rajat Mittal

I think it was necessary to rebase this tkt on 8.9beta7 otherwise some merge conflict might happen. Can you please set it to positive review again.

Last edited 3 years ago by Rajat Mittal (previous) (diff)

comment:27 Changed 3 years ago by David Coudert

Status: needs_reviewpositive_review

It was not necessary ! Firstly, I tested the ticket over beta 7, and secondly this does not prevent possible conflicts with other tickets (depends on the order in which the tickets are merged for next release).

comment:28 Changed 3 years ago by Volker Braun

Branch: public/graphs/28335_Cythonize_Yen_Fengaac00c3e0005643c0771d85d8decb431e7f13f3c
Resolution: fixed
Status: positive_reviewclosed
Note: See TracTickets for help on using tickets.