#28335 closed enhancement (fixed)

Cythonize Yen_k_shortest_simple_paths and feng_k_shortest_simple_paths

Reported by: gh-rajat1433 Owned by:
Priority: major Milestone: sage-8.9
Component: graph theory Keywords: gsoc2019
Cc: dcoudert Merged in:
Authors: Rajat Mittal Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: aac00c3 (Commits) Commit: aac00c3e0005643c0771d85d8decb431e7f13f3c
Dependencies: #27859 Stopgaps:

Description

This ticket aims to implement these methods in Cython for speed up and performance gains.

Change History (28)

comment:1 Changed 18 months ago by gh-rajat1433

  • Authors set to Rajat Mittal
  • Branch set to public/graphs/28335_Cythonize_Yen_Feng

comment:2 Changed 18 months ago by git

  • Commit set to 12ebc7080e850f0dbe71a305448e5f4dca9c062a

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

7871701adding comments
7c9bc30trac #27859: fix merge conflict with 8.9.beta5
3ae912dtrac #27859: add test on random digraphs
82015e2made feng faster
4bdbe63some improvement
fabe87btrac #27859: review edit in Yen
3a2aca3trac #27859: review edit in Feng
034cb7fMerge branch 'public/graphs/27859_enumerate_paths' of git://trac.sagemath.org/sage into public/graphs/27859_enumerate_paths
fcfa3e6removing extra newline
12ebc70cythonizing Yen and Feng

comment:3 Changed 18 months ago by gh-rajat1433

I have cythonized the Yen and Feng's methods here as can be seen from the last commit in the list of above commits.

In short I have used cython variables and cython data structures where ever possible in these methods.

Also I have named the cython methods as yen_k_shortest_simple_paths_cython and feng_k_shortest_simple_paths_cython to separate them from yen_k_shortest_simple_paths and feng_k_shortest_simple_paths for consistency and speed checks. feng_k_shortest_simple_paths Changes will be more clear once #27859 gets included.

Last edited 18 months ago by gh-rajat1433 (previous) (diff)

comment:4 follow-up: Changed 18 months ago by dcoudert

The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.

The list idx_to_path stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combine idx_to_path and heap_paths in a unique dictionary, no ?

comment:5 Changed 18 months ago by git

  • Commit changed from 12ebc7080e850f0dbe71a305448e5f4dca9c062a to c93798417e0f296a13eac1539094b12f2fa75ea0

Branch pushed to git repo; I updated commit sha1. New commits:

c937984Improved with using Dictionary and DOUBLE

comment:6 in reply to: ↑ 4 Changed 18 months ago by gh-rajat1433

Replying to dcoudert:

The way you declare the priority queue is incorrect. This will fail if edge weights are not integers.

The list idx_to_path stores all paths, and you never remove any. So it can become giant. Can't you instead use a dictionary and a counter to assign each new path a unique id, and remove paths that are no longer useful ? In fact, you can certainly combine idx_to_path and heap_paths in a unique dictionary, no ?

I agree with both the statements and made the changes appropriately

comment:7 Changed 18 months ago by dcoudert

Seems correct. The next step is to speed up the test hash_path not in idx_to_path.values(), but I don't have good ideas yet on how to do that.

comment:8 Changed 18 months ago by git

  • Commit changed from c93798417e0f296a13eac1539094b12f2fa75ea0 to a72071b9fcbbeb96091a43f5ade1596acfa4b2da

Branch pushed to git repo; I updated commit sha1. New commits:

a72071bremoval of unncessary tests

comment:9 Changed 18 months ago by gh-rajat1433

sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth
....: on
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho
....: n
sage: sage: G = digraphs.RandomDirectedGNP(30, .05)
....:         sage: while not G.is_strongly_connected():
....:         ....:     G = digraphs.RandomDirectedGNP(30, .1)
....:         sage: for u, v in list(G.edges(labels=False, sort=False)):
....:         ....:     G.set_edge_label(u, v, randint(1, 10))
....:         sage: V = G.vertices()
....:         sage: shuffle(V)
....:         sage: u, v = V[:2]
sage: it_Y = feng_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report
....: _weight=True)
sage: it_F = feng_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight
....: =True)
sage: for i in range(205):
....:     if(it_F.next()[0]!=it_Y.next()[0]):
....:         print("wrong")
....:         
....:     
....: 
sage: it_Y = yen_k_shortest_simple_paths_cython(G, u, v, by_weight=True, report_
....: weight=True)
sage: it_F = yen_k_shortest_simple_paths(G, u, v, by_weight=True, report_weight=
....: True)
sage: for i in range(205):
....:     if(it_F.next()[0]!=it_Y.next()[0]):
....:         print("wrong")
....:         
....:     
....:

comment:10 Changed 18 months ago by gh-rajat1433

After thinking about the utility of the hash_path not in idx_to_path.values() test , it seems like we don't really require it as per our algorithm as we are working on graphs without multiedges hence the path will no be repeated into the heap. So this test is unnecessary here.

Also the test supporting it is shown in the above comment.

comment:11 Changed 18 months ago by git

  • Commit changed from a72071b9fcbbeb96091a43f5ade1596acfa4b2da to 8c022dd0dee9e6a0c55206d54599dda3a27611de

Branch pushed to git repo; I updated commit sha1. New commits:

8c022ddremoving hash_paths

comment:12 Changed 18 months ago by git

  • Commit changed from 8c022dd0dee9e6a0c55206d54599dda3a27611de to 8085e8c4cc407011796c9ca74250c995b772aa93

Branch pushed to git repo; I updated commit sha1. New commits:

8085e8cmerged with sage 8.9beta6

comment:13 follow-up: Changed 18 months ago by dcoudert

I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?

comment:14 in reply to: ↑ 13 Changed 18 months ago by gh-rajat1433

Replying to dcoudert:

I'm not fully convinced by the removal of this test. It is part of the algorithm for a reason. Consider an unweighted graph with many paths from u to v with same cost. For instance a complete graph. A same path might certainly been found several times, no ?

Actual algorithm never specify to do this test. I added this test thinking that a path may be repeated in the heap, but the way Yen's algorithm unfolds its not possible to add the same path twice.

sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cython
sage: g = graphs.CompleteGraph(5)
sage: it_Y = yen_k_shortest_simple_paths_cython(g, 0, 4)
sage: list(it_Y)
[[0, 4],
 [0, 1, 4],
 [0, 2, 4],
 [0, 3, 4],
 [0, 3, 1, 4],
 [0, 3, 2, 4],
 [0, 2, 1, 4],
 [0, 2, 3, 4],
 [0, 1, 2, 4],
 [0, 1, 3, 4],
 [0, 1, 3, 2, 4],
 [0, 1, 2, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 2, 1, 3, 4],
 [0, 3, 2, 1, 4],
 [0, 3, 1, 2, 4]]

sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: g = graphs.CompleteGraph(5)
sage: it_Y = yen_k_shortest_simple_paths(g, 0, 4)
sage: list(it_Y)
[[0, 4],
 [0, 1, 4],
 [0, 2, 4],
 [0, 3, 4],
 [0, 1, 2, 4],
 [0, 1, 3, 4],
 [0, 2, 1, 4],
 [0, 2, 3, 4],
 [0, 3, 1, 4],
 [0, 3, 2, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 3, 2, 4],
 [0, 2, 1, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 3, 1, 2, 4],
 [0, 3, 2, 1, 4]]

comment:15 Changed 18 months ago by gh-rajat1433

sage: g = g.to_directed()
sage: show(g)
Launched png viewer for Graphics object consisting of 30 graphics primitives
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cython
sage: it_F = feng_k_shortest_simple_paths_cython(g, 0, 4)
sage: list(it_F)
[[0, 4],
 [0, 3, 4],
 [0, 2, 4],
 [0, 1, 4],
 [0, 1, 3, 4],
 [0, 1, 2, 4],
 [0, 2, 3, 4],
 [0, 2, 1, 4],
 [0, 3, 2, 4],
 [0, 3, 1, 4],
 [0, 3, 1, 2, 4],
 [0, 3, 2, 1, 4],
 [0, 2, 1, 3, 4],
 [0, 2, 3, 1, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 3, 2, 4]]

comment:16 Changed 18 months ago by gh-rajat1433

  • Status changed from new to needs_review

comment:17 Changed 18 months ago by git

  • Commit changed from 8085e8c4cc407011796c9ca74250c995b772aa93 to 5e460d7a61c9087084c8dd1c411ac81b659d4492

Branch pushed to git repo; I updated commit sha1. New commits:

5e460d7trac #28335: review edit Yen

comment:18 Changed 18 months ago by dcoudert

I did some changes in Yen. I think it is cleaner this way. I have not touched Feng. You can certainly try to follow what I did to start improving it.

comment:19 Changed 17 months ago by git

  • Commit changed from 5e460d7a61c9087084c8dd1c411ac81b659d4492 to 2f41b460be24ebeff833ff05131374c9e1027c5c

Branch pushed to git repo; I updated commit sha1. New commits:

2f41b46Cythonization and edits in Feng's

comment:20 Changed 17 months ago by gh-rajat1433

Following are the comparisons results between: There's a slight improvement observed.

sage: g = DiGraph(graphs.Grid2dGraph(10, 10))
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths_cyth
....: on
sage: from sage.graphs.path_enumeration import feng_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths
sage: from sage.graphs.path_enumeration import yen_k_shortest_simple_paths_cytho
....: n
sage:  p = yen_k_shortest_simple_paths(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(1000)]
1 loop, best of 3: 1.25 s per loop
sage:  p = yen_k_shortest_simple_paths_cython(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(1000)]
1 loop, best of 3: 1.24 s per loop
sage:  p = feng_k_shortest_simple_paths(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(5000)]
1 loop, best of 3: 4.62 s per loop
sage:  p = feng_k_shortest_simple_paths_cython(g, (0, 0), (9, 9))
sage: %timeit [p.next() for i in range(5000)]
1 loop, best of 3: 4.72 s per loop

sage: g = DiGraph()
sage: for i in range(120):
....: ....:     g.add_edge(i,i+1,i%5)
....:     
sage: for i in range(30, 40, 1):
....: ....:     g.add_edge(i,120,i%2)
....:     
sage:  %timeit list(yen_k_shortest_simple_paths_directed(g, 0, 120, by_weight=Tr
....: ue))
sage:  %timeit list(yen_k_shortest_simple_paths(g, 0, 120, by_weight=True))
10 loops, best of 3: 46.6 ms per loop
sage:  %timeit list(yen_k_shortest_simple_paths_cython(g, 0, 120, by_weight=True
....: ))
10 loops, best of 3: 45.6 ms per loop
sage:  %timeit list(feng_k_shortest_simple_paths_cython(g, 0, 120, by_weight=Tru
....: e))
100 loops, best of 3: 7.62 ms per loop
sage:  %timeit list(feng_k_shortest_simple_paths(g, 0, 120, by_weight=True))
100 loops, best of 3: 7.74 ms per loop

comment:21 Changed 17 months ago by git

  • Commit changed from 2f41b460be24ebeff833ff05131374c9e1027c5c to 4f4c610da59fb1f19aaee48154e5f9f0efd45368

Branch pushed to git repo; I updated commit sha1. New commits:

4f4c610minor edit

comment:22 Changed 17 months ago by dcoudert

Now that #27859 has been merged, it's much easier to check the code ;)

Instead of creating new methods, I propose to replace the code of the previous Yen's method by the code of the Cython version, and to do the same for Feng.

comment:23 Changed 17 months ago by git

  • Commit changed from 4f4c610da59fb1f19aaee48154e5f9f0efd45368 to a9344e892c288b3b1ad22c67aa6d8725353f1500

Branch pushed to git repo; I updated commit sha1. New commits:

a9344e8upgraded to Cython methods for Yen and Feng

comment:24 Changed 17 months ago by dcoudert

  • Status changed from needs_review to positive_review

LGTM. We have a better code for Yen, and I propose to postpone to another ticket possible improvements for Feng (it should be much faster, but it's not).

comment:25 Changed 17 months ago by git

  • Commit changed from a9344e892c288b3b1ad22c67aa6d8725353f1500 to aac00c3e0005643c0771d85d8decb431e7f13f3c
  • Status changed from positive_review to needs_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

aac00c3merged with sage 8.9beta7

comment:26 Changed 17 months ago by gh-rajat1433

I think it was necessary to rebase this tkt on 8.9beta7 otherwise some merge conflict might happen. Can you please set it to positive review again.

Last edited 17 months ago by gh-rajat1433 (previous) (diff)

comment:27 Changed 17 months ago by dcoudert

  • Status changed from needs_review to positive_review

It was not necessary ! Firstly, I tested the ticket over beta 7, and secondly this does not prevent possible conflicts with other tickets (depends on the order in which the tickets are merged for next release).

comment:28 Changed 17 months ago by vbraun

  • Branch changed from public/graphs/28335_Cythonize_Yen_Feng to aac00c3e0005643c0771d85d8decb431e7f13f3c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.