Opened 3 years ago

Closed 3 years ago

#28220 closed enhancement (fixed)

Cythonize all_simple_paths, all_paths_iterator and _all_paths_iterator

Reported by: Rajat Mittal Owned by:
Priority: major Milestone: sage-8.9
Component: graph theory Keywords: gsoc19
Cc: David Coudert Merged in:
Authors: Rajat Mittal Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: 20687b9 (Commits, GitHub, GitLab) Commit: 20687b94f822ea728c1fe3ca4c0d27451505c119
Dependencies: #27859 Stopgaps:

Status badges

Description

This ticket aims to cythonize the above methods using cython variables and cython data structures to make them fast and efficient. It depends on #27859 in which these methods were moved to separate file path_enumeration.pyx

Change History (33)

comment:1 Changed 3 years ago by David Coudert

Create distinct methods to make sure you can compare results.

comment:2 Changed 3 years ago by Rajat Mittal

Branch: public/graphs/28220_cythonize_paths

comment:3 Changed 3 years ago by git

Commit: 7e193cf9803387f4368bdee444766154c81f2872

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

6eb771dmerge conflict solved
8a22e2eupdated
eab8dbaremoved bugs, worked on a copy of original graph, added a doctest
32d1bc8added exclude vertices
fbc0faafix bugs in feng
ced65d4fix case of no paths bw vertices in Feng
dc98836correct pep8
362ff4bbased on #28221
f02d6e5trac #27859 merged with 8.9beta4
7e193cfcythonized using cython data structures

comment:4 Changed 3 years ago by Rajat Mittal

Status: newneeds_review

comment:5 Changed 3 years ago by git

Commit: 7e193cf9803387f4368bdee444766154c81f287215180791edd17ca1045b90c2edccb079dfc8176f

Branch pushed to git repo; I updated commit sha1. New commits:

1518079improved cython code

comment:6 Changed 3 years ago by David Coudert

Can you summarize what you did and what else you plan to do. It's not easy to see from the code (ok, the last commit is clear).

comment:7 Changed 3 years ago by Rajat Mittal

I have added three new methods all_simple_paths_cython, all_paths_iterator_cython and _all_paths_iterator_cython and used the priority_queue data structure instead of python heap. I have also used cython variable declarations. I have tested the time complexity and the improvement in time is very minute. I am looking for more cythonic style improvements. If you have any in mind w.r.t. these methods it will be helpful here.

Last edited 3 years ago by Rajat Mittal (previous) (diff)

comment:8 Changed 3 years ago by Rajat Mittal

Also I have based it on top of #27859.

comment:9 Changed 3 years ago by git

Commit: 15180791edd17ca1045b90c2edccb079dfc8176f8c4a92a8972b1503ce129468e8c77f18053431f7

Branch pushed to git repo; I updated commit sha1. New commits:

5839ca6improvements
8c4a92amerged with latest version of #27859

comment:10 Changed 3 years ago by Rajat Mittal

sage: g = digraphs.Paley(11)
sage: %timeit l = g.all_simple_paths(starting_vertices=[0], ending_vertices=[9],
....:  trivial=False)
1 loop, best of 3: 737 ms per loop
sage: 
sage: %timeit l = g.all_simple_paths_cython(starting_vertices=[0], ending_vertic
....: es=[9], trivial=False)
1 loop, best of 3: 722 ms per loop

sage: d = digraphs.DeBruijn(3, 2)
sage: %timeit l = d.all_simple_paths_cython(starting_vertices=['00'], ending_ver
....: tices=['22'], trivial=False)
100 loops, best of 3: 2.16 ms per loop
sage: %timeit l = d.all_simple_paths(starting_vertices=['00'], ending_vertices=[
....: '22'], trivial=False)
100 loops, best of 3: 2.44 ms per loop

comment:11 Changed 3 years ago by git

Commit: 8c4a92a8972b1503ce129468e8c77f18053431f734b0e95f2c7aa82e9588fd8eaee641f1ab9c534c

Branch pushed to git repo; I updated commit sha1. New commits:

34b0e95corrected

comment:12 Changed 3 years ago by git

Commit: 34b0e95f2c7aa82e9588fd8eaee641f1ab9c534c85be894df27f726dca2d79ee1bae6347fcb9e504

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

f503ec8fixed potetial bugs in feng's
7871701adding comments
7c9bc30trac #27859: fix merge conflict with 8.9.beta5
3ae912dtrac #27859: add test on random digraphs
82015e2made feng faster
4bdbe63some improvement
fabe87btrac #27859: review edit in Yen
3a2aca3trac #27859: review edit in Feng
034cb7fMerge branch 'public/graphs/27859_enumerate_paths' of git://trac.sagemath.org/sage into public/graphs/27859_enumerate_paths
85be894merged with #27859

comment:13 Changed 3 years ago by David Coudert

I'm having hard time identifying the changes done in this ticket, so I will wait for the inclusion of #27859 before starting to review it.

comment:14 Changed 3 years ago by git

Commit: 85be894df27f726dca2d79ee1bae6347fcb9e504686a738f0cff7bec785d7a4b6e6779306426c99b

Branch pushed to git repo; I updated commit sha1. New commits:

686a738minor changes

comment:15 Changed 3 years ago by git

Commit: 686a738f0cff7bec785d7a4b6e6779306426c99ba0ccabf40cc7693b52207abcebf190054a965d8a

Branch pushed to git repo; I updated commit sha1. New commits:

fcfa3e6removing extra newline
a0ccabfmerged with latest version of #27859

comment:16 Changed 3 years ago by git

Commit: a0ccabf40cc7693b52207abcebf190054a965d8aee12bd0cb01ac6e819e94e1a4067ab272e8a9861

Branch pushed to git repo; I updated commit sha1. New commits:

ee12bd0merged with sage 8.9beta6

comment:17 Changed 3 years ago by David Coudert

You should do as for #28335: modify existing methods instead of creating new ones. There are no big changes in the methods justifying the creation of new methods.

comment:18 Changed 3 years ago by git

Commit: ee12bd0cb01ac6e819e94e1a4067ab272e8a9861f1bb530ba027a2f2a1c7bf978ad7b24cb759b709

Branch pushed to git repo; I updated commit sha1. New commits:

1c97907merged with sage 8.9beta7
f1bb530Merged Methods

comment:19 Changed 3 years ago by Rajat Mittal

I had created new methods to compare the time complexities etc. Now the methods have been merged and written in Cython.

comment:20 in reply to:  19 Changed 3 years ago by David Coudert

Status: needs_reviewneeds_work

Replying to gh-rajat1433:

I had created new methods to compare the time complexities etc.

Right, I did not though about it. Sorry.

Now the methods have been merged and written in Cython.

Well, if I understand well what you did, you simply declared the type of some variables and used a c++ priority queue instead of a Python queue at the cost of storing all reported paths in a list. This is not good enough. and I disagree with the current use of idx_to_path as this list may become giant.

Can't you try to identify first which parts of the methods are critical and could certainly be improved ?

comment:21 Changed 3 years ago by git

Commit: f1bb530ba027a2f2a1c7bf978ad7b24cb759b709c3946c6033379c09e1c3cd0c2ea5fefabe59127c

Branch pushed to git repo; I updated commit sha1. New commits:

c3946c6used dict instead of a list

comment:22 Changed 3 years ago by Rajat Mittal

all_paths_iterator methods calls _all_paths_iterator method to get all the paths starting from a particular vertex and ending at a list of ending vertices. But as we see the code of _all_paths_iterator method critical part there seems to be to get to the ending vertex from start but parameter max_length suggests to go step by step to ending vertex which the algorithm does. It explores all the paths to the ending vertices by putting them in the queue.

I don't see much of a scope of improvement from algorithm point of view right now in these methods. If you have any ideas , please let me know. But these methods are currently in cython now, in alignment with other methods of this module path_enumeration.pyx and faster.

comment:23 Changed 3 years ago by David Coudert

There is a merge conflict with last beta.

comment:24 Changed 3 years ago by git

Commit: c3946c6033379c09e1c3cd0c2ea5fefabe59127c6c65e63ced41653dbe91320d81281c02621ab17d

Branch pushed to git repo; I updated commit sha1. New commits:

6c65e63resolved merge conflict

comment:25 Changed 3 years ago by Rajat Mittal

Status: needs_workneeds_review

comment:26 Changed 3 years ago by Rajat Mittal

I have rebased it on 8.9 beta8 and resolved the merge conflict.

comment:27 Changed 3 years ago by git

Commit: 6c65e63ced41653dbe91320d81281c02621ab17d20687b94f822ea728c1fe3ca4c0d27451505c119

Branch pushed to git repo; I updated commit sha1. New commits:

20687b9trac #28220: fix doctest with Python 3

comment:28 Changed 3 years ago by David Coudert

I changed 2 doctests to make them compatible with Python 3 (were failing when testing with py3).You may know that we are in the transition to py3 and so some of us are running both a version compiled with py2 and another with py3. For some data structures, like dict, the order in which items are "ordered" change. In this case, it was changing the order in which paths where yielded. The solution was correct nonetheless. With the change I made, it's more robust.

comment:29 Changed 3 years ago by David Coudert

A good news:

with this patch:

sage: g = digraphs.Complete(6)
sage: %time _ = g.all_simple_paths()
CPU times: user 24.9 ms, sys: 1.73 ms, total: 26.7 ms
Wall time: 25.7 ms

without:

sage: g = digraphs.Complete(6)
sage: %time _ = g.all_simple_paths()
CPU times: user 53 ms, sys: 8.88 ms, total: 61.9 ms
Wall time: 55.8 ms

comment:30 Changed 3 years ago by David Coudert

A bad news: method all_simple_paths uses all_paths_iterator, and this method claim that The paths are enumerated in increasing length order. However, this is not the case (with and without this patch).

sage: g = digraphs.Complete(3)
sage: g.all_simple_paths()
[[0, 1],
 [1, 0],
 [2, 0],
 [0, 1, 0],
 [0, 2],
 [0, 2, 0],
 [0, 1, 2],
 [1, 0, 1],
 [1, 2],
 [1, 2, 1],
 [1, 0, 2],
 [2, 0, 2],
 [2, 1],
 [2, 1, 2],
 [2, 0, 1],
 [0, 1, 2, 0],
 [0, 2, 1],
 [0, 2, 1, 0],
 [1, 0, 2, 1],
 [1, 2, 0],
 [1, 2, 0, 1],
 [2, 0, 1, 2],
 [2, 1, 0],
 [2, 1, 0, 2]]

What if we used a priority queue instead of a queue in _all_paths_iterator ? As this behavior seems intependent of the changes done here, this can be post-pone to a future patch.

comment:31 Changed 3 years ago by Rajat Mittal

It seems that when starting_vertices and ending_vertices are not single vertex then these paths are not in increasing no. of edges.

"What if we used a priority queue instead of a queue in _all_paths_iterator ? As this behavior seems intependent of the changes done here, this can be post-pone to a future patch. "

I tried it but still the results are not in increasing order of edges.

We can fix it in next patch either to reorganize the algorithm to print paths in order of increasing number of edges or stating in simple paths that edges will be in increasing order in case of a single src and destination.

comment:32 Changed 3 years ago by David Coudert

Status: needs_reviewpositive_review

OK, let's postpone the ordering issue to a specific ticket. You can investigate that before opening a ticket.

LGTM.

comment:33 Changed 3 years ago by Volker Braun

Branch: public/graphs/28220_cythonize_paths20687b94f822ea728c1fe3ca4c0d27451505c119
Resolution: fixed
Status: positive_reviewclosed
Note: See TracTickets for help on using tickets.