Opened 7 years ago
Closed 7 years ago
#18938 closed defect (fixed)
Refactor shortest paths
Reported by:  borassi  Owned by:  

Priority:  major  Milestone:  sage6.9 
Component:  graph theory  Keywords:  Shortest path, eccentricity, Dijkstra 
Cc:  ncohen, dcoudert  Merged in:  
Authors:  Michele Borassi  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  3f371f8 (Commits, GitHub, GitLab)  Commit:  3f371f844ecd2667b268d74fa258787f3c6da1c4 
Dependencies:  Stopgaps: 
Description (last modified by )
At the moment, there are several methods that compute shortest paths. However, there is no standard for inputs and outputs of these methods:
 some of them only work with unweighted graphs (distance_all_pairs(), distances_distribution(), eccentricity());
 some of them use default_weight (shortest_path_all_pairs), while others do not check weights (shortest_paths)
 some of them output paths, while others output predecessors (shortest_path_all_pairs).
The goal of this ticket is to standardize all these behaviors, and to make all routines work also with weighted graphs. The schema of the routine calls I would like to implement is attached (some calls are already present).
As a next step, we will include some Boost shortest path algorithms (BellmanFord, Johnson), with ticket #18931.
Attachments (1)
Change History (29)
Changed 7 years ago by
comment:1 Changed 7 years ago by
 Cc ncohen dcoudert added
 Component changed from PLEASE CHANGE to graph theory
 Description modified (diff)
 Keywords Shortest path eccentricity Dijkstra added
 Type changed from PLEASE CHANGE to defect
comment:2 Changed 7 years ago by
This is very challenging but definitely useful.
comment:3 Changed 7 years ago by
 Branch set to u/borassi/refactor_shortest_paths
comment:4 Changed 7 years ago by
 Commit set to a8d22178cff80ffe9da8d0b46c15ec30fe2d3a87
Hello!
This patch is clearly not the final version: however, before proceeding, I would like to know if you like how I defined the "standard" for shortest path functions. In particular, the new input variables are:
 by_weight: if True, the graph is considered weighted (needed for backward compatibility);
 algorithm: the algorithm (Dijkstra, BFS, etc);
 weight_function: the function that provides weights (if None, the label is used).
The result of weight_function must always be convertible to a float, otherwise exceptions are raised.
You can see the result of the refactoring of the function shortest_paths, with comments. Any feedback will be appreciated!
Best,
Michele
PS: is it true that Dijkstra is not implemented in Sage, and we rely on NetworkX? In this case, I should also insert Boost Dijkstra for efficiency (but this will be done in a subsequent ticket).
New commits:
dca3781  Temp

08e9daa  Temp

a8d2217  Modified inputs for shortest_paths routine

comment:5 followup: ↓ 7 Changed 7 years ago by
What's the proposed behavior when by_weight==False
?
Otherwise I agree with what you propose.
There is a bidirectional_dijkstra
method in src/sage/graphs/base/c_graph.pyx
. I don't know if it is used or not...
David.
comment:6 followup: ↓ 8 Changed 7 years ago by
Hellooooooooooooooo Michele,
Several remarks on your branch:
``by_weight``  if ``True``, the graph is considered weighted.
Could you be more specific by saying that the *edges* are considered to be weighted?
 Documentation of
algorithm
: could you document the default value and its behaviour?
 This sounds like a dangerous behaviour:
``weight_function`` (function)  used only if ``by_weight==True``;
If somebody callsg.shortest_paths(v,weight_function=my_function)
, we know for sure that (s)he wants the edges to be considered. Why shouldn't we defineby_weight
toTrue
whenweight_function is not None
?
 Documentation of
weight_function
: here is an attempt at making the following paragraph a bit shorter.Thea function that inputs an edge ``e`` and outputs its weight. An edge has the form ``(u,v,l)``, where ``u`` and ``v`` are vertices, ``l`` is a label (that can be of any kind). The ``weight_function`` can be used to transform the label into a weight. In particular: +a function that takes a labelled edge `(u,v,l)` as input and returns +its weight. If set to `None` when `by_weight=True`, the label `l` is +used as the edge's weight.
is_weighted
flag is not used much in Sage, and I believe that it should be removed in the long term: You never know if it is about edge/vertex weights
 Setting a graph to be "weighted" does not check anything about the vality of weights
 You may want one function to consider the graph as weighted and not another one. Changing an attribute in between is not pratical, and is actually impossible for immutable graphs.
 I do not think necessary to document which exceptions are raised when the input is bad, e.g. conversion to float. Do what you want.
 You add many arguments to this function: they should be tested in the function's doctest.
Good evening,
Nathann
comment:7 in reply to: ↑ 5 Changed 7 years ago by
There is a
bidirectional_dijkstra
method insrc/sage/graphs/base/c_graph.pyx
. I don't know if it is used or not...
It is used in GenericGraph.shortest_path
.
Nathann
comment:8 in reply to: ↑ 6 ; followup: ↓ 10 Changed 7 years ago by
Hi!
I tried to address your remarks, and I will upload the new version in a few moments. I have also started working on the other methods: still, the code is not clean, so I think you should only review the methods shortest_paths
and check_weight_function
(even if the whole code should be correct, because all tests are passed).
One question: since most of the arguments are the same for several routines, is there a way to avoid copypaste in the documentation?
Thank you very much! Good evening,
Michele
What's the proposed behavior when by_weight==False?
Added to the documentation
There is a bidirectional_dijkstra method in src/sage/graphs/base/c_graph.pyx.
Yes, but there is no "standard" Dijkstra... Maybe, if I have time before the end of the project, I will add one.
 ``by_weight``  if ``True``, the graph is considered weighted. Could you be more specific by saying that the *edges* are considered to be weighted?
Done
 Documentation of
algorithm
: could you document the default value and itsbehaviour?
Done
 This sounds like a dangerous behaviour:
``weight_function`` (function)  used only if ``by_weight==True``;
If somebody callsg.shortest_paths(v,weight_function=my_function)
, we know for sure that (s)he wants the edges to be considered. Why shouldn't we defineby_weight
toTrue
whenweight_function is not None
?
Done
 Documentation of
weight_function
: here is an attempt at making the followingparagraph a bit shorter.
a function that inputs an edge e and outputs its weight. An edge has the form (u,v,l), where u and v are vertices, l is a label (that can be of any kind). The weight_function can be used to transform the label into a weight. In particular: +a function that takes a labelled edge `(u,v,l)` as input and returns +its weight. If set to `None` when `by_weight=True`, the label `l` is +used as the edge's weight.
I have modified heavily the paragraph, taking inspiration from your proposal.
The
is_weighted
flag is not used much in Sage, and I believe that it should be removed in the long term:
 You never know if it is about edge/vertex weights
 Setting a graph to be "weighted" does not check anything about the vality of
weights
 You may want one function to consider the graph as weighted and not another
one. Changing an attribute in between is not pratical, and is actually impossible for immutable graphs.
I removed all references to is_weighted: now I just check that the weight function outputs numbers
 I do not think necessary to document which exceptions are raised when the
input is bad, e.g. conversion to float. Do what you want.
Done!
 You add many arguments to this function: they should be tested in the
function's doctest.
Added some doctests.
comment:9 Changed 7 years ago by
 Commit changed from a8d22178cff80ffe9da8d0b46c15ec30fe2d3a87 to dca691b5114e24d8a157a8b0bf7f92b2275a87e0
Branch pushed to git repo; I updated commit sha1. New commits:
dca691b  Good quality shortest_paths, implemented other routines

comment:10 in reply to: ↑ 8 Changed 7 years ago by
Hello,
I think you should only review the methods
shortest_paths
andcheck_weight_function
The second function should be a hidden one, i.e. _check_weight_function
. It is only a checking routine that we need internally. I am always wary of these routines by the way, especially when they apply to lineartime algorithms....
Anyawy. You should probably replace .edges()
by .edge_iterator()
in there, for otherwise the first line of the loop is actually as time/space consuming as a full graph copy (even worse actually, as it is pure python stuff).
One question: since most of the arguments are the same for several routines, is there a way to avoid copypaste in the documentation?
No magical way that I know. It's either this or "see the doc of <X> for more inforamtion". Or you can have a **kwds
instead of copy/pasting the arguments in the function's prototype and foward them all to the subcall, saying that "all arguments of function <X> (see its doc) are accepted". No magic, really.
If ``None`` and ``by_weight`` is ``True``, we output ``l`` by default.
We output 'l'? What about 'we use the edge's label as a weight'?
Nathann
comment:11 followup: ↓ 12 Changed 7 years ago by
Hello,
In method check_weight_function
Checks that an edge weight function outputs only numbers.
>Check that an edge weight function outputs only numbers.
 The title of the example
A wrong weight function::
must be changed. The weight function is correct but not the edge label. for e in self.edges():
>for e in self.edge_iterator():
?
In method shortest_paths
If the weight function is wrong::
same as above
Also I'm wondering if we may have conflicts with tickets #18868 about memory allocation and #18864 on eccentricity
David.
comment:12 in reply to: ↑ 11 Changed 7 years ago by
comment:13 Changed 7 years ago by
I'm not sure I understand what you have done...
comment:14 Changed 7 years ago by
comment:15 Changed 7 years ago by
 Commit changed from dca691b5114e24d8a157a8b0bf7f92b2275a87e0 to d4ceebf070104da6401a5331242a698df1c727b2
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
80cb899  trac #18864: correct version

ce9d265  trac #18864: return Sage integers

a1b36da  trac #18864: Merged with 6.8.rc1

2723175  trac #18868: a MemoryAllocator object for easier Cython memory management

bf39672  trac #18868: back to calloc

9597eec  trac #18868: Changes to MemoryAllocator

f514301  Optimize MemoryAllocator and add allocarray()

39f8839  Fix exception handling

0304d9f  trac #18868: Merged with #18864

d4ceebf  Merged with #18868.

comment:16 Changed 7 years ago by
Helloooooooo!
I have applied all your suggestions, and I have merged this ticket with #18868 in order to avoid conflicts.
Soon I will modify the other methods, and finally ask for a review!
See you, Michele
comment:17 Changed 7 years ago by
great !
comment:18 Changed 7 years ago by
 Commit changed from d4ceebf070104da6401a5331242a698df1c727b2 to 38c10d43143d985adbb57fcb514f3483ee30941d
comment:19 Changed 7 years ago by
Hello!
I have a problem building the documentation in this branch, and I really have no idea what's going on (I lost the whole morning trying to fix it, with no success). Could you help me, at least by telling me what does the following log mean? The code is in this ticket...
Error building the documentation. Traceback (most recent call last): File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 1626, in <module> getattr(get_builder(name), type)() File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 292, in _wrapper getattr(get_builder(document), 'inventory')(*args, **kwds) File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 503, in _wrapper x.get(99999) File "/home/michele/Programmi/sage/local/lib/python/multiprocessing/pool.py", line 558, in get raise self._value OSError: [graphs ] /home/michele/Programmi/sage/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.graph_isom_equivalent_non_edge_labeled_graph:25: WARNING: Explicit markup ends without a blank line; unexpected unindent. Makefile:742: recipe for target 'dochtml' failed make[2]: *** [dochtml] Error 1 make[2]: Leaving directory '/home/michele/Programmi/sage/build/make' Makefile:563: recipe for target 'all' failed make[1]: *** [all] Error 2 make[1]: Leaving directory '/home/michele/Programmi/sage/build/make'
Thank you very much!
Michele
comment:20 Changed 7 years ago by
After performing the following changes, I'm able to compile the doc.
 In
_path_length
, changeWARNING: if the graph is unweighted, the algorithm does not check that the path exists. Moreover, also if the weight_function does not return a number, an error is raised.
To
.. WARNING:: If the graph is unweighted, the algorithm does not check that the path exists. Moreover, also if the weight_function does not return a number, an error is raised.
The WARNING:
should declare a block, so may be you should insert space/blank line (not sure).
 In
wiener_index
, change.. [KRG96b] S. Klavzar, A. Rajapakse, and I. Gutman. The Szeged and the Wiener index of graphs. *Applied Mathematics Letters*, 9(5):4549, 1996.
to
.. [KRG96b] S. Klavzar, A. Rajapakse, and I. Gutman. The Szeged and the Wiener index of graphs. *Applied Mathematics Letters*, 9(5):4549, 1996.
comment:21 Changed 7 years ago by
 Commit changed from 38c10d43143d985adbb57fcb514f3483ee30941d to 760ea8746d52a8d9fbfb22621ae7063ab93c2ceb
comment:22 Changed 7 years ago by
 Status changed from new to needs_review
Finally, I have a version of the refactoring that might be reviewed. Hope you like it!
comment:23 Changed 7 years ago by
 Commit changed from 760ea8746d52a8d9fbfb22621ae7063ab93c2ceb to 5c7856142001699dbe00ad659a0ea95b66bedbf8
comment:24 Changed 7 years ago by
 Status changed from needs_review to needs_work
Hello,
The patch passes all long tests, but I have a problem when building the documentation (duplicate citation KRG96b). I don't know which is the best solution. You could eventually point to the other module documentation instead of repeating citation.
[graphs ] /Users/dcoudert/sage/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.GenericGraph.wiener_index:72: WARNING: duplicate citation KRG96b, other instance in /Users/dcoudert/sage/src/doc/en/reference/graphs/sage/graphs/distances_all_pairs.rst Error building the documentation. Traceback (most recent call last): File "/Users/dcoudert/sage/src/doc/common/builder.py", line 1626, in <module> getattr(get_builder(name), type)() File "/Users/dcoudert/sage/src/doc/common/builder.py", line 292, in _wrapper getattr(get_builder(document), 'inventory')(*args, **kwds) File "/Users/dcoudert/sage/src/doc/common/builder.py", line 503, in _wrapper x.get(99999) File "/Users/dcoudert/sage/local/lib/python/multiprocessing/pool.py", line 558, in get raise self._value OSError: [graphs ] /Users/dcoudert/sage/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.GenericGraph.wiener_index:72: WARNING: duplicate citation KRG96b, other instance in /Users/dcoudert/sage/src/doc/en/reference/graphs/sage/graphs/distances_all_pairs.rst make[1]: *** [dochtml] Error 1
David.
comment:25 Changed 7 years ago by
 Commit changed from 5c7856142001699dbe00ad659a0ea95b66bedbf8 to 3f371f844ecd2667b268d74fa258787f3c6da1c4
Branch pushed to git repo; I updated commit sha1. New commits:
3f371f8  Removed duplicate reference

comment:26 Changed 7 years ago by
 Status changed from needs_work to needs_review
I removed the duplicate reference: now the citation refers to the other module.
comment:27 Changed 7 years ago by
 Milestone changed from sage6.8 to sage6.9
 Reviewers set to David Coudert
 Status changed from needs_review to positive_review
Hello,
now the doc build properly and it looks fine. For me this ticket is now good to go.
Best, David.
comment:28 Changed 7 years ago by
 Branch changed from u/borassi/refactor_shortest_paths to 3f371f844ecd2667b268d74fa258787f3c6da1c4
 Resolution set to fixed
 Status changed from positive_review to closed
Dependencies between shortest path routines