Opened 7 years ago

Closed 7 years ago

#18938 closed defect (fixed)

Refactor shortest paths

Reported by: borassi Owned by:
Priority: major Milestone: sage-6.9
Component: graph theory Keywords: Shortest path, eccentricity, Dijkstra
Cc: ncohen, dcoudert Merged in:
Authors: Michele Borassi Reviewers: David Coudert
Report Upstream: N/A Work issues:
Branch: 3f371f8 (Commits, GitHub, GitLab) Commit: 3f371f844ecd2667b268d74fa258787f3c6da1c4
Dependencies: Stopgaps:

Status badges

Description (last modified by borassi)

At the moment, there are several methods that compute shortest paths. However, there is no standard for inputs and outputs of these methods:

  • some of them only work with unweighted graphs (distance_all_pairs(), distances_distribution(), eccentricity());
  • some of them use default_weight (shortest_path_all_pairs), while others do not check weights (shortest_paths)
  • some of them output paths, while others output predecessors (shortest_path_all_pairs).

The goal of this ticket is to standardize all these behaviors, and to make all routines work also with weighted graphs. The schema of the routine calls I would like to implement is attached (some calls are already present).

As a next step, we will include some Boost shortest path algorithms (Bellman-Ford, Johnson), with ticket #18931.

Attachments (1)

Dependencies.png (52.2 KB) - added by borassi 7 years ago.
Dependencies between shortest path routines

Download all attachments as: .zip

Change History (29)

Changed 7 years ago by borassi

Dependencies between shortest path routines

comment:1 Changed 7 years ago by borassi

  • Authors set to Michele Borassi
  • Cc ncohen dcoudert added
  • Component changed from PLEASE CHANGE to graph theory
  • Description modified (diff)
  • Keywords Shortest path eccentricity Dijkstra added
  • Type changed from PLEASE CHANGE to defect

comment:2 Changed 7 years ago by dcoudert

This is very challenging but definitely useful.

comment:3 Changed 7 years ago by borassi

  • Branch set to u/borassi/refactor_shortest_paths

comment:4 Changed 7 years ago by borassi

  • Commit set to a8d22178cff80ffe9da8d0b46c15ec30fe2d3a87

Hello!

This patch is clearly not the final version: however, before proceeding, I would like to know if you like how I defined the "standard" for shortest path functions. In particular, the new input variables are:

  • by_weight: if True, the graph is considered weighted (needed for backward compatibility);
  • algorithm: the algorithm (Dijkstra, BFS, etc);
  • weight_function: the function that provides weights (if None, the label is used).

The result of weight_function must always be convertible to a float, otherwise exceptions are raised.

You can see the result of the refactoring of the function shortest_paths, with comments. Any feedback will be appreciated!

Best,

Michele

PS: is it true that Dijkstra is not implemented in Sage, and we rely on NetworkX? In this case, I should also insert Boost Dijkstra for efficiency (but this will be done in a subsequent ticket).


New commits:

dca3781Temp
08e9daaTemp
a8d2217Modified inputs for shortest_paths routine

comment:5 follow-up: Changed 7 years ago by dcoudert

What's the proposed behavior when by_weight==False? Otherwise I agree with what you propose.

There is a bidirectional_dijkstra method in src/sage/graphs/base/c_graph.pyx. I don't know if it is used or not...

David.

comment:6 follow-up: Changed 7 years ago by ncohen

Hellooooooooooooooo Michele,

Several remarks on your branch:

  • ``by_weight`` - if ``True``, the graph is considered weighted. Could you be more specific by saying that the *edges* are considered to be weighted?
  • Documentation of algorithm: could you document the default value and its behaviour?
  • This sounds like a dangerous behaviour: ``weight_function`` (function) - used only if ``by_weight==True``; If somebody calls g.shortest_paths(v,weight_function=my_function), we know for sure that (s)he wants the edges to be considered. Why shouldn't we define by_weight to True when weight_function is not None?
  • Documentation of weight_function: here is an attempt at making the following paragraph a bit shorter.
    -a function that inputs an edge ``e`` and outputs its weight. An edge
    -has the form ``(u,v,l)``, where ``u`` and ``v`` are vertices, ``l`` is
    -a label (that can be of any kind). The ``weight_function`` can be used
    -to transform the label into a weight. In particular:
    +a function that takes a labelled edge `(u,v,l)` as input and returns
    +its weight. If set to `None` when `by_weight=True`, the label `l` is
    +used as the edge's weight.
    
    The is_weighted flag is not used much in Sage, and I believe that it should be removed in the long term:
    • You never know if it is about edge/vertex weights
    • Setting a graph to be "weighted" does not check anything about the vality of weights
    • You may want one function to consider the graph as weighted and not another one. Changing an attribute in between is not pratical, and is actually impossible for immutable graphs.
  • I do not think necessary to document which exceptions are raised when the input is bad, e.g. conversion to float. Do what you want.
  • You add many arguments to this function: they should be tested in the function's doctest.

Good evening,

Nathann

comment:7 in reply to: ↑ 5 Changed 7 years ago by ncohen

There is a bidirectional_dijkstra method in src/sage/graphs/base/c_graph.pyx. I don't know if it is used or not...

It is used in GenericGraph.shortest_path.

Nathann

comment:8 in reply to: ↑ 6 ; follow-up: Changed 7 years ago by borassi

Hi!

I tried to address your remarks, and I will upload the new version in a few moments. I have also started working on the other methods: still, the code is not clean, so I think you should only review the methods shortest_paths and check_weight_function (even if the whole code should be correct, because all tests are passed).

One question: since most of the arguments are the same for several routines, is there a way to avoid copy-paste in the documentation?

Thank you very much! Good evening,

Michele

What's the proposed behavior when by_weight==False?

Added to the documentation

There is a bidirectional_dijkstra method in src/sage/graphs/base/c_graph.pyx.

Yes, but there is no "standard" Dijkstra... Maybe, if I have time before the end of the project, I will add one.

  • ``by_weight`` - if ``True``, the graph is considered weighted. Could you be more specific by saying that the *edges* are considered to be weighted?

Done

  • Documentation of algorithm: could you document the default value and its

behaviour?

Done

  • This sounds like a dangerous behaviour:

``weight_function`` (function) - used only if ``by_weight==True``; If somebody calls g.shortest_paths(v,weight_function=my_function), we know for sure that (s)he wants the edges to be considered. Why shouldn't we define by_weight to True when weight_function is not None?

Done

  • Documentation of weight_function: here is an attempt at making the following

paragraph a bit shorter.

-a function that inputs an edge e and outputs its weight. An edge
-has the form (u,v,l), where u and v are vertices, l is
-a label (that can be of any kind). The weight_function can be used
-to transform the label into a weight. In particular:
+a function that takes a labelled edge `(u,v,l)` as input and returns
+its weight. If set to `None` when `by_weight=True`, the label `l` is
+used as the edge's weight.

I have modified heavily the paragraph, taking inspiration from your proposal.

The is_weighted flag is not used much in Sage, and I believe that it should be removed in the long term:

  • You never know if it is about edge/vertex weights
  • Setting a graph to be "weighted" does not check anything about the vality of

weights

  • You may want one function to consider the graph as weighted and not another

one. Changing an attribute in between is not pratical, and is actually impossible for immutable graphs.

I removed all references to is_weighted: now I just check that the weight function outputs numbers

  • I do not think necessary to document which exceptions are raised when the

input is bad, e.g. conversion to float. Do what you want.

Done!

  • You add many arguments to this function: they should be tested in the

function's doctest.

Added some doctests.

comment:9 Changed 7 years ago by git

  • Commit changed from a8d22178cff80ffe9da8d0b46c15ec30fe2d3a87 to dca691b5114e24d8a157a8b0bf7f92b2275a87e0

Branch pushed to git repo; I updated commit sha1. New commits:

dca691bGood quality shortest_paths, implemented other routines

comment:10 in reply to: ↑ 8 Changed 7 years ago by ncohen

Hello,

I think you should only review the methods shortest_paths and check_weight_function

The second function should be a hidden one, i.e. _check_weight_function. It is only a checking routine that we need internally. I am always wary of these routines by the way, especially when they apply to linear-time algorithms....

Anyawy. You should probably replace .edges() by .edge_iterator() in there, for otherwise the first line of the loop is actually as time/space consuming as a full graph copy (even worse actually, as it is pure python stuff).

One question: since most of the arguments are the same for several routines, is there a way to avoid copy-paste in the documentation?

No magical way that I know. It's either this or "see the doc of <X> for more inforamtion". Or you can have a **kwds instead of copy/pasting the arguments in the function's prototype and foward them all to the subcall, saying that "all arguments of function <X> (see its doc) are accepted". No magic, really.

  • If ``None`` and ``by_weight`` is ``True``, we output ``l`` by default. We output 'l'? What about 'we use the edge's label as a weight'?

Nathann

comment:11 follow-up: Changed 7 years ago by dcoudert

Hello,

In method check_weight_function

  • Checks that an edge weight function outputs only numbers. -> Check that an edge weight function outputs only numbers.
  • The title of the example A wrong weight function:: must be changed. The weight function is correct but not the edge label.
  • for e in self.edges(): -> for e in self.edge_iterator(): ?

In method shortest_paths

  • If the weight function is wrong:: same as above

Also I'm wondering if we may have conflicts with tickets #18868 about memory allocation and #18864 on eccentricity

David.

comment:12 in reply to: ↑ 11 Changed 7 years ago by ncohen

Also I'm wondering if we may have conflicts with tickets #18868 about memory allocation and #18864 on eccentricity

There was. I just fixed it.

Nathann

comment:13 Changed 7 years ago by dcoudert

I'm not sure I understand what you have done...

comment:14 Changed 7 years ago by ncohen

I made #18864 a dependency of #18868. And in #18868 I handled the conflict.

comment:15 Changed 7 years ago by git

  • Commit changed from dca691b5114e24d8a157a8b0bf7f92b2275a87e0 to d4ceebf070104da6401a5331242a698df1c727b2

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

80cb899trac #18864: correct version
ce9d265trac #18864: return Sage integers
a1b36datrac #18864: Merged with 6.8.rc1
2723175trac #18868: a MemoryAllocator object for easier Cython memory management
bf39672trac #18868: back to calloc
9597eectrac #18868: Changes to MemoryAllocator
f514301Optimize MemoryAllocator and add allocarray()
39f8839Fix exception handling
0304d9ftrac #18868: Merged with #18864
d4ceebfMerged with #18868.

comment:16 Changed 7 years ago by borassi

Helloooooooo!

I have applied all your suggestions, and I have merged this ticket with #18868 in order to avoid conflicts.

Soon I will modify the other methods, and finally ask for a review!

See you, Michele

comment:17 Changed 7 years ago by dcoudert

great !

comment:18 Changed 7 years ago by git

  • Commit changed from d4ceebf070104da6401a5331242a698df1c727b2 to 38c10d43143d985adbb57fcb514f3483ee30941d

Branch pushed to git repo; I updated commit sha1. New commits:

545b7bbTemporary
cd88904Temp
38c10d4Problem with documentation

comment:19 Changed 7 years ago by borassi

Hello!

I have a problem building the documentation in this branch, and I really have no idea what's going on (I lost the whole morning trying to fix it, with no success). Could you help me, at least by telling me what does the following log mean? The code is in this ticket...

Error building the documentation.
Traceback (most recent call last):
  File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 1626, in <module>
    getattr(get_builder(name), type)()
  File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 292, in _wrapper
    getattr(get_builder(document), 'inventory')(*args, **kwds)
  File "/home/michele/Programmi/sage/src/doc/common/builder.py", line 503, in _wrapper
    x.get(99999)
  File "/home/michele/Programmi/sage/local/lib/python/multiprocessing/pool.py", line 558, in get
    raise self._value
OSError: [graphs   ] /home/michele/Programmi/sage/local/lib/python2.7/site-packages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.graph_isom_equivalent_non_edge_labeled_graph:25: WARNING: Explicit markup ends without a blank line; unexpected unindent.

Makefile:742: recipe for target 'doc-html' failed
make[2]: *** [doc-html] Error 1
make[2]: Leaving directory '/home/michele/Programmi/sage/build/make'
Makefile:563: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory '/home/michele/Programmi/sage/build/make'

Thank you very much!

Michele

comment:20 Changed 7 years ago by dcoudert

After performing the following changes, I'm able to compile the doc.

  • In _path_length, change
           WARNING: if the graph is unweighted, the algorithm does not check that
           the path exists. Moreover, also if the weight_function does not return a
           number, an error is raised.
    

To

        .. WARNING::

            If the graph is unweighted, the algorithm does not check that the
            path exists. Moreover, also if the weight_function does not return a
            number, an error is raised.

The WARNING: should declare a block, so may be you should insert space/blank line (not sure).

  • In wiener_index, change
           .. [KRG96b] S. Klavzar, A. Rajapakse, and I. Gutman. The Szeged and the
           Wiener index of graphs. *Applied Mathematics Letters*, 9(5):45--49,
           1996.
    

to

       .. [KRG96b] S. Klavzar, A. Rajapakse, and I. Gutman. The Szeged and the
         Wiener index of graphs. *Applied Mathematics Letters*, 9(5):45--49,
         1996.

comment:21 Changed 7 years ago by git

  • Commit changed from 38c10d43143d985adbb57fcb514f3483ee30941d to 760ea8746d52a8d9fbfb22621ae7063ab93c2ceb

Branch pushed to git repo; I updated commit sha1. New commits:

3612655Temp
760ea87Refactored shortest paths (first working draft)

comment:22 Changed 7 years ago by borassi

  • Status changed from new to needs_review

Finally, I have a version of the refactoring that might be reviewed. Hope you like it!

comment:23 Changed 7 years ago by git

  • Commit changed from 760ea8746d52a8d9fbfb22621ae7063ab93c2ceb to 5c7856142001699dbe00ad659a0ea95b66bedbf8

Branch pushed to git repo; I updated commit sha1. New commits:

793dc12Merged with 6.9.beta0
5c78561Small correction in doctest

comment:24 Changed 7 years ago by dcoudert

  • Status changed from needs_review to needs_work

Hello,

The patch passes all long tests, but I have a problem when building the documentation (duplicate citation KRG96b). I don't know which is the best solution. You could eventually point to the other module documentation instead of repeating citation.

[graphs   ] /Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.GenericGraph.wiener_index:72: WARNING: duplicate citation KRG96b, other instance in /Users/dcoudert/sage/src/doc/en/reference/graphs/sage/graphs/distances_all_pairs.rst
Error building the documentation.
Traceback (most recent call last):
  File "/Users/dcoudert/sage/src/doc/common/builder.py", line 1626, in <module>
    getattr(get_builder(name), type)()
  File "/Users/dcoudert/sage/src/doc/common/builder.py", line 292, in _wrapper
    getattr(get_builder(document), 'inventory')(*args, **kwds)
  File "/Users/dcoudert/sage/src/doc/common/builder.py", line 503, in _wrapper
    x.get(99999)
  File "/Users/dcoudert/sage/local/lib/python/multiprocessing/pool.py", line 558, in get
    raise self._value
OSError: [graphs   ] /Users/dcoudert/sage/local/lib/python2.7/site-packages/sage/graphs/generic_graph.py:docstring of sage.graphs.generic_graph.GenericGraph.wiener_index:72: WARNING: duplicate citation KRG96b, other instance in /Users/dcoudert/sage/src/doc/en/reference/graphs/sage/graphs/distances_all_pairs.rst

make[1]: *** [doc-html] Error 1

David.

comment:25 Changed 7 years ago by git

  • Commit changed from 5c7856142001699dbe00ad659a0ea95b66bedbf8 to 3f371f844ecd2667b268d74fa258787f3c6da1c4

Branch pushed to git repo; I updated commit sha1. New commits:

3f371f8Removed duplicate reference

comment:26 Changed 7 years ago by borassi

  • Status changed from needs_work to needs_review

I removed the duplicate reference: now the citation refers to the other module.

comment:27 Changed 7 years ago by dcoudert

  • Milestone changed from sage-6.8 to sage-6.9
  • Reviewers set to David Coudert
  • Status changed from needs_review to positive_review

Hello,

now the doc build properly and it looks fine. For me this ticket is now good to go.

Best, David.

comment:28 Changed 7 years ago by vbraun

  • Branch changed from u/borassi/refactor_shortest_paths to 3f371f844ecd2667b268d74fa258787f3c6da1c4
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.