Opened 8 years ago

Last modified 6 years ago

#13380 needs_info enhancement

Suurballe-Tarjan algorithm for pair of disjoint st-paths

Reported by: dcoudert Owned by: jason, ncohen, rlm
Priority: major Milestone: sage-6.4
Component: graph theory Keywords: graph, disjoint paths
Cc: ncohen Merged in:
Authors: David Coudert Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

This patch implements the Suurballe-Tarjan algorithm for computing a pair of edge or vertex disjoint st-paths of minimum total cost in a (Di)Graph. The cost is either the number of edges, or the sum of the weights of the edges of the paths.

Attachments (3)

trac_13380_suurballe.patch (18.5 KB) - added by dcoudert 8 years ago.
trac_13380-review.patch (13.2 KB) - added by ncohen 8 years ago.
trac_13380-rev2.patch (8.0 KB) - added by dcoudert 8 years ago.

Download all attachments as: .zip

Change History (14)

comment:1 Changed 8 years ago by dcoudert

  • Keywords graph disjoint paths added
  • Status changed from new to needs_review

Changed 8 years ago by dcoudert

comment:2 Changed 8 years ago by dcoudert

After discussions with Nathann, the suurballe tarjan function is now hidden. It is now called by the edge_disjoint_paths, vertex_disjoint_paths, and disjoint_routed_paths when appropriate.

Changed 8 years ago by ncohen

comment:3 follow-up: Changed 8 years ago by ncohen

  • Status changed from needs_review to needs_info

Helloooooo David !!!

  • > This method is designed for simple (Di)Graph. If the input (Di)Graph has multiple edges, this method uses only the edge of minimum weight. This is not a good solution, for maybe the two paths giving the shortest length need both edges. If this cannot be fixed, it is safer to refuse graphs with multiple edges.
  • A big block of the code is dedicated to the many ways input may be formatted. I think that it's really going too far, especially when it's only done in this function and no other. Perhaps we should try to define a general way to deal with this in the graphs methods, but honestly when input is just "a numerical value per edge", what is wrong with asking the user to put those values on the edges ? If the user cannot convert strings to integers before giving his graph to a flow function, he does not deserve to solve a flow problem :-P
  • You added a call to _suurballe_tarjan in many functions, among other disjoint_routed_paths. It breaks things : in particular, [self.subgraph(p1), self.subgraph(p2)] if p1 and p2 else [] return the two graphs induced by the vertices of the irst path, and the vertices of the second paths. There is no reason why these graphs should be induced paths, and the user would not know what to make of the output unless each graph is a path. Actually, the first graph is a shortest path, so this one at least would be induced.... Unless the graph is weighted. Anyway, the proper behaviour is to call vertex_disjoint_paths (which will call your algorithm later).
show(graphs.PetersenGraph().disjoint_routed_paths([(0,1)]*2))
  • You added in edge_disjoint_paths and vertex_disjoint_paths a parameter k indicating "the maximum number of paths to return". Considering how it is implemented (Sage actually solves the max flow problem without taking k into account), I don't really see the point as it's more or less the same amount of work that Sage has to do whether k is defined or not. If it made sense somehow, what about setting k to be *the (exact) number of paths to return* ? I can think of many cases when I would need to get k paths between two vertices, but much fewer where I would need to have "anything between 0 and k". And I would prefer to have them all if there is no difference in the running time anyway.
  • By the way, it is nice to advertise your algorithm but let's write something clean : you test whether you can solve the problem with your new input parameter k=2, and if it does not work just run the former algorithm. Well, if it does not work it means that the graphs is not 2-connected, so you can immediately return a shortest path or say that the graph is disconnected. ou alrady obtained an information that can be put to good use. But then it depends on what k is in the end : the "maximum" number of disjoint paths, or the exact number.
  • There is a problem with vertex_disjoint_path which I just noticed, but it existed before your patch. Actually, your patch slightly improved it, for it was an infinite loop which became "an incoherent result"
sage: graphs.PetersenGraph().vertex_disjoint_paths(0,1)
[[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]

The attached patch fixes some of the above, and other details too..

Nathann

comment:4 Changed 8 years ago by ncohen

Arg... I copied/paste an old version of one paragraph :

  • By the way, it is nice to advertise your algorithm but let's write something clean : you test whether you can solve the problem with your new input parameter k=2, and if it does not work just run the former algorithm. Well, if it does not work it means that the graphs is not 2-connected, so you can immediately return a shortest path or say that the graph is disconnected. You alrady obtained an information that can be put to good use, it's a waste not to do it. But then it depends on what k is in the end : the "maximum" number of disjoint paths, or the exact number.

comment:5 in reply to: ↑ 3 ; follow-up: Changed 8 years ago by dcoudert

  • Reviewers set to Nathann Cohen

Thanks Nathann for the thorough review.

Replying to ncohen:

Helloooooo David !!!

  • > This method is designed for simple (Di)Graph. If the input (Di)Graph has multiple edges, this method uses only the edge of minimum weight. This is not a good solution, for maybe the two paths giving the shortest length need both edges. If this cannot be fixed, it is safer to refuse graphs with multiple edges.

Well, for vertex-disjoint paths the current implementation is valid. So we have to fix the edge-disjoint case. I'm able to fix it for the suurballe_tarjan method, but the behavior of the edge_disjoint_paths method is different (multiple edges not taken into account I suppose).

  • A big block of the code is dedicated to the many ways input may be formatted. I think that it's really going too far, especially when it's only done in this function and no other. Perhaps we should try to define a general way to deal with this in the graphs methods, but honestly when input is just "a numerical value per edge", what is wrong with asking the user to put those values on the edges ? If the user cannot convert strings to integers before giving his graph to a flow function, he does not deserve to solve a flow problem :-P

Well, it depends on the situation. Sometimes you have several labels per edge stored in a dictionary, or you have only one numerical value. We need at least 2 ways to store edge data. Now I agree that I was a bit extreme in the options offered to the user (but strings were useful for Kautz :-P ).

I can certainly reduce, but which are the options to keep ?

  • You added a call to _suurballe_tarjan in many functions, among other disjoint_routed_paths. It breaks things : in particular, [self.subgraph(p1), self.subgraph(p2)] if p1 and p2 else [] return the two graphs induced by the vertices of the irst path, and the vertices of the second paths. There is no reason why these graphs should be induced paths, and the user would not know what to make of the output unless each graph is a path. Actually, the first graph is a shortest path, so this one at least would be induced.... Unless the graph is weighted. Anyway, the proper behaviour is to call vertex_disjoint_paths (which will call your algorithm later).
show(graphs.PetersenGraph().disjoint_routed_paths([(0,1)]*2))

That's right and much better now.

  • You added in edge_disjoint_paths and vertex_disjoint_paths a parameter k indicating "the maximum number of paths to return". Considering how it is implemented (Sage actually solves the max flow problem without taking k into account), I don't really see the point as it's more or less the same amount of work that Sage has to do whether k is defined or not. If it made sense somehow, what about setting k to be *the (exact) number of paths to return* ? I can think of many cases when I would need to get k paths between two vertices, but much fewer where I would need to have "anything between 0 and k". And I would prefer to have them all if there is no difference in the running time anyway.

In some ILP with edge-path formulation, you can restrict the number of considered paths to k for speedup purpose. If we cannot find k paths but k-1, we don't necessarily want to raise an error.

But if you think it make more sense to say exactly k paths, then I will change the functions.

  • (copy from 2nd message) By the way, it is nice to advertise your algorithm but let's write something clean : you test whether you can solve the problem with your new input parameter k=2, and if it does not work just run the former algorithm. Well, if it does not work it means that the graphs is not 2-connected, so you can immediately return a shortest path or say that the graph is disconnected. You alrady obtained an information that can be put to good use, it's a waste not to do it. But then it depends on what k is in the end : the "maximum" number of disjoint paths, or the exact number.

I changed the behavior of the suurballe_tarjan function so that it returns:

  • [], [], +Infty if s and t are in different connected components
  • [a path], [], +Infty if the graph is not 2-connected
  • [a path], [another path], some weight if we can find both paths

Then, in the edge and vertex_disjoint_paths methods, I return directly 0, 1 or 2 paths when k==2, depending on the result.

  • There is a problem with vertex_disjoint_path which I just noticed, but it existed before your patch. Actually, your patch slightly improved it, for it was an infinite loop which became "an incoherent result"
sage: graphs.PetersenGraph().vertex_disjoint_paths(0,1)
[[0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]

The attached patch fixes some of the above, and other details too..

OK.

I let the patch as need_info as long as we have not fixed all points.

Changed 8 years ago by dcoudert

comment:6 in reply to: ↑ 5 ; follow-up: Changed 8 years ago by ncohen

Hello !!!

Well, for vertex-disjoint paths the current implementation is valid. So we have to fix the edge-disjoint case. I'm able to fix it for the suurballe_tarjan method, but the behavior of the edge_disjoint_paths method is different (multiple edges not taken into account I suppose).

Indeed.

sage: g.vertices()
[0, 1]
sage: g.edges()
[(0, 1, None), (0, 1, None)]
sage: g.edge_disjoint_paths(0,1)
[[0, 1]]

Honestly I would sleep much better if those functions return raised an exception on graphs with multiple edges. Dealing with those is a waste of computations on the usual cases, and it will add scores of stupid "if" in the code. I hate this -_-

Well, it depends on the situation. Sometimes you have several labels per edge stored in a dictionary, or you have only one numerical value. We need at least 2 ways to store edge data. Now I agree that I was a bit extreme in the options offered to the user (but strings were useful for Kautz :-P ).

I can certainly reduce, but which are the options to keep ?

Well.. All other graphs methods just accept as input a edge whose label is a weight.. Why would we need anything else ?

Nathann

comment:7 in reply to: ↑ 6 Changed 8 years ago by dcoudert

Honestly I would sleep much better if those functions return raised an exception on graphs with multiple edges. Dealing with those is a waste of computations on the usual cases, and it will add scores of stupid "if" in the code. I hate this -_-

So should I raise an exception in all cases or only for edge-disjoint paths?

Well, it depends on the situation. Sometimes you have several labels per edge stored in a dictionary, or you have only one numerical value. We need at least 2 ways to store edge data. Now I agree that I was a bit extreme in the options offered to the user (but strings were useful for Kautz :-P ).

I can certainly reduce, but which are the options to keep ?

Well.. All other graphs methods just accept as input a edge whose label is a weight.. Why would we need anything else ?

Networkx stores edge weights in a dictionary (default key: 'weight'). I have also examples of graphs where it is convenient to use a dictionary to store weights, labels, colors, etc. So I can let dictionary and numbers and remove other options.

comment:8 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:9 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:10 Changed 6 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:11 Changed 6 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4
Note: See TracTickets for help on using tickets.