Opened 9 years ago
Closed 8 years ago
#16475 closed defect (fixed)
Bug in GomoryHu tree algorithm
Reported by:  foosterhof  Owned by:  

Priority:  major  Milestone:  sage6.6 
Component:  graph theory  Keywords:  gomory hu tree gomoryhu gomory_hu_tree 
Cc:  Rudi  Merged in:  
Authors:  Nathann Cohen  Reviewers:  Michele Borassi 
Report Upstream:  N/A  Work issues:  
Branch:  b0bbcd4 (Commits, GitHub, GitLab)  Commit:  b0bbcd4119209e2b0e158ac3f83a5d2068009993 
Dependencies:  #12797  Stopgaps: 
Description
When trying to come up with a doctest to verify ticket #16404, I stumbled across this error:
sage: G = graphs.PetersenGraph() sage: for u, v in [(0, 1), (0, 4), (0, 5), (1, 2), (1, 6), (3, 4), (5, 7), (5, 8)]: ....: G.set_edge_label(u, v, 2) ....: T = G.gomory_hu_tree(method="FF") ....: f = G.flow(1, 9, use_edge_labels = True) ....: f2 = T.edge_label(1, 9) ....: if f != f2: ....: print 'Proper Tree Edge Error:', f, f2 ....: Proper Tree Edge Error: 3 6
This is a violation of the property that defines a GomoryHu tree.
Furthermore, the derived property that the flow between two vertices in a graph is the minimum of the edge_labels in the constructed GomoryHu tree can be violated, for instance in the above graph:
sage: f = G.flow(5, 1, use_edge_labels = True) sage: P = T.shortest_path(5, 1) sage: E = zip(P, P[1::]) sage: f2 = min(map(lambda x: T.edge_label(x[0], x[1]), E)) sage: if f != f2: ....: print 'Tree Path Error:', f, f2, P ....: Tree Path Error: 6 3 [5, 9, 1]
Change History (22)
comment:1 Changed 9 years ago by
Branch:  → u/foosterhof/ticket/16475 

Created:  Jun 12, 2014, 8:40:17 AM → Jun 12, 2014, 8:40:17 AM 
Modified:  Jun 12, 2014, 8:40:17 AM → Jun 12, 2014, 8:40:17 AM 
comment:2 Changed 9 years ago by
Commit:  → 27025d1708e3bc7c0589b266bbf1f2f143e6b97a 

Status:  new → needs_review 
comment:3 Changed 9 years ago by
Commit:  27025d1708e3bc7c0589b266bbf1f2f143e6b97a → 9326d1518c32d30978e7b4ce9e39720f1bf8083d 

Branch pushed to git repo; I updated commit sha1. New commits:
9326d15  Added doctest

comment:4 Changed 9 years ago by
Dependencies:  → 12797 

This fix works correctly, as far as I tested, when ticket #12797 is applied as well.
Florian
comment:5 Changed 9 years ago by
Hello !
Just some cosmetic remarks :
 Set is *MUCH* slower than set or frozenset, so if you don't need Set for a specific reason don't use it:
sage: %timeit Set(range(50)) 10000 loops, best of 3: 29.1 µs per loop sage: %timeit frozenset(range(50)) 100000 loops, best of 3: 2.05 µs per loop
 It would be cool if you could add some comments about what the following block of code does. Just a line or 5 words, i.e. something to say with words what is happening there
+ C = {} ... + C[r] = C2[r]
Thanks for fixing that, btw :)
Nathann
comment:6 Changed 9 years ago by
Dependencies:  12797 → #12797 

comment:7 Changed 9 years ago by
Commit:  9326d1518c32d30978e7b4ce9e39720f1bf8083d → e4604fa9ba9cf8ef5475620ed6b3d156402015d6 

Branch pushed to git repo; I updated commit sha1. New commits:
e4604fa  Added comment on goal and use of Partition code

comment:8 followup: 9 Changed 9 years ago by
Hi
I am trying to convert to use your suggestion, using frozenset, but the call g1.vertices() (and thus also g2.vertices()) complains that it cannot compare a (frozen)set to an int, as it will sort them. I have been trying to give it a nice key lambda function to circumvent this problem, but no luck so far. Any suggestions? Or does this mean that I can better just leave it as using Set?
I have added a comment, including a reference to a piece of a book by A. Schrijver, which explains the use of the partitions. (A GomoryHu tree can be defined for a specific subset R of V, in which case these partitions play a big role)
New commits:
e4604fa  Added comment on goal and use of Partition code

comment:9 Changed 9 years ago by
Hello !
I am trying to convert to use your suggestion, using frozenset, but the call g1.vertices() (and thus also g2.vertices()) complains that it cannot compare a (frozen)set to an int, as it will sort them.
HMmmmmmmmm O_o
I do not understand what you mean. Could you give me an example of code that produces it ? It looks that you cannot use "&" on a frozenset, but it has an "intersection" method
sage: range(5) & frozenset(range(8)) ... TypeError: unsupported operand type(s) for &: 'list' and 'frozenset' sage: frozenset(range(8)).intersection(range(5)) frozenset({0, 1, 2, 3, 4})
I have added a comment, including a reference to a piece of a book by A. Schrijver, which explains the use of the partitions. (A GomoryHu tree can be defined for a specific subset R of V, in which case these partitions play a big role)
Thanks ! :)
Nathann
comment:10 followup: 11 Changed 9 years ago by
In Graph.vertices():
if not boundary_first: return sorted(list(self.vertex_iterator()), key=key)
Apparently, it is possible to compare Set and int (or sage.blabla.Integer) objects, but not (frozen)set and int.
sage: Set([252,35]) < 1 True sage: set([252,35]) < 1  TypeError Traceback (most recent call last) <ipythoninput16ea79432632b4> in <module>() > 1 set([Integer(2352),Integer(2352)]) < Integer(1) TypeError: can only compare to a set sage: frozenset([2352,2352]) < 1  TypeError Traceback (most recent call last) <ipythoninput1755b270cfd9be> in <module>() > 1 frozenset([Integer(2352),Integer(2352)]) < Integer(1) TypeError: can only compare to a set
And thus, when doing:
R1 = vertices & frozenset(g1.vertices())
it gives a TypeError?, as g1 contains a vertex that is a frozenset, namely g1_v:
g1_v = frozenset(set2) g2_v = frozenset(set1) g1.add_vertex(g1_v) g2.add_vertex(g2_v)
Florian
EDIT:
This can be fixed very easily, but also more efficiently. First of all, we can pass g1.vertex_iterator() to frozenset(), which will probably solve it all together.
Second of all, we know exactly which vertices g1 has, namely all vertices in set1, and the extra vertex g1_v. None more. Since g1_v is not in the variable vertices, we can just pass set1 to frozenset(), and analogously set2 for g2.
EDIT2:
Ok, so apparently Graph._backend.iterator_edges() also compares vertices before yielding them, to make sure the 'lesser' one is reported as the first vertex. And _ford_fulkerson uses this function through Graph.edge_iterator().
I guess there is no way around using a Set?
comment:11 Changed 9 years ago by
Oh. I see ! Sorry, I had not realized that the problem came from frozensets inside of the graph !
This can be fixed very easily, but also more efficiently. First of all, we can pass g1.vertex_iterator() to frozenset(), which will probably solve it all together.
Right, but that's going too far I guess.
Ok, so apparently Graph._backend.iterator_edges() also compares vertices before yielding them, to make sure the 'lesser' one is reported as the first vertex. And _ford_fulkerson uses this function through Graph.edge_iterator().
Yepyep. And g.vertices() sorts the vertices, which is not a good idea in itself but has some advantages, still ... :/
I guess there is no way around using a Set?
Ahahaah. As you said there are many ways around, but the best is indeed to use Set. I don't think that it is a performance issue in this context, I was just saying aloud that it's better to avoid Set in general. Let's keep the code as it is :)
Nathann
comment:12 Changed 8 years ago by
Milestone:  sage6.3 → sage6.4 

comment:13 Changed 8 years ago by
Milestone:  sage6.4 → sage6.6 

Status:  needs_review → needs_work 
branch does not apply
comment:14 Changed 8 years ago by
Authors:  → Nathann Cohen 

Branch:  u/foosterhof/ticket/16475 → public/16475 
Commit:  e4604fa9ba9cf8ef5475620ed6b3d156402015d6 → 573257955a6e18241076d9b7030327bf14dcd562 
Status:  needs_work → needs_review 
Hello,
I had totally forgotten this branch, and this is very bad (wrong results) (EDIT:see below). I looked at the code this weekend and fixed it (the fix actually takes one line ...), and also simplified the function a bit.
As the fix actually avoids adding more code to the function, I propose it instead, and change the branch (I hope that nobody minds). Needs review again.
Nathann
(EDIT 16/05/2015) This sentence is misleading: I meant that the *bug* is very bad, and that forgetting it is dangerous since Sage will return wrong results for as long as it is not fixed.
New commits:
df60f70  trac #16475: Move connectivity test to the main function

005e77c  trac #16475: A useless case

ebc6504  trac #16475: simplify handling of edge labels (capacities)

70ca162  trac #16475: slightly Simplify update of edge labels (capacities)

18c6493  trac #16475: Set>frozenset

f8b054e  trac #16475: actual bugfix

be53d23  trac #16475: Documentation and renamed variables

5732579  trac #16475: Bibliographical reference

comment:15 followup: 16 Changed 8 years ago by
Hello,
First of all, you say that my code was very bad, because it gave wrong results. Could you tell me why? I see no counterexample or anything to back up your claim.
Of course I dont mind any refactoring, and actually I think it is indeed much more readable, and even more efficient, but there is something bugging me:
In the bibliographical reference you give (which is actually what I used as well), the proof to show that a GomoryHu tree exists uses a Partition { C_r  r \in R }, which dictates how the two inductionobtained trees should be merged. This is what I tried to add to the code with my patch, but you have completely removed it. Why? You simply put the edge between the two trees to be {u, v} where u and v are the picked vertices. However, why would Schrijver put such rigor in proving that this tree obtained using the partition is correct, and not instead simply use 2 arbitrary vertices.
You say you have fixed the algorithm by a single line of code, however I still think that it might give incorrect results in some cases, for reasons above. Would you agree?
Florian
comment:16 followup: 17 Changed 8 years ago by
Helloooooooooooo,
First of all, you say that my code was very bad, because it gave wrong results. Could you tell me why? I see no counterexample or anything to back up your claim.
Oh no sorry, that's a misunderstanding! I just reread my comment and indeed I see how you could have understood it this way. I will edit it in a second. What I meant to say is "I had forgotten about this *bug*, and it is very bad (that I forgot it) because meanwhile Sage is returning wrong results".
I have found nothing wrong with your code. Actually, as I began reading the proof that you followed, I merely noticed that it could be done in a simpler way. Which is why I proposed this alternative implementation, hoping that you would not mind.
In the bibliographical reference you give (which is actually what I used as well), the proof to show that a GomoryHu tree exists uses a Partition { C_r  r \in R }, which dictates how the two inductionobtained trees should be merged. This is what I tried to add to the code with my patch, but you have completely removed it. Why? You simply put the edge between the two trees to be {u, v} where u and v are the picked vertices. However, why would Schrijver put such rigor in proving that this tree obtained using the partition is correct, and not instead simply use 2 arbitrary vertices.
What the current branch does is apply Lemma 15.14alpha (p249 in my edition of it). While I obviously cannot tell you why Schrijver wrote the proof the way he did, I suspect that it is because he wants to *prove* the algorithm while I only want to implement it.
Right now, the code applies Lemma 15.14alpha while *assuming* that every graph has a gomoryhu Tree. Schrijver cannot do this for the very obvious reasons that he wants to *prove* this existence result.
Assuming the existence of a gomoryhu tree, I apply Lemma 15.14alpha to two points to obtain a bipartition U,V
of the graph, then successively contract U and V to obtain two 'reduced' graphs. In each graph you will find 'two kinds' of vertices, i.e.:
 "real vertices from the original graphs" and
 "fake vertices that only appear because of the merge operation"
With Lemma 15.14alpha, what we know is that the max flow between any two "real points" u,v of a specific "new" graph is equal to their max flow in the original graph. Consequently, the three that is being built must indeed be the gomoryhu tree (that we know exists).
You say you have fixed the algorithm by a single line of code, however I still think that it might give incorrect results in some cases, for reasons above. Would you agree?
Do you believe that the way it is done is wrong? Or do you believe that it is suspect of being wrong because it is a simpler algorithm than the one given by Schrijver's proofs?
In my experience, writing the proof of an algorithm requires much more theoretical machinery than what is actually needed when one implements it. Sometimes, it is even easier to make the (theoretical) algorithm a bit slower in order to make the explanation simpler. Very recently, I had to give a formal proof of an algorithm needed for some research problem: while it was easy to explain on a table, it was so complicated to explain it on paper that I started trying to change the algorithm itself to make the proof easier.
Sorry for the misunderstanding, and thank you for your attention with respect to this issue.
Nathann
comment:17 followup: 19 Changed 8 years ago by
Hello!
I have analyzed the proofs in Schrijver: foosterhof is right, because you cannot take two arbitrary vertices, but the code is still correct, because the vertices taken are exactly the vertices u,v used in computing the initial uv cut (these vertices have no name in Schrijver). We could then modify the proof of Theorem 15.14 in Schrijver by using u as r' and v as r'' (r' and r'' are the extremal points of the "new" arc added). In the last sentence of the proof, Lemma 15.14alpha yields the same conclusion for arcs e \neq r'r'', while for the arc r'r'', the two components of the tree obtained by removing r'r'' correspond to a minimum r'r'' cut because we have chosen u=r' and v=r''.
I have no clue on why Schrijver did not use this proof, which looks much simpler (at least to me). If you have any doubts, I can write the whole proof more formally.
Other (small) issues:
 in the comments, I'd use "maximum flow" instead of "maximal flow": in my opinion, "maximal" means that there might be other elements that are not comparable (for instance, C is a maximal clique if there is no clique C' \supset C, while it is a maximum clique if it is a clique of maximum cardinality).
 reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.
 do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?
Michele
comment:18 Changed 8 years ago by
Commit:  573257955a6e18241076d9b7030327bf14dcd562 → b0bbcd4119209e2b0e158ac3f83a5d2068009993 

comment:19 followup: 21 Changed 8 years ago by
Hello !
 in the comments, I'd use "maximum flow" instead of "maximal flow"
I think that there is no confusion possible. I changed it anyway.
 reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.
It appears in the documentation of Graph.gomory_hu_tree
, right before the 'INPUT' block.
 do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?
As you please. Just remember that for as long as this ticket is not reviewed, Sage returns wrong answers. Also, Wikipedia contains a description of some algorithm to compute a GomoryHu tree (I did not read it). If it is the same then that's cool, and if ours is simpler then it is probably where the proof should be written.
http://en.wikipedia.org/wiki/Gomory%E2%80%93Hu_tree
Nathann
comment:20 Changed 8 years ago by
Reviewers:  → Michele Borassi 

Status:  needs_review → positive_review 
comment:21 Changed 8 years ago by
Hello!
 in the comments, I'd use "maximum flow" instead of "maximal flow"
I think that there is no confusion possible. I changed it anyway.
Ok :)
 reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.
It appears in the documentation of
Graph.gomory_hu_tree
, right before the 'INPUT' block.
Sorry, I didn't find it!
 do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?
As you please. Just remember that for as long as this ticket is not reviewed, Sage returns wrong answers. Also, Wikipedia contains a description of some algorithm to compute a GomoryHu tree (I did not read it). If it is the same then that's cool, and if ours is simpler then it is probably where the proof should be written.
I agree: this is not the place where the proof should be written. I think the code is fine as it is!
comment:22 Changed 8 years ago by
Branch:  public/16475 → b0bbcd4119209e2b0e158ac3f83a5d2068009993 

Resolution:  → fixed 
Status:  positive_review → closed 
The problem with the algorithm was that after the recursion, an edge was added. This edge was added to just any 2 vertices before, but there is are 2 specific vertices to which this edge should be added. These vertices are now calculated correctly (hopefully)
New commits:
Fixed GomoryHu tree algorithm by adding the partitions