Opened 6 years ago
Closed 5 years ago
#18910 closed enhancement (fixed)
Boost minimum spanning tree
Reported by:  borassi  Owned by:  

Priority:  major  Milestone:  sage6.8 
Component:  graph theory  Keywords:  Boost, minimum spanning tree 
Cc:  ncohen, dcoudert, Stefan, Rudi  Merged in:  
Authors:  Michele Borassi  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  38a817d (Commits)  Commit:  38a817d387325a25c62df70c6e77d7c06169f152 
Dependencies:  #18876, #18906  Stopgaps: 
Description (last modified by )
Include Boost implementation of Kruskal and Prim algorithms for minimum spanning tree.
Change History (26)
comment:1 Changed 6 years ago by
 Cc ncohen dcoudert added
 Component changed from PLEASE CHANGE to graph theory
 Dependencies set to #18876, #18906
 Description modified (diff)
 Keywords Boost minimum spanning tree added
 Type changed from PLEASE CHANGE to enhancement
comment:2 Changed 5 years ago by
 Branch set to u/borassi/boost_minimum_spanning_tree
comment:3 Changed 5 years ago by
 Commit set to 8278b0bf8b3b4c86bdc5b543f5ca57c8b682f83a
comment:4 Changed 5 years ago by
 Status changed from new to needs_review
comment:5 Changed 5 years ago by
The weight function you use does not allow to access all possible stuff that might be stored on an edge. For instance, I may have an edge like (u, v, {'weight':3, 'toto':4})
in which case a function like weight_function=lambda e:e[2]['weight']
is possible. With current implementation, you assume that weight_function((u,v))
returns the weight. Better to have weight_function(e)
or weight_function((u,v,l))
.
Also, could you be a bit more specific with the description of the weight function
 ``weight_function``  A function that takes an edge and returns a numeric weight. If ``None`` (default), the algorithm uses the edge weights, if available, otherwise it assigns weight 1 to each edge (in the latter case, the output can be any spanning tree).
In particular, the default behavior assume that edge weights are numbers, that is if e=(u,v,l)
then l
is a number.
David.
comment:6 Changed 5 years ago by
 Commit changed from 8278b0bf8b3b4c86bdc5b543f5ca57c8b682f83a to ceda13f0b83784adce78b527099e08e7e003c6ac
Branch pushed to git repo; I updated commit sha1. New commits:
ceda13f  Improved explanation of weight_function

comment:7 Changed 5 years ago by
Hello!
you are absolutely right: I have made the modification you asked, I have improved the explanation of the weight function, and I have added an example in the generic_graph file.
Thank you very much!
Michele
comment:8 followup: ↓ 9 Changed 5 years ago by
Hello,
I know that dealing with weights is not easy. However, here we have different behaviors.
sage: g = Graph([(0,1,{'blop':1}), (1,2,{'blop':'b'})]) sage: g.min_spanning_tree(algorithm="Prim_Boost", weight_function=lambda e:e[2]['blop'])  TypeError Traceback (most recent call last) ... TypeError: a float is required sage: g = Graph([(0,1,1), (1,2,'b')]) sage: g.min_spanning_tree(algorithm="Prim_Boost", weight_function=lambda e:e[2])  TypeError Traceback (most recent call last) ... TypeError: a float is required sage: g.min_spanning_tree(algorithm="Prim_Boost") [(0, 1, 1), (1, 2, 'b')]
David.
comment:9 in reply to: ↑ 8 Changed 5 years ago by
Hello!
Yeah, I have some problems, too: it is difficult to decide which is the correct behavior. In your examples, I have trouble understanding which improvement you are suggesting: could you be a bit more specific?
In particular, in the first two cases, you provide a weight function which is not correct, so a TypeError? is raised (if you provide a function, the algorithm raises an error if it is not correct). In the third case, your graph is not weighted:
sage: g = Graph([(0,1,1), (1,2,'b')]) sage: g.weighted() False
Hence, the algorithm simply says "Since the graph is not weighted, I ignore the label and set weight 1 to each edge".
Maybe a problem occurs if the graph is weighted and no function is provided: in this case, the algorithm tries to use labels even if they are not numbers. In Boost algorithms, nonnumeric weights are assumed to be 1, while in other algorithms labels are simply compared, and Python decides which is the biggest.
sage: g = Graph([(0,1,1), (1,2,'b')], weighted=True) sage: g.min_spanning_tree(algorithm="Prim_Boost") [(0, 1, 1), (1, 2, 'b')] sage: g.min_spanning_tree(algorithm="Prim_fringe") [(0, 1, 1), (1, 2, 'b')] sage: g.min_spanning_tree(algorithm="Prim_edge") [(0, 1, 1), (1, 2, 'b')] sage: g.min_spanning_tree(algorithm="Kruskal") [(0, 1, 1), (1, 2, 'b')] sage: g.min_spanning_tree(algorithm="Kruskal_Boost") [(0, 1, 1), (1, 2, 'b')] sage: g.min_spanning_tree(algorithm="NetworkX") [(0, 1, 1), (1, 2, 'b')]
In the latter example, do you think I should only improve the explanation, which is in the INPUT section of the documentation and in the examples? Or do you think I should raise an exception if a weight is not a number?
Thank you very much!
Michele
comment:10 Changed 5 years ago by
I think it's a weird behavior to assume that an edge like (1, 2, 'b')
has weight 1. However, it was the original behavior of the method, so I propose to keep it unchanged.
So don't raise exception, but if you can improve further the doc, it should be ok.
Thanks.
comment:11 Changed 5 years ago by
 Commit changed from ceda13f0b83784adce78b527099e08e7e003c6ac to 4bbb2980aab4b0a81fc07cf62afa5395ee7719c1
Branch pushed to git repo; I updated commit sha1. New commits:
4bbb298  Corrected edge weights

comment:12 Changed 5 years ago by
Hello!
I have changed the behavior: now, if edge weights are not convertible to floats through float()
function, an error is raised. I have tried to clarify this behavior as much as possible in the documentation.
Thank you very much!
Michele
comment:13 followup: ↓ 14 Changed 5 years ago by
The documentation builds properly and is now good.
I tracked all calls to min_spanning_tree
to launch tests. All tests pass in graphs and homology.
However, one test is broken in src/sage/matroids/utilities.py
, method lift_cross_ratios
. I suspect that this is due to the change in the output ordering of edges.
I don't know what to do here. I have forced you to sort edges because all methods where not returning edges in the same order, and I think it is better now (the output is independent on the algorithm).
The set T constructed in this method is, before applying this patch:
set([((0, 0), (0, 1), (0, 0)), ((0, 0), (2, 1), (0, 2)), ((0, 1), (1, 0), (1, 0)), ((1, 0), (1, 1), (1, 1)), ((1, 1), (2, 0), (2, 1)), ((0, 0), (3, 1), (0, 3)), ((0, 0), (4, 1), (0, 4))])
and with this patch:
set([((0, 0), (0, 1), (0, 0)), ((0, 0), (2, 1), (0, 2)), ((0, 1), (1, 0), (1, 0)), ((1, 0), (1, 1), (1, 1)), ((2, 0), (2, 1), (2, 2)), ((0, 0), (3, 1), (0, 3)), ((0, 0), (4, 1), (0, 4))])
So only one entry differs: ((1, 1), (2, 0), (2, 1))
becomes ((2, 0), (2, 1), (2, 2))
.
I don't know if we can safely update the expected output of this test or not. Nathann?
comment:14 in reply to: ↑ 13 ; followup: ↓ 15 Changed 5 years ago by
 Cc Stefan Rudi added
Hello,
I don't know if we can safely update the expected output of this test or not. Nathann?
Not sure. Possible, but not sure. The best is to ask the matroid guys. Stefan, Rudi? Does it look okay to you if by changing the behaviour of a graph spanning tree routine, the output of
sage: Z = lift_cross_ratios(A, to_sixth_root_of_unity)
at line 416 of matroids/utilities.py changes like that?
Failed example: Z Expected: [ 1 0 1 1 1] [ 1 1 0 0 z] [ 0 1 z 1 0] Got: [ 1 0 1 1 1] [ 1 1 0 0 z] [ 0 z  1 1 z + 1 0]
It may be caused by a different spanning tree, or by a reordering of its edges.
Nathann
comment:15 in reply to: ↑ 14 ; followup: ↓ 16 Changed 5 years ago by
Replying to ncohen:
Stefan, Rudi? Does it look okay to you if by changing the behaviour of a graph spanning tree routine, the output of
sage: Z = lift_cross_ratios(A, to_sixth_root_of_unity)at line 416 of matroids/utilities.py changes like that?
Failed example: Z Expected: [ 1 0 1 1 1] [ 1 1 0 0 z] [ 0 1 z 1 0] Got: [ 1 0 1 1 1] [ 1 1 0 0 z] [ 0 z  1 1 z + 1 0]It may be caused by a different spanning tree, or by a reordering of its edges.
Yes, that looks OK. This output matrix Z is fixed up to scaling of rows and columns, and the tree is used to determine the scaling. Looks like in this example the bottom row is scaled by z1 (which happens to equal z^{1} in that ring, z is is a sixth root of unity).
So it's safe to just update the output of the test.
comment:16 in reply to: ↑ 15 Changed 5 years ago by
Yes, that looks OK.
Great. Thanks for your answer :)
Nathann
comment:17 Changed 5 years ago by
 Commit changed from 4bbb2980aab4b0a81fc07cf62afa5395ee7719c1 to fd9e2320a51353b9d90879e237869e4971613a5e
Branch pushed to git repo; I updated commit sha1. New commits:
fd9e232  Corrected matroids test

comment:18 Changed 5 years ago by
Done! Thank you very much, Rudi!
comment:19 Changed 5 years ago by
 Reviewers set to David Coudert
 Status changed from needs_review to positive_review
Good!
comment:20 Changed 5 years ago by
 Commit changed from fd9e2320a51353b9d90879e237869e4971613a5e to 66c6aea291d11529833931cc094abd7b35df794c
 Status changed from positive_review to needs_review
Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:
6c084ec  Merge branch 'develop' into t/18876/boost_cuthill_mckee__king_ordering

3847037  Merge branch 'develop' into t/18876/boost_cuthill_mckee__king_ordering

66c6aea  Merge branch 't/18876/boost_cuthill_mckee__king_ordering' into t/18910/boost_minimum_spanning_tree

comment:21 Changed 5 years ago by
I have merged this ticket with the latest version of #18876 to avoid conflicts.
comment:22 followup: ↓ 23 Changed 5 years ago by
When I click on the link to the code (to of the page, branch u/borassi/boost_minimum_spanning_tree
), for file bandwidth.pyx
I see the following line which has been changed in #18876.
+ :meth:`bandwidth_heuristics()<sage.graphs.base.boost_graph.bandwidth_heuristics>`  Uses Boost heuristics to approximate the bandwidth of the input graph
Is it normal??
comment:23 in reply to: ↑ 22 Changed 5 years ago by
Yes, it is normal, I think: I modified that line only after I merged #18876. However, I do not think it is a problem, because the new commit in #18876 should not create conflicts (I have tested the merge).
In any case, I remerged them with this new commit.
Replying to dcoudert:
When I click on the link to the code (to of the page, branch
u/borassi/boost_minimum_spanning_tree
), for filebandwidth.pyx
I see the following line which has been changed in #18876.+ :meth:`bandwidth_heuristics()<sage.graphs.base.boost_graph.bandwidth_heuristics>`  Uses Boost heuristics to approximate the bandwidth of the input graphIs it normal??
comment:24 Changed 5 years ago by
 Commit changed from 66c6aea291d11529833931cc094abd7b35df794c to 38a817d387325a25c62df70c6e77d7c06169f152
comment:25 Changed 5 years ago by
 Status changed from needs_review to positive_review
for me the patch is good to go !
comment:26 Changed 5 years ago by
 Branch changed from u/borassi/boost_minimum_spanning_tree to 38a817d387325a25c62df70c6e77d7c06169f152
 Resolution set to fixed
 Status changed from positive_review to closed
Hello!
This is the first version of Boost minimum spanning tree. With this code, I'm inserting in Sage the code to interface with Boost weighted graphs. The improvement on this specific algorithm is not striking (about 2x over the old algorithm), but I still set the Boost algorithm as default.
Some benchmark:
Hope you like it!
Michele