Opened 6 years ago

Closed 6 years ago

#18839 closed enhancement (fixed)

Boost Dominator Tree

Reported by: borassi Owned by:
Priority: major Milestone: sage-6.8
Component: graph theory Keywords: Dominator tree, Boost
Cc: ncohen, dcoudert Merged in:
Authors: Michele Borassi Reviewers: Nathann Cohen, David Coudert
Report Upstream: N/A Work issues:
Branch: b42de69 (Commits, GitHub, GitLab) Commit: b42de69ea7631fcc9110e0490205f0d4a09ec90d
Dependencies: #18811, #18564 Stopgaps:

Status badges

Description (last modified by borassi)

Use Boost to compute the dominator tree of a digraph.

This function is not available in Sagemath.

Change History (43)

comment:1 Changed 6 years ago by borassi

  • Authors set to Michele Borassi
  • Cc ncohen dcoudert added
  • Component changed from PLEASE CHANGE to graph theory
  • Dependencies set to 18811, 18564
  • Keywords Dominator tree Boost added
  • Type changed from PLEASE CHANGE to enhancement

comment:2 Changed 6 years ago by borassi

  • Description modified (diff)

comment:3 Changed 6 years ago by ncohen

  • Dependencies changed from 18811, 18564 to #18811, #18564

comment:4 Changed 6 years ago by borassi

  • Branch set to u/borassi/boost_dominator_tree

comment:5 Changed 6 years ago by borassi

  • Commit set to 4d1b5ac6214a68ed2d918443bdac80f039dc2841
  • Status changed from new to needs_review

This is the first draft of the Boost algorithm for computing dominator trees (see https://en.wikipedia.org/wiki/Dominator_%28graph_theory%29 for the definition of dominator tree).


New commits:

283c61b First draft
18bd01a Applied Nathann's suggestions
99bd994 Few corrections in the reference manual
40df844 trac #18811: Review
6c8d49b More reviewer comments
c365d02 Removed "vertices is None" from boost_clustering_coeff
582ce34 Changed for v in G.vertices() => for v in G
6eb8a83 Added a typedef for vertex index (before, it was int).
337726e First draft
4d1b5ac Removed trailing whitespaces
Last edited 6 years ago by borassi (previous) (diff)

comment:6 follow-up: Changed 6 years ago by dcoudert

Nicely done.

May be you could combine methods dominator_tree_dictionary and dominator_tree using an optional parameter (e.g., return_dict=False) ?

Actually I don't know which is the most useful: the tree or the dictionary.

Also, I suggest

- edges = [[v,dom_tree_dict[v]] for v in dom_tree_dict.keys() if dom_tree_dict[v] is not None]
+ edges = [[v,dom_v] for v,dom_v in dom_tree_dict.iteritems() if not dom_v is None]

comment:7 in reply to: ↑ 6 Changed 6 years ago by ncohen

May be you could combine methods dominator_tree_dictionary and dominator_tree using an optional parameter (e.g., return_dict=False) ?

+1

Also, you do not need to write two functions (with doc+test) when everything it does is call another function defined outside, in a module. You will find many such examples at the bottom of generic_graph.py.

Nathann

comment:8 Changed 6 years ago by git

  • Commit changed from 4d1b5ac6214a68ed2d918443bdac80f039dc2841 to a6f3f9accc844816a3fa720eb8abd154c4a3a805

Branch pushed to git repo; I updated commit sha1. New commits:

a6f3f9aReviewer's suggestions

comment:9 Changed 6 years ago by borassi

Done everything!

comment:10 follow-up: Changed 6 years ago by dcoudert

Install OK, docbuild OK, doc looks good.

However, I have questions on the expected behavior:

  • with 1-vertex graph:
    sage: G = graphs.PathGraph(1)
    sage: G.dominator_tree(0)
    {0: None}
    sage: G.dominator_tree(0, return_dict=False)
    Graph on 0 vertices
    
  • With disconnected graphs
    sage: G = 2 * graphs.PathGraph(1)
    sage: G.dominator_tree(0)
    {0: None, 1: None}
    sage: G = 2 * graphs.PathGraph(2)
    sage: G.dominator_tree(0)
    {0: None, 1: 0, 2: None, 3: None}
    

If this is effectively what we expect, then I will finalize the review.

comment:11 Changed 6 years ago by git

  • Commit changed from a6f3f9accc844816a3fa720eb8abd154c4a3a805 to 81e4ee11c8db417b6f5b5ec478da01bdcfd19ab7

Branch pushed to git repo; I updated commit sha1. New commits:

81e4ee1Added support if dominator tree is empty

comment:12 follow-up: Changed 6 years ago by ncohen

Helloooooo again,

Is there any reason why you kept two functions with the same name (dominator_tree)? If you moved the 'graph output' to the one defined in boost_graph.pyx, you would only have to import it in the GenericGraph object. No duplication of doc/doctest.

Nathann

comment:13 in reply to: ↑ 10 Changed 6 years ago by borassi

However, I have questions on the expected behavior:

  • with 1-vertex graph:
    sage: G = graphs.[wiki:PathGraph](1)
    sage: G.dominator_tree(0)
    {0: None}
    sage: G.dominator_tree(0, return_dict=False)
    Graph on 0 vertices
    

You are right, if the dominator tree is empty (that is, if from the root we cannot reach any other vertex) it is better to output a 1-vertex graph containing the root, not an empty graph. For the dictionary, I think the behavior is correct.

  • With disconnected graphs
    sage: G = 2 * graphs.[wiki:PathGraph](1)
    sage: G.dominator_tree(0)
    {0: None, 1: None}
    sage: G = 2 * graphs.[wiki:PathGraph](2)
    sage: G.dominator_tree(0)
    {0: None, 1: 0, 2: None, 3: None}
    

If this is effectively what we expect, then I will finalize the review.

Here, the behavior is correct, in my opinion. I have changed the doc to better explain this case, by adding an "OUTPUT" block.

comment:14 Changed 6 years ago by git

  • Commit changed from 81e4ee11c8db417b6f5b5ec478da01bdcfd19ab7 to 7b72043f2af4fc29ece5e6aad5996a9a9a49539a

Branch pushed to git repo; I updated commit sha1. New commits:

7b72043Improved doc (unreachable vertices)

comment:15 in reply to: ↑ 12 Changed 6 years ago by borassi

You mean that I should move all the code in boost_graph.pyx, and then import this function through something like

GenericGraph.dominator_tree = types.MethodType(sage.graphs.base.boost_graph.dominator_tree(GenericGraph, ???))

Could you tell me the exact command? The three examples I have found in generic_graph.py do not have arguments...

Replying to ncohen:

Helloooooo again,

Is there any reason why you kept two functions with the same name (dominator_tree)? If you moved the 'graph output' to the one defined in boost_graph.pyx, you would only have to import it in the GenericGraph object. No duplication of doc/doctest.

Nathann

comment:16 follow-up: Changed 6 years ago by dcoudert

A good example is GenericGraph.diameter = types.MethodType(sage.graphs.distances_all_pairs.diameter, None, GenericGraph). It has several arguments (e.g. method='iFUB').

comment:17 in reply to: ↑ 16 Changed 6 years ago by borassi

Hellooooooo!

I have done it, but there is a problem: in order to obtain this result, generic_graph should import boost_graph, which imports directed_graph, which imports generic_graph, and I have cyclic dependencies. I tried to put

import types
from sage.graphs.generic_graph import GenericGraph
GenericGraph.dominator_tree = types.MethodType(dominator_tree, None, GenericGraph)

inside boost_graph, but it is not imported when Sage starts, so I have to import boost_graph before running dominator_tree (and I don't like it). Any suggestion?

Replying to dcoudert:

A good example is GenericGraph.diameter = types.MethodType(sage.graphs.distances_all_pairs.diameter, None, GenericGraph). It has several arguments (e.g. method='iFUB').

comment:18 follow-up: Changed 6 years ago by dcoudert

  • Reviewers set to Nathann Cohen, David Coudert

This is weird since for instance the distances_all_pairs module imports Graph many times.

Could you try the following: in the boost_graph.pyx file, move the from sage.graphs.graph import Graph statement inside all the methods using it. Same for DiGraph etc.

David.

comment:19 Changed 6 years ago by git

  • Commit changed from 7b72043f2af4fc29ece5e6aad5996a9a9a49539a to ec55207fb866c9a92d51fc1270690bf8fdfe8055

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

344e41bBoost dominator tree
ec55207Removed trailing whitespaces

comment:20 in reply to: ↑ 18 Changed 6 years ago by borassi

Helloooooo!

Thank you very much: it worked! Now, I have rebased all my commits, so that the history of generic_graph is correct, and I have moved everything to boost_graph.

Replying to dcoudert:

This is weird since for instance the distances_all_pairs module imports Graph many times.

Could you try the following: in the boost_graph.pyx file, move the from sage.graphs.graph import Graph statement inside all the methods using it. Same for DiGraph etc.

David.

comment:21 Changed 6 years ago by dcoudert

Something goes wrong with your last commits, some conflicts. I don't know how to solve that (not git expert).

comment:22 Changed 6 years ago by borassi

Strange, in my computer it works correctly... Maybe you have to perform a forced pull, since I had to perform a forced push, or try to remove your local branch and re-download it... Nathann?

comment:23 Changed 6 years ago by ncohen

I also see conflicts. And the branch's name of thi ticket appears in red, which is a sign too.

Nathann

comment:24 Changed 6 years ago by ncohen

Michele: are you running beta7? If not, that's the reason.

Nathann

comment:25 Changed 6 years ago by git

  • Commit changed from ec55207fb866c9a92d51fc1270690bf8fdfe8055 to 20a8625d9085eea4a3851471d2df2aa59e76703b

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

20a8625Dominator tree through Boost

comment:26 Changed 6 years ago by borassi

Now it should work: I have downloaded beta7, and I have added the missing parts.

comment:27 follow-up: Changed 6 years ago by dcoudert

  • Status changed from needs_review to needs_info

Now it's working and I'm trying to understand the expected behavior of the method.

  • is it normal that the resulting tree is always a star ?
    sage: G = graphs.RandomBarabasiAlbert(1000,2)
    sage: H = G.dominator_tree(0, return_dict=False)
    sage: H.is_tree() and H.diameter()<=2
    True
    sage: H = G.dominator_tree(193, return_dict=False)
    sage: H.is_tree() and H.diameter()<=2
    True
    
  • I suggest to set parameter return_dict=False by default. Well I don't know which is the most useful
  • There is a labeling problem with the root.
    sage: G = graphs.Grid2dGraph(4,4)
    sage: G.dominator_tree((0,0), return_dict=False).vertices()
    [0,
     (0, 1),
     (0, 2),
     (0, 3),
     (1, 0),
     (1, 1),
     (1, 2),
     (1, 3),
     (2, 0),
     (2, 1),
     (2, 2),
     (2, 3),
     (3, 0),
     (3, 1),
     (3, 2),
     (3, 3)]
    sage: G.dominator_tree((1,1), return_dict=False).vertices()
    [5,
     (0, 0),
     (0, 1),
     (0, 2),
     (0, 3),
     (1, 0),
     (1, 2),
     (1, 3),
     (2, 0),
     (2, 1),
     (2, 2),
     (2, 3),
     (3, 0),
     (3, 1),
     (3, 2),
     (3, 3)]
    
  • Try also this. The output digraph is not connected and you have some labeling problems.
    sage: D = digraphs.DeBruijn(2,4)
    sage: D.dominator_tree('0000', return_dict=False).show()
    

David.

comment:28 Changed 6 years ago by git

  • Commit changed from 20a8625d9085eea4a3851471d2df2aa59e76703b to 512cfdf27fc1bd53f0bc3e06ba032fd23895ac01

Branch pushed to git repo; I updated commit sha1. New commits:

512cfdfReviewer's comments

comment:29 in reply to: ↑ 27 ; follow-up: Changed 6 years ago by borassi

  • Status changed from needs_info to needs_review

Replying to dcoudert:

Now it's working and I'm trying to understand the expected behavior of the method.

  • is it normal that the resulting tree is always a star ?
    sage: G = graphs.[wiki:RandomBarabasiAlbert](1000,2)
    sage: H = G.dominator_tree(0, return_dict=False)
    sage: H.is_tree() and H.diameter()<=2
    True
    sage: H = G.dominator_tree(193, return_dict=False)
    sage: H.is_tree() and H.diameter()<=2
    True
    

Yes, it is: if the input is undirected, the dominator tree is a star if and only the if the graph is biconnected. In a Barabasi-Albert graph, the minimum degree is 2, so I believe it is.

  • I suggest to set parameter return_dict=False by default. Well I don't know which is the most useful

Done!

  • There is a labeling problem with the root.
    sage: G = graphs.Grid2dGraph(4,4)
    sage: G.dominator_tree((0,0), return_dict=False).vertices()
    [0,
    (0, 1),
    (0, 2),
    (0, 3),
    (1, 0),
    (1, 1),
    (1, 2),
    (1, 3),
    (2, 0),
    (2, 1),
    (2, 2),
    (2, 3),
    (3, 0),
    (3, 1),
    (3, 2),
    (3, 3)]
    sage: G.dominator_tree((1,1), return_dict=False).vertices()
    [5,
    (0, 0),
    (0, 1),
    (0, 2),
    (0, 3),
    (1, 0),
    (1, 2),
    (1, 3),
    (2, 0),
    (2, 1),
    (2, 2),
    (2, 3),
    (3, 0),
    (3, 1),
    (3, 2),
    (3, 3)]
    
  • Try also this. The output digraph is not connected and you have some labeling problems.
    sage: D = digraphs.[wiki:DeBruijn](2,4)
    sage: D.dominator_tree('0000', return_dict=False).show()
    

Yes, you are right, I did not "translate" properly the Boost vertices into Sage labels. I have corrected that line and I have added a test to check that the translation is correct. The new line is:

edges = [[int_to_vertex[result[vertex_to_int[v]]], v] for v in g.vertices() if result[vertex_to_int[v]] != no_parent]

Thank you very much!

comment:30 in reply to: ↑ 29 ; follow-up: Changed 6 years ago by dcoudert

Yes, it is: if the input is undirected, the dominator tree is a star if and only the if the graph is biconnected. In a Barabasi-Albert graph, the minimum degree is 2, so I believe it is.

So for undirected graphs, we can first test if is it connected. If true, then we can directly construct and return the star (or the dict) without using boost, right? If False, then what's the output ?

What for directed graphs?

David.

comment:31 in reply to: ↑ 30 ; follow-up: Changed 6 years ago by borassi

Hellooooo!

Replying to dcoudert:

So for undirected graphs, we can first test if is it connected. If true, then we can directly construct and return the star (or the dict) without using boost, right? If False, then what's the output ?

Well, not connected but biconnected. In any case, probably the dominator tree in the undirected case can be found by computing biconnected components and cut vertices. There is a linear algorithm for biconnected components, but it is not very easy to implement, and the Boost algorithm is O(m log m), which is not much worse than O(m). Is the linear algorithm for BCC implemented in the hyperbolicity algorithm?

What for directed graphs?

Well, it is more complicated... I heard that finding linear algorithms for this problem is an open issue, and the Boost algorithm should be optimal. For more information, see the Boost help page http://www.boost.org/doc/libs/1_58_0/libs/graph/doc/lengauer_tarjan_dominator.htm.

Michele

comment:32 in reply to: ↑ 31 ; follow-up: Changed 6 years ago by dcoudert

Replying to borassi:

Hellooooo!

Replying to dcoudert:

So for undirected graphs, we can first test if is it connected. If true, then we can directly construct and return the star (or the dict) without using boost, right? If False, then what's the output ?

Well, not connected but biconnected.

OK, I see. Could you mention this in the description of the method? and perhaps add a simple example of a graph with 2 biconnected components. For Instance

sage: G = 2*graphs.CycleGraph(3)
sage: G.add_edge(0,3)
sage: G.dominator_tree(0, return_dict=True)
{0: None, 1: 0, 2: 0, 3: 0, 4: 3, 5: 3}

In any case, probably the dominator tree in the undirected case can be found by computing biconnected components and cut vertices. There is a linear algorithm for biconnected components, but it is not very easy to implement, and the Boost algorithm is O(m log m), which is not much worse than O(m). Is the linear algorithm for BCC implemented in the hyperbolicity algorithm?

For undirected graphs, we have B,C = G.blocks_and_cut_vertices(), where B is the list of blocks (list of vertices of each biconnected component) and C is the list of cut vertices. It's a Python implementation, but it's efficient. We use it for the hyperbolicity and many other algorithms.

However, now that I have a better understanding of the method, I think it is better to let it as it is. So improve the doc and it should be ok.

David.

comment:33 Changed 6 years ago by git

  • Commit changed from 512cfdf27fc1bd53f0bc3e06ba032fd23895ac01 to 23b1952b8b3cbca30aeafdcb4242aa5261cd4cb8

Branch pushed to git repo; I updated commit sha1. New commits:

23b1952Improved documentation

comment:34 in reply to: ↑ 32 Changed 6 years ago by borassi

However, now that I have a better understanding of the method, I think it is better to let it as it is. So improve the doc and it should be ok.

Done!

comment:35 Changed 6 years ago by dcoudert

  • Status changed from needs_review to positive_review

For me the patch is now good to go!

comment:36 follow-up: Changed 6 years ago by vbraun

  • Status changed from positive_review to needs_work

Fails on 32-bit

sage -t --long src/sage/graphs/base/boost_graph.pyx
**********************************************************************
File "src/sage/graphs/base/boost_graph.pyx", line 101, in sage.graphs.base.boost_graph.edge_connectivity
Failed example:
    edge_connectivity(g)
Expected:
    [4, [(0, 1), (0, 2), (0, 3), (0, 4)]]
Got:
    [4L, [(0L, 1L), (0L, 2L), (0L, 3L), (0L, 4L)]]
**********************************************************************
1 item had failures:

comment:37 Changed 6 years ago by git

  • Commit changed from 23b1952b8b3cbca30aeafdcb4242aa5261cd4cb8 to 010f213e4ba1b41b6655e89331a631b810eec82d

Branch pushed to git repo; I updated commit sha1. New commits:

010f213Changed default vertex type to int

comment:38 in reply to: ↑ 36 Changed 6 years ago by borassi

  • Status changed from needs_work to needs_review

I think the problem is that on 32bit the C++ unsigned int is converted to a long. I replaced unsigned int with int for vertices, and now it should work. How can I test it on 32bit?

comment:39 Changed 6 years ago by dcoudert

To test on 32bits, you need a 32bits cpu. May be one of the machines of the patchbot is 32bits and so you will get the answer soon. D.

comment:40 Changed 6 years ago by git

  • Commit changed from 010f213e4ba1b41b6655e89331a631b810eec82d to b42de69ea7631fcc9110e0490205f0d4a09ec90d

Branch pushed to git repo; I updated commit sha1. New commits:

b42de69Modified the Boost interface

comment:41 Changed 6 years ago by borassi

Helloooooo!

I have made a small modification in the Boost interface: now the Boost graph "knows" (through a vertex_property) the number associated to each vertex. This way, if the result is an vector, where in position i there is a property of vertex i, this vector is generated directly from Boost, without conversion.

However, after this modification, I think this patch needs a new review, at least for this part.

Thank you very much!

Michele

PS: I still have no idea on how to test 32bit Sagemath, and I hope modifying the vertex name from unsigned int to int worked! Anyone is able to test it?

Last edited 6 years ago by borassi (previous) (diff)

comment:42 Changed 6 years ago by dcoudert

  • Status changed from needs_review to positive_review

For me the patch is good to go. I tried the patch successfully using an old computer on which I can only use cplex 32bits and not cplex 64bits. So I assume it is 32bits... I cannot do better test.

comment:43 Changed 6 years ago by vbraun

  • Branch changed from u/borassi/boost_dominator_tree to b42de69ea7631fcc9110e0490205f0d4a09ec90d
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.