#23487 closed defect (fixed)
Linear time implementation of Modular Decomposition for Undirected Graphs
Reported by:  jlokesh  Owned by:  

Priority:  major  Milestone:  sage8.1 
Component:  graph theory  Keywords:  modular decomposition, gsoc2017 
Cc:  dimpase, dcoudert  Merged in:  
Authors:  Lokesh Jain  Reviewers:  David Coudert, Dima Pasechnik 
Report Upstream:  N/A  Work issues:  
Branch:  6688bac (Commits, GitHub, GitLab)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
This is aimed at providing linear time implementation for finding modular decomposition of undirected graphs, fixing the currently broken Sage's graph modular decomposition.
Change History (67)
comment:1 Changed 4 years ago by
 Description modified (diff)
 Keywords GSOC 2017 added
 Type changed from task to defect
comment:2 Changed 4 years ago by
 Keywords gsoc2017 added; GSOC 2017 removed
comment:3 Changed 4 years ago by
 Branch set to u/jlokesh/23487
 Commit set to 340580e66f166455ef36a7e2b5e713949397df1a
New commits:
340580e  trac #23487: added modular decomposition module + removed old one

comment:4 followup: ↓ 5 Changed 4 years ago by
 Reviewers set to David Coudert, Dmitrii Pasechnik
Dear Lokesh,
why have you removed file modular_decomposition.pyx
? You should put it back for the moment. There are dependencies and so I'm unable to compile sage with your patch.
Many improvements are needed in your code that we will slowly address. Dima is certainly more expert than me in this. For instance:
 Comments must be in 80 column mode. Some editors like emacs ease the formatting.
 Prefer
# Nested module
to#Nested module
to improve readability  You can also insert some empty lines in some code blocks to improve readability
Returns True if ...
>Return True if ...
 You will also have to insert some
TESTS:
andEXAMPLES:
blocks to illustrate the behavior of the methods, in particular formodular_decomposition
, and for testing cases that could lead to errors.
comment:5 in reply to: ↑ 4 Changed 4 years ago by
Replying to dcoudert:
Dear Lokesh,
why have you removed file
modular_decomposition.pyx
? You should put it back for the moment. There are dependencies and so I'm unable to compile sage with your patch.
this should be trivial to fix. Let's remove it now for good.
comment:6 Changed 4 years ago by
 Branch changed from u/jlokesh/23487 to public/gsoc17_t23487
 Commit changed from 340580e66f166455ef36a7e2b5e713949397df1a to dd540d6c3e906ee96072fba95f79dc499c8fc4d1
this is the updates needed for Sage to build and run. modular_decomposition doctests in graph.py fail due to some format incompatibility of output of the new implementation. I'll leave it to Lokesh to fix, as well as the David's comments, naturally.
New commits:
dd540d6  updates to allow sage build. doctests in graph.py still fail

comment:7 Changed 4 years ago by
Sorry for the late reply. I have started working on the updates. I will push the commit shortly.
comment:8 Changed 4 years ago by
 Commit changed from dd540d6c3e906ee96072fba95f79dc499c8fc4d1 to 83387cb05f0438f0cb572284dc6ae5e5f3189066
Branch pushed to git repo; I updated commit sha1. New commits:
83387cb  trac #23487: improved readability

comment:9 Changed 4 years ago by
I have added a commit for improved readability in the code. This commit addresses first four issues pointed out by David. I will soon add a commit for including examples, test cases and for handling failure of doctests.
comment:10 Changed 4 years ago by
 Commit changed from 83387cb05f0438f0cb572284dc6ae5e5f3189066 to 0a5a5045b7b2463995334fe02d0758b7cbef49c4
Branch pushed to git repo; I updated commit sha1. New commits:
0a5a504  trac #23487: added tests, examples + fixed doctests

comment:11 Changed 4 years ago by
I have added the commit for fixing the doctests and added the Tests and Examples for modular_decomposition function. Should I add the examples for other functions as well? Because most of the functions are called from modular_decomposition itself.
For the tests for modular_decomposition I have added the graph from Marc Tedder research paper and an example from the wikipedia modular decomposition page. Further I have added a series graph (Tetrahedral Graph).
comment:12 Changed 4 years ago by
 Status changed from new to needs_review
comment:13 followup: ↓ 14 Changed 4 years ago by
Is #13744 a duplicate of this?
comment:14 in reply to: ↑ 13 Changed 4 years ago by
Replying to jmantysalo:
Is #13744 a duplicate of this?
Yes, the latter can be closed as won't fix, with the fixing done here.
comment:15 Changed 4 years ago by
 Commit changed from 0a5a5045b7b2463995334fe02d0758b7cbef49c4 to 1e6c92fa444f34173ad98bb79412c3e84b22fba4
comment:16 Changed 4 years ago by
As this is to be included in 8.1, the branch should be based on the current beta  as I just did in 1e6c92f.
comment:17 Changed 4 years ago by
I will clean up the import statements from sage.*
. Specifically, there is a handy import_statements()
command in Sage:
sage: import_statements('Graph') from sage.graphs.graph import Graph
importing from sage.all
instead is much less efficient.
comment:18 Changed 4 years ago by
 Commit changed from 1e6c92fa444f34173ad98bb79412c3e84b22fba4 to 646507e244d9a4f16de213c3733160c5ebfdfb57
Branch pushed to git repo; I updated commit sha1. New commits:
646507e  fixed import and name/address in copyright

comment:19 followup: ↓ 21 Changed 4 years ago by
Is there a function to create the quotient graph with the vertices being the max. modules?
Please also make it clear which functions are a part of the main algorithm, and which ones are only used for testing.
Please also add more TEST
and EXAMPLE
; e.g. the functions to compute mu
certainly need tests.
In general, all the functions, except 1liners, will need tests and docs.
comment:20 followup: ↓ 22 Changed 4 years ago by
Some comments:
 in class
Queue
, methodget
could return aValueError
. Isn't it more appropriate to directly raise an error?  in
form_module
, variableflag
can be removed to usewhile True:
 in
either_connected_or_not_connected
, usegraph.has_edge(u, v)
instead ofv in graph.neighbors(u)
 method
number_subtree
could be an internal method ofrecursively_number_cocomponents
. Only used here andnumber_subtree
is the recursive part  in
assembly
:while not len(root[1]) == 1:
>while len(root[1]) != 1:
?  in
update_comp_num
, instead oftree
you could usechild
 in
get_vertex_in
: improve the description since the output block gives more details than the 1 line decription. Also, you don't need a recursive method here. You could rewrite it aswhile tree[0].node_type != NORMAL: tree = tree[1][0] return tree[1][0]
if left_index  1 >= 0:
>if left_index >= 1:
? in
is_component_connected
, you don't need to convert explicitlyneighbor
to set. Theisdisjoint
accepts alist
as input.  method
recurse_component
should be an internal method ofget_vertices
 in method
refine
:for e in graph.edges_incident(v):
>for u in graph.neighbor_iterator(v):
seems more appropriate. In fact, you can directly build the setx
asx = {u for u in graph.neighbor_iterator(v) if vertex_dist[u] != vertex_dist[v]}. You will then have to update the definition of
xin the inputs of
maximal_subtrees_with_leaves_in_x`  in
maximal_subtrees_with_leaves_in_x
, instead of usingroot[1][1][0]
androot[1][2][1]
, why not creating first nodea
, filling it, add it toroot[1]
, then creating nodeb
, filling it and adding it toroot[1]
?  Add one line description to methods
create_prime_node
, etc.  the test methods (
test_modular_decomposition
, etc.) should be move at the end of the file after a separator like#=====================...
The writing style in sagemath is to start methods with a 1 line description of the functionality, then a blank line, then a longer description of the functionality if needed. So
def recursively_number_cocomponents(tree, cocomp_num, by_type): """ Recursively number the nodes in the (co)components. If the tree node_type is same as by_type then cocomp_num is incremented before assigning to the subtree else entire tree is numbered by cocomp_num. INPUTS: """
It's a long patch to review...
comment:21 in reply to: ↑ 19 Changed 4 years ago by
Replying to dimpase:
Is there a function to create the quotient graph with the vertices being the max. modules?
Currently there is no function which creates a quotient graph with the vertices being the maximal modules. I can create such a function though?
Please also make it clear which functions are a part of the main algorithm, and which ones are only used for testing.
David has asked me to move the test functions at the end after a separator, it will solve this problem.
Please also add more
TEST
andEXAMPLE
; e.g. the functions to computemu
certainly need tests. In general, all the functions, except 1liners, will need tests and docs.
I will definitely add more TESTS and EXAMPLES. It might take some time as each function requires some arguments mostly a tree or a module. I will need to input that manually.
Thanks for fixing the import and copyright!
comment:22 in reply to: ↑ 20 Changed 4 years ago by
Replying to dcoudert:
Thanks for reviewing the patch! I will make the changes and add a commit soon.
comment:23 Changed 4 years ago by
 Commit changed from 646507e244d9a4f16de213c3733160c5ebfdfb57 to 433ee5eebd5bc4af33ac8ded53bf665c4072b510
Branch pushed to git repo; I updated commit sha1. New commits:
433ee5e  trac #23487: improved code + documentation

comment:24 followup: ↓ 25 Changed 4 years ago by
It is much better now. Some more comments.
 on the left hand side, you don't need to write
[a,b] = [1,2]
. You can directly writea,b = [1,2]
 You can permute variables without intermediate variable as
a,b = b,a
 in method
__str__
, you don't need variablestring
. You can directly return the result.
 in the EXAMPLES block of
modular_decomposition
, you must put::
at the end of the sub blocks instead of:
. This is for the html documentation.
 Some of the examples in the TESTS block of
modular_decomposition
could be in the EXAMPLES block. Also, can you add sub blocks with indication on the origin of the tested graphs ?
 Then, can you have tests for bad inputs in
modular_decomposition
, like directed graph, etc. i.e., the raise error stuff
 I have an identation issue with the
"""
before the code
 in
compute_mu_for_co_component
: You don't need the intermediate variablemu_for_co_component
.
 in
compute_mu_for_component
: why not usingrange(source_index1, 1, 1)
and returning the first found value ?
 method
recursive_print_md_tree
could be an internal method ofprint_md_tree
. Also, instead of usinglevel
, you could directly pass a string as parameter,recursive_print_md_tree(tree, level + " ")
. Last, useprint("{}{}".format(level,str(root[0])))
for Python 3 compatibility.
comment:25 in reply to: ↑ 24 ; followup: ↓ 27 Changed 4 years ago by
Replying to dcoudert:
 I have an identation issue with the
"""
before the code
I could not understand which """
you are talking about? Is it the one in the beginning? Is the indentation issue in the doc build?
 in
compute_mu_for_component
: why not usingrange(source_index1, 1, 1)
and returning the first found value ?
if mu_for_component == root[1][index] and \
Suppose [c01, c02, c03, source, c1, c2, c3] is the forest. If mu value for c2 is c03 then c01 and c02 must be connected to c2. Therefore the iteration from 0 is required.
I have incorporated the other changes and will add a commit for it.
comment:26 Changed 4 years ago by
 Commit changed from 433ee5eebd5bc4af33ac8ded53bf665c4072b510 to 72d87e83095dae35528a2f47c28ba74593bcf83d
Branch pushed to git repo; I updated commit sha1. New commits:
72d87e8  improvements in the code and documentation

comment:27 in reply to: ↑ 25 ; followup: ↓ 29 Changed 4 years ago by
Replying to jlokesh:
Replying to dcoudert:
 I have an identation issue with the
"""
before the codeI could not understand which
"""
you are talking about? Is it the one in the beginning? Is the indentation issue in the doc build?
The one before if graph._directed
.
Note that it is better to use if graph.is_directed()
;)
comment:28 Changed 4 years ago by
When you add a link to a wikipedia page, the nice way to do it is: :wikipedia:`Modular_decomposition`
.
comment:29 in reply to: ↑ 27 Changed 4 years ago by
Replying to dcoudert:
The one before
if graph._directed
. Note that it is better to useif graph.is_directed()
;)
Found it. :) I will include these changes in the next commit along with TESTS and EXAMPLES for other functions.
comment:30 Changed 4 years ago by
 Commit changed from 72d87e83095dae35528a2f47c28ba74593bcf83d to 6f7af1e34eefdf6c38d319d7685682a422ebd726
Branch pushed to git repo; I updated commit sha1. New commits:
6f7af1e  trac #23487: added TESTS and EXAMPLES

comment:31 Changed 4 years ago by
Out of curiosity, have you considered using collections.dequeue
instead of making your own (inefficient) Queue
class?
comment:32 Changed 4 years ago by
I was not aware of this data structure. It's certainly what we need here.
comment:33 Changed 4 years ago by
 Commit changed from 6f7af1e34eefdf6c38d319d7685682a422ebd726 to ab14394a4e3a83c73936093e572d3bd1f8e9bef3
comment:34 Changed 4 years ago by
 Commit changed from ab14394a4e3a83c73936093e572d3bd1f8e9bef3 to caedc653b738b22972566f006a4a2ddf1c72c100
Branch pushed to git repo; I updated commit sha1. New commits:
caedc65  trac #23487: Used colloection.deque instead of Queue

comment:35 Changed 4 years ago by
 Commit changed from caedc653b738b22972566f006a4a2ddf1c72c100 to 12f3f1588cc637ce870458c1393563f4458bb043
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
12f3f15  trac #23487: Used collections.deque instead of Queue

comment:36 followup: ↓ 37 Changed 4 years ago by
@Dima: in the output of modular_decomposition
we have PRIME
, PARALLEL
, SERIES
, etc. wouldn't be better to have strings as before ?
@Lokesh: how difficult would it be to build the graph of the modular decomposition? I mean a graph in which for instance would keep a single vertex instead of twins.
comment:37 in reply to: ↑ 36 ; followup: ↓ 39 Changed 4 years ago by
Replying to dcoudert:
@Dima: in the output of
modular_decomposition
we havePRIME
,PARALLEL
,SERIES
, etc. wouldn't be better to have strings as before ?
It's a good idea to document what these PRIME, etc., stand for from Python point of view. Are they some kind of constants?
@Lokesh: how difficult would it be to build the graph of the modular decomposition? I mean a graph in which for instance would keep a single vertex instead of twins.
IMHO it's a question of constructing the induced subgraph, with vertices taken to be a transversal of the corresponding modules. (Thus it's easy, but should be provided, perhaps on a followup ticket).
comment:38 followup: ↓ 40 Changed 4 years ago by
Another remark about the implementation. I find it strange that you create a NodeInfo
class but don't go all the way creating a structure for your modular decomposition.
I would just add a member NodeInfo.children
beeing a list of NodeInfo
s and rename this NodeInfo
into something catchier.
This would e.g. change
for tree in root[1]: if tree[0].node_type == NORMAL and tree[1][0] in vertex_status ...
into the more expressive
for node in root.children: if node.node_type == NORMAL and node.children[0] in vertex_status ...
comment:39 in reply to: ↑ 37 Changed 4 years ago by
Replying to dimpase:
It's a good idea to document what these PRIME, etc., stand for from Python point of view. Are they some kind of constants?
Yes, these are constants. I have documented them in the NodeInfo
class under the data attribute node_type
where they are mostly used. I can even convert them into an enum which can have proper documentation?
comment:40 in reply to: ↑ 38 ; followup: ↓ 41 Changed 4 years ago by
Replying to ylchapuy:
I agree with you. It makes much more sense like that. If Dima and David agree I can work on that as well? It might take a lot of time as it requires a lot of changes.
comment:41 in reply to: ↑ 40 Changed 4 years ago by
comment:42 followup: ↓ 43 Changed 4 years ago by
Can you check the result for a graph with a single vertex.
sage: print G.modular_decomposition() () sage: G = Graph(2) sage: print G.modular_decomposition() (PARALLEL, [0, 1]) sage: G = Graph(1) sage: print G.modular_decomposition() 0
Also, unify "Series" / "SERIES", etc. in the documentation of the modular decomposition method in graph.py
.
comment:43 in reply to: ↑ 42 Changed 4 years ago by
Replying to dcoudert:
Can you check the result for a graph with a single vertex.
For a single vertex it returns the vertex itself.
Also, unify "Series" / "SERIES", etc. in the documentation of the modular decomposition method in
graph.py
.
I will handle it in the next commit.
comment:44 Changed 4 years ago by
 Commit changed from 12f3f1588cc637ce870458c1393563f4458bb043 to 7ebb0ab394f4089dc7152213948cc722230728a2
Branch pushed to git repo; I updated commit sha1. New commits:
7ebb0ab  trac #23487: Improved code structure

comment:45 Changed 4 years ago by
I have changed the NodeInfo? class to Node class in this commit and introduced a children field in this class. The constants like PARALLEL, LEFT_SPLIT... defined in the file have been added in an enumeration class. Further as pointed out by David the singleton should be treated as PRIME module. Therefore I have made the appropriate changes in the modular_decomposition and is_prime function of graph.py.
There are changes required in the Algorithm and References section of the modular_decomposition function of the graph.py. I will make those along with the revisions.
comment:46 Changed 4 years ago by
That's much better now.
One question: how difficult is it to build a graph from the modular decomposition ? In such graph, twins are merged into a single node with as label the Set
of merged nodes (see e.g., the result of treewidth).
There is no need to do it in this ticket. It could be postpone to a future ticket.
comment:47 Changed 3 years ago by
I am not able to find any resource on building graphs from modular decomposition. I am assuming we are trying to build a graph given a modular decomposition. Please correct me if I am wrong. If so I have a doubt about it being possible. If we consider this example:
g = Graph(5) g.add_edge(0,1) g.add_edge(1,2) g.add_edge(2,3) g.add_edge(4,3) g.add_edge(1,3) g.modular_decomposition()
(PRIME, [1, 0, 2, 3, 4])
and if we remove the edge (1,3) from it and then find the modular decomposition. Then also the result is same.
comment:48 Changed 3 years ago by
You are right, a proper definition is needed and I don't know any. So nothing to do.
comment:49 Changed 3 years ago by
I thin you can define a graph on the immediate descendants of the module of all the vertices; these are the proper maximal strong modules of the graph. As as example from Wikipedia, see
here the original graph is at top left corner, and the graph I am talking about is shown on the right from the root note [1,...,11] of the modular decomposition tree at the bottom left part.
comment:50 followup: ↓ 51 Changed 3 years ago by
Parallel module is easy to build into a graph. Series and Prime module are formed if the induced subgraph is connected but they can not tell how this graph is connected or what edges are there and what are not. As illustrated by example in comment 47.
comment:51 in reply to: ↑ 50 Changed 3 years ago by
Replying to jlokesh:
Parallel module is easy to build into a graph. Series and Prime module are formed if the induced subgraph is connected but they can not tell how this graph is connected or what edges are there and what are not. As illustrated by example in comment 47.
It is irrelevant what happens inside each module, for the vertices of each module used to construct the new graph will be merged into one. What's important is how modules are connected among each other. In the example in comment 47 the graph in question will have just one vertex (or perhaps it should be empty graph, but that's a very minor issue).
comment:52 followup: ↓ 54 Changed 3 years ago by
to clarify: I am talking about the quotient graph I started talking in comment 19, and you mentioned in comment 21.
comment:53 followup: ↓ 55 Changed 3 years ago by
In graph.py
:
if self.order() == 0 or self.order() == 1:
>if self.order() <= 1:
In modular_decomposition.py
:
 Why are you defining the string representation of class
NodeType
? I think it is not used andself.name
is not assigned here. And why not for classNodeSplit
.
 I the test
Graph from wikipedia link :wikipedia:`Modular_decomposition`::`
, you must changesage: g = Graph(d)
tosage: g = Graph(d2)
.
comment:54 in reply to: ↑ 52 Changed 3 years ago by
Replying to dimpase:
to clarify: I am talking about the quotient graph I started talking in comment 19, and you mentioned in comment 21.
That is not a problem then. Only thing to discuss would be the structure of quotient graph. Should the children have a pointer to quotient graphs they represent(the maximal modules). In such a case the structure would be same as the one on the top right portion of the image you shared.
comment:55 in reply to: ↑ 53 Changed 3 years ago by
Replying to dcoudert:
 Why are you defining the string representation of class
NodeType
? I think it is not used andself.name
is not assigned here. And why not for classNodeSplit
.
It is used in graph.py for printing the module type.
relabel = lambda x : (x.node_type, [relabel(_) for _ in x.children]) if x.node_type != NodeType.NORMAL else id_label[x.children[0]]
Here x.node_type is an object of NodeType? class. For NodeType?.PARALLEL it would be PARALLEL.
comment:56 Changed 3 years ago by
 Commit changed from 7ebb0ab394f4089dc7152213948cc722230728a2 to eeed64cf53f51604997673c0a6ce4bf79b85e0c0
Branch pushed to git repo; I updated commit sha1. New commits:
eeed64c  documentation changes + small fixes

comment:57 Changed 3 years ago by
I have made some changes to the documentation of modular_decomposition function in graph.py. Please review it.
comment:58 Changed 3 years ago by
 Commit changed from eeed64cf53f51604997673c0a6ce4bf79b85e0c0 to c9af7d6af605bbbc6418bc2f96a7e62f709d1849
comment:59 followup: ↓ 62 Changed 3 years ago by
You have removed the reference HabPau10
. It should stay, as it is a recent
survey paper on the topic, in fact, newer than TedCorHabPaul08
.
Also, note that this reference is still used in the text, and as a result docs don't build any more.
Perhaps it's time that you should check docs for building, before pushing in changes to them :)
Anyhow, I'm putting the reference back.
New commits:
32ff8db  Merge branch 'public/gsoc17_t23487' of trac.sagemath.org:sage into modde

c9af7d6  putting HabPau10 back in; docs build now

New commits:
32ff8db  Merge branch 'public/gsoc17_t23487' of trac.sagemath.org:sage into modde

c9af7d6  putting HabPau10 back in; docs build now

comment:60 Changed 3 years ago by
 Commit changed from c9af7d6af605bbbc6418bc2f96a7e62f709d1849 to fd26fff0011877cef68efbaa2db711375fa8b673
Branch pushed to git repo; I updated commit sha1. New commits:
fd26fff  correct docstrings syntax

comment:61 Changed 3 years ago by
as you can see from patchbot's output, about 20 functions miss doctests.
comment:62 in reply to: ↑ 59 Changed 3 years ago by
Replying to dimpase:
Perhaps it's time that you should check docs for building, before pushing in changes to them :)
Thanks for fixing it. Will check it next time :)
comment:63 Changed 3 years ago by
So, how about adding missing tests? It's basically just adding temporary print() statements into functions with missing doctests, recoding the results of printing for an example run, and converting these into doctests.
If you are too busy now, I can do this, I don't want to sit on this ticket for too long...
comment:64 Changed 3 years ago by
 Commit changed from fd26fff0011877cef68efbaa2db711375fa8b673 to 6688baca23cc0a066810180dd10acd8b5539dd8e
Branch pushed to git repo; I updated commit sha1. New commits:
6688bac  trac #23487: added examples

comment:65 Changed 3 years ago by
 Status changed from needs_review to positive_review
I missed that update  only comments are emailed to me, no commit notifications.
Looks good to me.
comment:66 Changed 3 years ago by
 Branch changed from public/gsoc17_t23487 to 6688baca23cc0a066810180dd10acd8b5539dd8e
 Resolution set to fixed
 Status changed from positive_review to closed
comment:67 Changed 3 years ago by
 Commit 6688baca23cc0a066810180dd10acd8b5539dd8e deleted
 Reviewers changed from David Coudert, Dmitrii Pasechnik to David Coudert, Dima Pasechnik
It will use the implementation from https://github.com/lokeshj1703/UndirectedModularDecomposition