Opened 8 years ago
Closed 7 years ago
#13917 closed enhancement (fixed)
IndependentSets class
Reported by: | ncohen | Owned by: | jason, ncohen, rlm |
---|---|---|---|
Priority: | major | Milestone: | sage-6.2 |
Component: | graph theory | Keywords: | |
Cc: | vdelecroix | Merged in: | |
Authors: | Nathann Cohen | Reviewers: | Vincent Delecroix |
Report Upstream: | N/A | Work issues: | |
Branch: | 635c88d (Commits) | Commit: | 635c88da8c6df73b9fdf94d95c0c4fa2fc7397b7 |
Dependencies: | #14589 | Stopgaps: |
Description (last modified by )
I'm rather disappointed by this patch. Or I'm pleasantly surprised by Python, I don't know. I expected a much better speedup.
sage: g = graphs.RandomGNP(50,.7) sage: import networkx sage: gn = g.networkx_graph() sage: %timeit list(networkx.find_cliques(gn)) 10 loops, best of 3: 48.8 ms per loop sage: from sage.graphs.independent_sets import IndependentSets sage: %timeit list(IndependentSets(g,complement=True, maximal = True)) 10 loops, best of 3: 20.8 ms per loop sage: %timeit IndependentSets(g,complement=True, maximal = True).cardinality() 100 loops, best of 3: 19 ms per loop
Either way, this patch implements a sage.graphs.independent_set
module that can list/count (possibly maximal) independent sets. While right now we can only compute the maximal ones.
Nathann
Attachments (4)
Change History (37)
comment:1 Changed 8 years ago by
- Status changed from new to needs_review
comment:2 Changed 8 years ago by
comment:3 Changed 8 years ago by
- Reviewers set to ahmorales
Changed 8 years ago by
comment:4 Changed 8 years ago by
- Cc vdelecroix added
comment:5 Changed 8 years ago by
Hi,
People from statistical physics are interested in finer quantities than the number of independent configurations (independent sets are related to particule interactions). In their case each configuration is counted with a weight of the form $\prod_{v \in I} \lambda(v)$
where $\lambda: V \rightarrow \mathbb{R}_+$
is a function on the vertices. Considering the case of $\lambda$ being constant, the code could even return the polynomial $f(\lambda)$
whose coefficient for $\lambda^k$
is the number of independent set with $k$
vertices (or equivalently the list of number of independent set of given size). Perhaps this polynomial has a name in graph theory ?
I let the serious comments to the reviewer.
Vincent
comment:6 follow-up: ↓ 7 Changed 8 years ago by
- Description modified (diff)
- Status changed from needs_review to needs_work
- Work issues set to design, documentation
0) I found your patch useful!
0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by Python/Cython
.
Todo
1) I am not sure the following is what we want
sage: I = independent_sets.IndependentSets(graphs.PetersenGraph()) sage: i1 = iter(I); sage: i2 = iter(I) sage: i1.next() [0] sage: i2.next() [0, 2] sage: i1.next() [0, 2, 6] sage: i2.next() [0, 2, 8]
In other words an iterator is an iterator and nothing else. If you want to design the set of independent sets it should be an object different from the iterator. It would be better to have a Python class IndependentSets
desinged to modelize the set of independent sets (it should inherit from Parent
:P) and a Cython iterator IndependentSetIterator
. What do you think?
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
3) The flag count_only allows to count the number of independent set in only one call to __next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch the StopIteration
.
Possible improvements
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
Documentation
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph
comment:7 in reply to: ↑ 6 ; follow-up: ↓ 8 Changed 8 years ago by
Yoooooooooooo !!
0) I found your patch useful!
Cool !
0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by
Python/Cython
.
Yep O_o
1) I am not sure the following is what we want
Ok. So what you want is an empty class that does nothing, which calls another class that does all the job ? It cannot be just an independent iterator, for I also want to compute the number of cliques without having to compute all the lists of elements at Python level.
In other words an iterator is an iterator and nothing else.
Says the dogma.
If you want to design the set of independent sets it should be an object different from the iterator.
Says the dogma.
It would be better to have a Python class
IndependentSets
desinged to modelize the set of independent sets
And what on earth would it do that is not a feature of the iterator ?
(it should inherit from
Parent
:P)
Over my dead body.
and a Cython iterator
IndependentSetIterator
. What do you think?
I will not create an empty class if I can avoid it.
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.
3) The flag count_only allows to count the number of independent set in only one call to
__next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch theStopIteration
.
I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
And it already is :
n_sets = [0]*g.order() for s in IndependentSets(g): n_sets[len(s)] += 1
What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?
2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph
Right, right. I will add them to the next version of this patch.
Nathann
comment:8 in reply to: ↑ 7 Changed 8 years ago by
Replying to ncohen:
Hey!
(it should inherit from
Parent
:P)Over my dead body.
(-: you should love your parents :-)
and a Cython iterator
IndependentSetIterator
. What do you think?I will not create an empty class if I can avoid it.
And what about
sage: from sage.graphs.independent_sets import IndependentSets sage: I = IndependentSets(g,complement=True, maximal = True) sage: I.next() [0, 1, 2, 9, 19, 35, 40, 43, 47] sage: I.cardinality() 4457 sage: I.next() Traceback (most recent call last): ... StopIteration:
Your class is in between:
- an iterator
- an object that modelises the set of independent sets (the methods
cardinality
and__contains__
)
Doing both purposes at the same time implies many unsatisfactory behaviors. What I suggest is an iterator that iterates and a class that answers to __contains__
and cardinality
. But this is not satisfactory either because you still have to hack the iterator to get the cardinality.
It might be your choice to have a multiple purposes class. If so, specify it in the documentation as : "you can either use the class to iterate (though only once) or compute the cardinality, but do not think that you can do both in a clean way".
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.
I agree!
3) The flag count_only allows to count the number of independent set in only one call to
__next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch theStopIteration
.I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
And it already is :
n_sets = [0]*g.order() for s in IndependentSets(g): n_sets[len(s)] += 1What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...
No. What I meant is to consider a more general attribute .count
of type int[64]
which contains the number of independent set of each cardinality. But your solution is satisfactory and you can keep the implementation as it is. Could you add your clever solution to the documentation?
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?
My mistake.
comment:9 Changed 8 years ago by
(new patch that works with >64 vertices, based on top of 14589. And even slower than the first one >_<
)
Nathann
comment:10 Changed 8 years ago by
- Description modified (diff)
comment:11 Changed 8 years ago by
- Dependencies set to #14589
- Status changed from needs_work to needs_review
Newwwwwwww patch !!!!
- It now deals with graphs of arbitrary size (using the binary matrices from #14589)
- I spent quite some time complaining to myself (and to Florent) about creating an empty
IndependentSets
class which would call anIndependentSetsIterator
class. He also agreed with you Vincent, and even though I did not like this problem with double loops either, I still refused to write an empty class.
Because I hate empty classes.
Turns out that there is a nice way out, that I stupidly did not notice : instead of writing a
.next()
method, I moved the code to.__iter__()
, and replaced thereturn
byyield
. And many global variables are now local to.__iter__()
, and everything goes well under the sun:-P
sage: g = graphs.PetersenGraph() sage: from sage.graphs.independent_sets import IndependentSets sage: IS = IndependentSets(g) sage: IS2 = [(x,y) for x in IS for y in IS] sage: len(IS2) 5776 sage: IS.cardinality()**2 5776
- I added a comment about the difference in speed between this implementation and NetworkX
- Annnnnnd I added some links toward Wikipedia.
Tell me if you like it :-)
Nathann
Changed 8 years ago by
Changed 8 years ago by
comment:12 follow-up: ↓ 13 Changed 8 years ago by
- Status changed from needs_review to needs_work
- Work issues design, documentation deleted
You are a clever guy as you found a solution that satisfy both of us!
Some new comments
0) Nobody would access the documentation in __iter__
or __contains__
. I modified the documentation accordingly.
1) you should move bitset_are_disjoint
to sage.misc.bitset
as somebody else may want to use it.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
3)
sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!
4)
sage: IndependentSets(graphs.EmptyGraph()) REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 8 years ago by
- Status changed from needs_work to needs_info
Hellooooooooo !!
0) Nobody would access the documentation in
__iter__
or__contains__
. I modified the documentation accordingly.
Oh. Right.
1) you should move
bitset_are_disjoint
tosage.misc.bitset
as somebody else may want to use it.
Hmmmm... I did not move it there for the function does not take bitsets as input, but unsigned long *
pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
Hmmm.. -_-
Right.
Do you want me to add options to list independent sets with a special filter according to the number of vertices ?
3)
sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!4)
sage: IndependentSets(graphs.EmptyGraph()) REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!
Oh. I saw your ProgrammerError
in the patch, but there is no segfault on my side. When I run your commands, here is what I get :
sage: from sage.graphs.independent_sets import IndependentSets sage: from sage.graphs.independent_sets sage: from sage.graphs.independent_sets import IndependentSets sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I --------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-9-a8f48211b650> in <module>() ----> 1 [Integer(0),Integer(1),Integer(2),Integer(3)] in I /home/ncohen/.Sage/local/lib/python2.7/site-packages/sage/graphs/independent_sets.so in sage.graphs.independent_sets.IndependentSets.__contains__ (build/cythonized/sage/graphs/independent_sets.c:5633)() ValueError: An element of the set being tested does not belong to
Admittedly the error message has to be fixed, but no segfault. O_o
The empty graph produces a segfault allright. But of course the empty graph is not really a graph :-P
Tell me what you think, and I will write an additional patch. And thank you for your review !
Nathann
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 8 years ago by
Replying to ncohen:
1) you should move
bitset_are_disjoint
tosage.misc.bitset
as somebody else may want to use it.Hmmmm... I did not move it there for the function does not take bitsets as input, but
unsigned long *
pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.
I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
Hmmm..
-_-
Right.Do you want me to add options to list independent sets with a special filter according to the number of vertices ?
I see two solutions:
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
3) [...] BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!! [...] REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!! [...]
Oh. I saw your
ProgrammerError
in the patch, but there is no segfault on my side. [...] Admittedly the error message has to be fixed, but no segfault.O_o
The empty graph produces a segfault allright. But of course the empty graph is not really a graph
:-P
Your computer is better than mine ;-) What sage version are you running ? I will compile the rc1 and see what I get (I was on 5.10.beta4).
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 8 years ago by
Yo !
I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.
I really need a function that takes unsigned long *
as input, because this is what the binary matrices store. Inside of a bitset structure, there is also a .bits
field that is of type unsigned long *
, but if I want to convert my unsigned long *
to a bitset, then I need either to allocate a new array or to mess with a bitset data structure before calling the new version of the function that would take bitsets as input.
And it looks like a waste of time.
I really need a function that takes unsigned long *
as input.
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
I can do it. I don't mind.
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
I hope that this will not change the speed of the computations, though.
Your computer is better than mine ;-) What sage version are you running ?
Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.
I will compile the rc1 and see what I get (I was on 5.10.beta4).
Okayyyyyyy.
Still waiting for your answer before I write the additional patch.
Nathann
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 8 years ago by
Replying to ncohen:
A stupid comment I was doing myself:
The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
I can do it. I don't mind.
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
I hope that this will not change the speed of the computations, though.
If you have the time to try, please do. I can not imagine that you will see a difference with two test conditions. Otherwise, the first solution is ok.
Your computer is better than mine ;-) What sage version are you running ?
Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.
I will compile the rc1 and see what I get (I was on 5.10.beta4).
on rc1 it works fine...
comment:17 in reply to: ↑ 16 ; follow-up: ↓ 19 Changed 8 years ago by
- Status changed from needs_info to needs_review
Hellooooooooooo !!
The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...
Hmmmmm... I don't get it O_o
If you do nothing but talk of the exploration tree then I do not see how you can beat this implementation : it goes through this tree and only remembers the name of the current node... I don't see it getting much cheaper O_o
I can do it. I don't mind.
Hmmm... Turns out that when looking at the code again, it required to store a variable containing the current number of elements in the set, and add +=1 and -=1 in several places... Soooooo if you don't need it I would prefer not to implement this here. I added a message saying that the iterator trick is not computationally smart.
on rc1 it works fine...
I fixed the segfault on empty graphs, and they are now supported too by this class. This way nothing bad happens when it is called from Graph.cliques_maximal.
Patch updated ! ;-)
Nathann
Changed 8 years ago by
comment:18 Changed 8 years ago by
- Description modified (diff)
comment:19 in reply to: ↑ 17 Changed 8 years ago by
Replying to ncohen:
Hellooooooooooo !!
The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...
Hmmmmm... I don't get it
O_o
If you do nothing but talk of the exploration tree then I do not see how you can beat this implementation : it goes through this tree and only remembers the name of the current node... I don't see it getting much cheaper
O_o
I just read Cazals Karande 2008 and in the algorithm they expose: they deal with smaller and smaller graphs at each step and this is what makes the networkx implementation faster. It starts from the stupid remark: I have a candidate u to add to my ind. set, if I add it then I should remove its neighbors from the graph.
Basically, at each step, you store three sets: R (current indep. set), P (vertices away from R), X (vertices away from R but yet used) and you assume that P u X are all neighbors of R. Then for each vertex in P you add it to R, then update P and R, explore from there and then add the vertex to X. When X is empty, then you get an independent set, when P and X are empty you get a maximal independent set.
I can do it. I don't mind.
Hmmm... Turns out that when looking at the code again, it required to store a variable containing the current number of elements in the set, and add +=1 and -=1 in several places... Soooooo if you don't need it I would prefer not to implement this here. I added a message saying that the iterator trick is not computationally smart.
Ok.
Patch updated !
;-)
Ok
comment:20 Changed 7 years ago by
Ping ? :-P
comment:21 Changed 7 years ago by
- Milestone changed from sage-5.11 to sage-5.12
comment:22 Changed 7 years ago by
- Description modified (diff)
comment:23 Changed 7 years ago by
- Milestone changed from sage-6.1 to sage-6.2
comment:24 Changed 7 years ago by
- Branch set to u/ncohen/13917
- Commit set to 1be18a591f289740796242cfe07d7736801301d8
- Description modified (diff)
comment:25 Changed 7 years ago by
- Commit changed from 1be18a591f289740796242cfe07d7736801301d8 to 2e15c7d32b035a865de6ab45b30dc3c87cb19103
Branch pushed to git repo; I updated commit sha1. New commits:
2e15c7d | trac #13917: A couple of mistakes
|
comment:26 Changed 7 years ago by
- Reviewers changed from ahmorales to ahmorales, Vincent Delecroix
- I do not like line 168 of independent_sets.py and expecially
G.__class__(n)
It looks offuscated to me... I replaced it withGraph(n)
. I am not sure this is the same so please double check. - I changed the 0 to a 4 in line 114.
- Regarding comment:6 of me (Possible improvements), note that comment:2 of ahmorales advocates for having the iterator for independent sets of fixed size ! So I added it into a TODO section.
See u/vdelecroix/13917
and this is good to go (after your check of point 1 above).
comment:27 Changed 7 years ago by
Helloooooooooooooooooooooo !!!
I think that this 0 was meant to be a 10 or something. Anyway thank you for changing it.
About G.__class__
: no idea why it was written this way. The only point of this is to create a Graph
when G is a Graph
and a DiGraph
when G is a DiGraph
, but really, there, it served no purpose at all.
About the TODO section. HMmmm... I don't like this much, because to me it is not a "todo" at all. It is just ideas for improvement. It is not something that must be done, it is just ideas.... "Possible improvements" would fit better I think. Well. 'tsup to you :-)
I will upload a small commit in a second, because the second check_matching
which is defined in the code is not about matchings at all. Annnnd then you can decide what this patch becomes.
Have fuuuuuuuun and thank you again :-)
Nathann
comment:28 Changed 7 years ago by
- Commit changed from 2e15c7d32b035a865de6ab45b30dc3c87cb19103 to 5aeaccffe6193a42f7a06399198f043fb9ab8302
comment:29 Changed 7 years ago by
- Branch changed from u/ncohen/13917 to u/vdelecroix/13917
- Commit changed from 5aeaccffe6193a42f7a06399198f043fb9ab8302 to 09052a0520aa205aaa7af9d0e60c8e1c5574bd1b
- Reviewers changed from ahmorales, Vincent Delecroix to Vincent Delecroix
- Status changed from needs_review to positive_review
In my last commit, I removed the TODO and put the comment in the NOTE.
New commits:
09052a0 | remove the TODO and put the comment in NOTE
|
comment:30 Changed 7 years ago by
Wouhouuuuuu !! :-)
Nathann
comment:31 Changed 7 years ago by
Doctest failure
sage -t --long src/sage/graphs/graph.py ********************************************************************** File "src/sage/graphs/graph.py", line 4910, in sage.graphs.graph.Graph.cliques_maximal Failed example: C.cliques_maximal() Expected: [[0, 4], [4, 1, 2, 3]] Got: [[0, 4], [1, 2, 3, 4]] **********************************************************************
comment:32 Changed 7 years ago by
- Branch changed from u/vdelecroix/13917 to public/13917
- Commit changed from 09052a0520aa205aaa7af9d0e60c8e1c5574bd1b to 635c88da8c6df73b9fdf94d95c0c4fa2fc7397b7
New commits:
635c88d | trac #13917: Broken doctest
|
comment:33 Changed 7 years ago by
- Branch changed from public/13917 to 635c88da8c6df73b9fdf94d95c0c4fa2fc7397b7
- Resolution set to fixed
- Status changed from positive_review to closed
I think this patch is useful. I was looking for something to count independents sets of a given size since they index certain triangulations of polytopes coming from flows on graphs (#14654).