Opened 8 years ago
Last modified 7 years ago
#13917 closed enhancement
IndependentSets class — at Version 18
Reported by: | ncohen | Owned by: | jason, ncohen, rlm |
---|---|---|---|
Priority: | major | Milestone: | sage-6.2 |
Component: | graph theory | Keywords: | |
Cc: | vdelecroix | Merged in: | |
Authors: | Nathann Cohen | Reviewers: | ahmorales |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #14589 | Stopgaps: |
Description (last modified by )
I'm rather disappointed by this patch. Or I'm pleasantly surprised by Python, I don't know. I expected a much better speedup.
sage: g = graphs.RandomGNP(50,.7) sage: import networkx sage: gn = g.networkx_graph() sage: %timeit list(networkx.find_cliques(gn)) 10 loops, best of 3: 48.8 ms per loop sage: from sage.graphs.independent_sets import IndependentSets sage: %timeit list(IndependentSets(g,complement=True, maximal = True)) 10 loops, best of 3: 20.8 ms per loop sage: %timeit IndependentSets(g,complement=True, maximal = True).cardinality() 100 loops, best of 3: 19 ms per loop
Either way, this patch implements a sage.graphs.independent_set
module that can list/count (possibly maximal) independent sets.
Anyway, don't hesitate to tell me that this patch is useless if you feel like it. I'm not *that* convinced either.
Nathann
Apply:
Change History (22)
comment:1 Changed 8 years ago by
- Status changed from new to needs_review
comment:2 Changed 8 years ago by
comment:3 Changed 8 years ago by
- Reviewers set to ahmorales
Changed 8 years ago by
comment:4 Changed 8 years ago by
- Cc vdelecroix added
comment:5 Changed 8 years ago by
Hi,
People from statistical physics are interested in finer quantities than the number of independent configurations (independent sets are related to particule interactions). In their case each configuration is counted with a weight of the form $\prod_{v \in I} \lambda(v)$
where $\lambda: V \rightarrow \mathbb{R}_+$
is a function on the vertices. Considering the case of $\lambda$ being constant, the code could even return the polynomial $f(\lambda)$
whose coefficient for $\lambda^k$
is the number of independent set with $k$
vertices (or equivalently the list of number of independent set of given size). Perhaps this polynomial has a name in graph theory ?
I let the serious comments to the reviewer.
Vincent
comment:6 follow-up: ↓ 7 Changed 8 years ago by
- Description modified (diff)
- Status changed from needs_review to needs_work
- Work issues set to design, documentation
0) I found your patch useful!
0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by Python/Cython
.
Todo
1) I am not sure the following is what we want
sage: I = independent_sets.IndependentSets(graphs.PetersenGraph()) sage: i1 = iter(I); sage: i2 = iter(I) sage: i1.next() [0] sage: i2.next() [0, 2] sage: i1.next() [0, 2, 6] sage: i2.next() [0, 2, 8]
In other words an iterator is an iterator and nothing else. If you want to design the set of independent sets it should be an object different from the iterator. It would be better to have a Python class IndependentSets
desinged to modelize the set of independent sets (it should inherit from Parent
:P) and a Cython iterator IndependentSetIterator
. What do you think?
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
3) The flag count_only allows to count the number of independent set in only one call to __next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch the StopIteration
.
Possible improvements
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
Documentation
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph
comment:7 in reply to: ↑ 6 ; follow-up: ↓ 8 Changed 8 years ago by
Yoooooooooooo !!
0) I found your patch useful!
Cool !
0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by
Python/Cython
.
Yep O_o
1) I am not sure the following is what we want
Ok. So what you want is an empty class that does nothing, which calls another class that does all the job ? It cannot be just an independent iterator, for I also want to compute the number of cliques without having to compute all the lists of elements at Python level.
In other words an iterator is an iterator and nothing else.
Says the dogma.
If you want to design the set of independent sets it should be an object different from the iterator.
Says the dogma.
It would be better to have a Python class
IndependentSets
desinged to modelize the set of independent sets
And what on earth would it do that is not a feature of the iterator ?
(it should inherit from
Parent
:P)
Over my dead body.
and a Cython iterator
IndependentSetIterator
. What do you think?
I will not create an empty class if I can avoid it.
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.
3) The flag count_only allows to count the number of independent set in only one call to
__next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch theStopIteration
.
I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
And it already is :
n_sets = [0]*g.order() for s in IndependentSets(g): n_sets[len(s)] += 1
What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?
2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph
Right, right. I will add them to the next version of this patch.
Nathann
comment:8 in reply to: ↑ 7 Changed 8 years ago by
Replying to ncohen:
Hey!
(it should inherit from
Parent
:P)Over my dead body.
(-: you should love your parents :-)
and a Cython iterator
IndependentSetIterator
. What do you think?I will not create an empty class if I can avoid it.
And what about
sage: from sage.graphs.independent_sets import IndependentSets sage: I = IndependentSets(g,complement=True, maximal = True) sage: I.next() [0, 1, 2, 9, 19, 35, 40, 43, 47] sage: I.cardinality() 4457 sage: I.next() Traceback (most recent call last): ... StopIteration:
Your class is in between:
- an iterator
- an object that modelises the set of independent sets (the methods
cardinality
and__contains__
)
Doing both purposes at the same time implies many unsatisfactory behaviors. What I suggest is an iterator that iterates and a class that answers to __contains__
and cardinality
. But this is not satisfactory either because you still have to hack the iterator to get the cardinality.
It might be your choice to have a multiple purposes class. If so, specify it in the documentation as : "you can either use the class to iterate (though only once) or compute the cardinality, but do not think that you can do both in a clean way".
2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?
Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.
I agree!
3) The flag count_only allows to count the number of independent set in only one call to
__next__
(if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch theStopIteration
.I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.
1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.
And it already is :
n_sets = [0]*g.order() for s in IndependentSets(g): n_sets[len(s)] += 1What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...
No. What I meant is to consider a more general attribute .count
of type int[64]
which contains the number of independent set of each cardinality. But your solution is satisfactory and you can keep the implementation as it is. Could you add your clever solution to the documentation?
1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.
The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?
My mistake.
comment:9 Changed 8 years ago by
(new patch that works with >64 vertices, based on top of 14589. And even slower than the first one >_<
)
Nathann
comment:10 Changed 8 years ago by
- Description modified (diff)
comment:11 Changed 8 years ago by
- Dependencies set to #14589
- Status changed from needs_work to needs_review
Newwwwwwww patch !!!!
- It now deals with graphs of arbitrary size (using the binary matrices from #14589)
- I spent quite some time complaining to myself (and to Florent) about creating an empty
IndependentSets
class which would call anIndependentSetsIterator
class. He also agreed with you Vincent, and even though I did not like this problem with double loops either, I still refused to write an empty class.
Because I hate empty classes.
Turns out that there is a nice way out, that I stupidly did not notice : instead of writing a
.next()
method, I moved the code to.__iter__()
, and replaced thereturn
byyield
. And many global variables are now local to.__iter__()
, and everything goes well under the sun:-P
sage: g = graphs.PetersenGraph() sage: from sage.graphs.independent_sets import IndependentSets sage: IS = IndependentSets(g) sage: IS2 = [(x,y) for x in IS for y in IS] sage: len(IS2) 5776 sage: IS.cardinality()**2 5776
- I added a comment about the difference in speed between this implementation and NetworkX
- Annnnnnd I added some links toward Wikipedia.
Tell me if you like it :-)
Nathann
Changed 8 years ago by
Changed 8 years ago by
comment:12 follow-up: ↓ 13 Changed 8 years ago by
- Status changed from needs_review to needs_work
- Work issues design, documentation deleted
You are a clever guy as you found a solution that satisfy both of us!
Some new comments
0) Nobody would access the documentation in __iter__
or __contains__
. I modified the documentation accordingly.
1) you should move bitset_are_disjoint
to sage.misc.bitset
as somebody else may want to use it.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
3)
sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!
4)
sage: IndependentSets(graphs.EmptyGraph()) REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!
comment:13 in reply to: ↑ 12 ; follow-up: ↓ 14 Changed 8 years ago by
- Status changed from needs_work to needs_info
Hellooooooooo !!
0) Nobody would access the documentation in
__iter__
or__contains__
. I modified the documentation accordingly.
Oh. Right.
1) you should move
bitset_are_disjoint
tosage.misc.bitset
as somebody else may want to use it.
Hmmmm... I did not move it there for the function does not take bitsets as input, but unsigned long *
pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
Hmmm.. -_-
Right.
Do you want me to add options to list independent sets with a special filter according to the number of vertices ?
3)
sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!4)
sage: IndependentSets(graphs.EmptyGraph()) REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!
Oh. I saw your ProgrammerError
in the patch, but there is no segfault on my side. When I run your commands, here is what I get :
sage: from sage.graphs.independent_sets import IndependentSets sage: from sage.graphs.independent_sets sage: from sage.graphs.independent_sets import IndependentSets sage: G = graphs.CubeGraph(5) sage: I = IndependentSets(G) sage: [0,1,2,3] in I --------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-9-a8f48211b650> in <module>() ----> 1 [Integer(0),Integer(1),Integer(2),Integer(3)] in I /home/ncohen/.Sage/local/lib/python2.7/site-packages/sage/graphs/independent_sets.so in sage.graphs.independent_sets.IndependentSets.__contains__ (build/cythonized/sage/graphs/independent_sets.c:5633)() ValueError: An element of the set being tested does not belong to
Admittedly the error message has to be fixed, but no segfault. O_o
The empty graph produces a segfault allright. But of course the empty graph is not really a graph :-P
Tell me what you think, and I will write an additional patch. And thank you for your review !
Nathann
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 8 years ago by
Replying to ncohen:
1) you should move
bitset_are_disjoint
tosage.misc.bitset
as somebody else may want to use it.Hmmmm... I did not move it there for the function does not take bitsets as input, but
unsigned long *
pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.
I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.
2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.
Hmmm..
-_-
Right.Do you want me to add options to list independent sets with a special filter according to the number of vertices ?
I see two solutions:
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
3) [...] BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!! [...] REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!! [...]
Oh. I saw your
ProgrammerError
in the patch, but there is no segfault on my side. [...] Admittedly the error message has to be fixed, but no segfault.O_o
The empty graph produces a segfault allright. But of course the empty graph is not really a graph
:-P
Your computer is better than mine ;-) What sage version are you running ? I will compile the rc1 and see what I get (I was on 5.10.beta4).
comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 8 years ago by
Yo !
I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.
I really need a function that takes unsigned long *
as input, because this is what the binary matrices store. Inside of a bitset structure, there is also a .bits
field that is of type unsigned long *
, but if I want to convert my unsigned long *
to a bitset, then I need either to allocate a new array or to mess with a bitset data structure before calling the new version of the function that would take bitsets as input.
And it looks like a waste of time.
I really need a function that takes unsigned long *
as input.
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
I can do it. I don't mind.
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
I hope that this will not change the speed of the computations, though.
Your computer is better than mine ;-) What sage version are you running ?
Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.
I will compile the rc1 and see what I get (I was on 5.10.beta4).
Okayyyyyyy.
Still waiting for your answer before I write the additional patch.
Nathann
comment:16 in reply to: ↑ 15 ; follow-up: ↓ 17 Changed 8 years ago by
Replying to ncohen:
A stupid comment I was doing myself:
The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...
- You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
I can do it. I don't mind.
- add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)
I hope that this will not change the speed of the computations, though.
If you have the time to try, please do. I can not imagine that you will see a difference with two test conditions. Otherwise, the first solution is ok.
Your computer is better than mine ;-) What sage version are you running ?
Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.
I will compile the rc1 and see what I get (I was on 5.10.beta4).
on rc1 it works fine...
comment:17 in reply to: ↑ 16 Changed 8 years ago by
- Status changed from needs_info to needs_review
Hellooooooooooo !!
The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...
Hmmmmm... I don't get it O_o
If you do nothing but talk of the exploration tree then I do not see how you can beat this implementation : it goes through this tree and only remembers the name of the current node... I don't see it getting much cheaper O_o
I can do it. I don't mind.
Hmmm... Turns out that when looking at the code again, it required to store a variable containing the current number of elements in the set, and add +=1 and -=1 in several places... Soooooo if you don't need it I would prefer not to implement this here. I added a message saying that the iterator trick is not computationally smart.
on rc1 it works fine...
I fixed the segfault on empty graphs, and they are now supported too by this class. This way nothing bad happens when it is called from Graph.cliques_maximal.
Patch updated ! ;-)
Nathann
Changed 8 years ago by
comment:18 Changed 8 years ago by
- Description modified (diff)
I think this patch is useful. I was looking for something to count independents sets of a given size since they index certain triangulations of polytopes coming from flows on graphs (#14654).