Opened 8 years ago

Last modified 7 years ago

#13917 closed enhancement

IndependentSets class — at Version 18

Reported by: ncohen Owned by: jason, ncohen, rlm
Priority: major Milestone: sage-6.2
Component: graph theory Keywords:
Cc: vdelecroix Merged in:
Authors: Nathann Cohen Reviewers: ahmorales
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #14589 Stopgaps:

Status badges

Description (last modified by ncohen)

I'm rather disappointed by this patch. Or I'm pleasantly surprised by Python, I don't know. I expected a much better speedup.

sage: g = graphs.RandomGNP(50,.7)
sage: import networkx
sage: gn = g.networkx_graph()
sage: %timeit list(networkx.find_cliques(gn))
10 loops, best of 3: 48.8 ms per loop
sage: from sage.graphs.independent_sets import IndependentSets
sage: %timeit list(IndependentSets(g,complement=True, maximal = True))
10 loops, best of 3: 20.8 ms per loop
sage: %timeit IndependentSets(g,complement=True, maximal = True).cardinality()
100 loops, best of 3: 19 ms per loop

Either way, this patch implements a sage.graphs.independent_set module that can list/count (possibly maximal) independent sets.

Anyway, don't hesitate to tell me that this patch is useless if you feel like it. I'm not *that* convinced either.

Nathann

Apply:

Change History (22)

comment:1 Changed 8 years ago by ncohen

  • Status changed from new to needs_review

comment:2 Changed 8 years ago by ahmorales

I think this patch is useful. I was looking for something to count independents sets of a given size since they index certain triangulations of polytopes coming from flows on graphs (#14654).

comment:3 Changed 8 years ago by ahmorales

  • Reviewers set to ahmorales

Changed 8 years ago by ncohen

comment:4 Changed 8 years ago by vdelecroix

  • Cc vdelecroix added

comment:5 Changed 8 years ago by vdelecroix

Hi,

People from statistical physics are interested in finer quantities than the number of independent configurations (independent sets are related to particule interactions). In their case each configuration is counted with a weight of the form $\prod_{v \in I} \lambda(v)$ where $\lambda: V \rightarrow \mathbb{R}_+$ is a function on the vertices. Considering the case of $\lambda$ being constant, the code could even return the polynomial $f(\lambda)$ whose coefficient for $\lambda^k$ is the number of independent set with $k$ vertices (or equivalently the list of number of independent set of given size). Perhaps this polynomial has a name in graph theory ?

I let the serious comments to the reviewer.

Vincent

comment:6 follow-up: Changed 8 years ago by vdelecroix

  • Description modified (diff)
  • Status changed from needs_review to needs_work
  • Work issues set to design, documentation

0) I found your patch useful!

0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by Python/Cython.

Todo

1) I am not sure the following is what we want

sage: I = independent_sets.IndependentSets(graphs.PetersenGraph())
sage: i1 = iter(I); sage: i2 = iter(I)
sage: i1.next()
[0]
sage: i2.next()
[0, 2]
sage: i1.next()
[0, 2, 6]
sage: i2.next()
[0, 2, 8]

In other words an iterator is an iterator and nothing else. If you want to design the set of independent sets it should be an object different from the iterator. It would be better to have a Python class IndependentSets desinged to modelize the set of independent sets (it should inherit from Parent :P) and a Cython iterator IndependentSetIterator. What do you think?

2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?

3) The flag count_only allows to count the number of independent set in only one call to __next__ (if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch the StopIteration.

Possible improvements

1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.

Documentation

1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.

2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph

comment:7 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by ncohen

Yoooooooooooo !!

0) I found your patch useful!

Cool !

0') I thought that in your timing most of the time was lost in the creation of the list and the for loop... but no: it is not much faster to enumerate all independent set rather than counting them with the method cardinality (I update the example in the description). So, I am as well surprised by Python/Cython.

Yep O_o

1) I am not sure the following is what we want

Ok. So what you want is an empty class that does nothing, which calls another class that does all the job ? It cannot be just an independent iterator, for I also want to compute the number of cliques without having to compute all the lists of elements at Python level.

In other words an iterator is an iterator and nothing else.

Says the dogma.

If you want to design the set of independent sets it should be an object different from the iterator.

Says the dogma.

It would be better to have a Python class IndependentSets desinged to modelize the set of independent sets

And what on earth would it do that is not a feature of the iterator ?

(it should inherit from Parent :P)

Over my dead body.

and a Cython iterator IndependentSetIterator. What do you think?

I will not create an empty class if I can avoid it.

2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?

Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.

3) The flag count_only allows to count the number of independent set in only one call to __next__ (if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch the StopIteration.

I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.

1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.

And it already is :

n_sets = [0]*g.order()
for s in IndependentSets(g):
    n_sets[len(s)] += 1

What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...

1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.

The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?

2) wikipedia links might be useful :wikipedia:Independent_set_(graph_theory), :wikipedia:Dominating_set, :wikipedia:Clique_graph

Right, right. I will add them to the next version of this patch.

Nathann

comment:8 in reply to: ↑ 7 Changed 8 years ago by vdelecroix

Replying to ncohen:

Hey!

(it should inherit from Parent :P)

Over my dead body.

(-: you should love your parents :-)

and a Cython iterator IndependentSetIterator. What do you think?

I will not create an empty class if I can avoid it.

And what about

sage: from sage.graphs.independent_sets import IndependentSets
sage: I = IndependentSets(g,complement=True, maximal = True)
sage: I.next()
[0, 1, 2, 9, 19, 35, 40, 43, 47]
sage: I.cardinality()
4457
sage: I.next()
Traceback (most recent call last):
...
StopIteration:

Your class is in between:

  • an iterator
  • an object that modelises the set of independent sets (the methods cardinality and __contains__)

Doing both purposes at the same time implies many unsatisfactory behaviors. What I suggest is an iterator that iterates and a class that answers to __contains__ and cardinality. But this is not satisfactory either because you still have to hack the iterator to get the cardinality.

It might be your choice to have a multiple purposes class. If so, specify it in the documentation as : "you can either use the class to iterate (though only once) or compute the cardinality, but do not think that you can do both in a clean way".

2) Could you provide a method to graph (independent_set_iterator or/and independent_sets)?

Actually I avoided it on purpose. I felt like this thing was not interesting for everybody, and that it should belong to a module one would load when needed, rather than add a new function to Graph.... What do you think ? It's not like one wants to enumerate independent sets every other day.

I agree!

3) The flag count_only allows to count the number of independent set in only one call to __next__ (if you do replace "return []" by "if not self.count_only: return []"). Your solution is a bit hackish and moreover the "for i in self: pass" needs to catch the StopIteration.

I will rewrite this part using $14589 as a dependency. It's a bit stupid to deal with <64 vertices graphs only.

1) With the current implementation it would be easy to provide the number of independent set of each cardinality and to iterate through the set of independent set of given cardinality.

And it already is :

n_sets = [0]*g.order()
for s in IndependentSets(g):
    n_sets[len(s)] += 1

What do you want ? Ugly "if" everywhere in the main loop to filter this before returning them ?...

No. What I meant is to consider a more general attribute .count of type int[64] which contains the number of independent set of each cardinality. But your solution is satisfactory and you can keep the implementation as it is. Could you add your clever solution to the documentation?

1) The complement of an independent set is a dominating set, right (not a clique) ? If so, please change the documentation accordingly.

The complement of an independent set is a vertex cover (and very often a dominating set too, unless the graph has isolated vertices), but an independent set of the <complement of the graph> is a clique. Was that the problem ?

My mistake.

comment:9 Changed 8 years ago by ncohen

(new patch that works with >64 vertices, based on top of 14589. And even slower than the first one >_<)

Nathann

comment:10 Changed 8 years ago by ncohen

  • Description modified (diff)

comment:11 Changed 8 years ago by ncohen

  • Dependencies set to #14589
  • Status changed from needs_work to needs_review

Newwwwwwww patch !!!!

  • It now deals with graphs of arbitrary size (using the binary matrices from #14589)
  • I spent quite some time complaining to myself (and to Florent) about creating an empty IndependentSets class which would call an IndependentSetsIterator class. He also agreed with you Vincent, and even though I did not like this problem with double loops either, I still refused to write an empty class.

Because I hate empty classes.

Turns out that there is a nice way out, that I stupidly did not notice : instead of writing a .next() method, I moved the code to .__iter__(), and replaced the return by yield. And many global variables are now local to .__iter__(), and everything goes well under the sun :-P

sage: g = graphs.PetersenGraph()
sage: from sage.graphs.independent_sets import IndependentSets                                        
sage: IS = IndependentSets(g)
sage: IS2 = [(x,y) for x in IS for y in IS]
sage: len(IS2)
5776
sage: IS.cardinality()**2
5776
  • I added a comment about the difference in speed between this implementation and NetworkX
  • Annnnnnd I added some links toward Wikipedia.

Tell me if you like it :-)

Nathann

Changed 8 years ago by ncohen

Changed 8 years ago by vdelecroix

comment:12 follow-up: Changed 8 years ago by vdelecroix

  • Status changed from needs_review to needs_work
  • Work issues design, documentation deleted

You are a clever guy as you found a solution that satisfy both of us!

Some new comments

0) Nobody would access the documentation in __iter__ or __contains__. I modified the documentation accordingly.

1) you should move bitset_are_disjoint to sage.misc.bitset as somebody else may want to use it.

2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.

3)

sage: G = graphs.CubeGraph(5)
sage: I = IndependentSets(G)
sage: [0,1,2,3] in I
BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!

4)

sage: IndependentSets(graphs.EmptyGraph())
REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!

comment:13 in reply to: ↑ 12 ; follow-up: Changed 8 years ago by ncohen

  • Status changed from needs_work to needs_info

Hellooooooooo !!

0) Nobody would access the documentation in __iter__ or __contains__. I modified the documentation accordingly.

Oh. Right.

1) you should move bitset_are_disjoint to sage.misc.bitset as somebody else may want to use it.

Hmmmm... I did not move it there for the function does not take bitsets as input, but unsigned long * pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.

2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.

Hmmm.. -_- Right.

Do you want me to add options to list independent sets with a special filter according to the number of vertices ?

3)

sage: G = graphs.CubeGraph(5)
sage: I = IndependentSets(G)
sage: [0,1,2,3] in I
BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!!

4)

sage: IndependentSets(graphs.EmptyGraph())
REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!!

Oh. I saw your ProgrammerError in the patch, but there is no segfault on my side. When I run your commands, here is what I get :

sage: from sage.graphs.independent_sets import IndependentSets                                                                                                                                 sage: from sage.graphs.independent_sets 
sage: from sage.graphs.independent_sets import IndependentSets
sage: G = graphs.CubeGraph(5)
sage: I = IndependentSets(G)
sage: [0,1,2,3] in I
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-a8f48211b650> in <module>()
----> 1 [Integer(0),Integer(1),Integer(2),Integer(3)] in I

/home/ncohen/.Sage/local/lib/python2.7/site-packages/sage/graphs/independent_sets.so in sage.graphs.independent_sets.IndependentSets.__contains__ (build/cythonized/sage/graphs/independent_sets.c:5633)()

ValueError: An element of the set being tested does not belong to 

Admittedly the error message has to be fixed, but no segfault. O_o

The empty graph produces a segfault allright. But of course the empty graph is not really a graph :-P

Tell me what you think, and I will write an additional patch. And thank you for your review !

Nathann

comment:14 in reply to: ↑ 13 ; follow-up: Changed 8 years ago by vdelecroix

Replying to ncohen:

1) you should move bitset_are_disjoint to sage.misc.bitset as somebody else may want to use it.

Hmmmm... I did not move it there for the function does not take bitsets as input, but unsigned long * pointers... So I felt that it did not belong there. If you are convinced that it should be there, no problem and I will move it. I really had no idea.

I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.

2) your example to iterate over the independent set of given cardinality is nice to learn Python but not to learn a good algorithm... If I want the independent sets of cardinality 1 of a grid graph 12x12 it will take an hour just to obtain my 144 independent sets.

Hmmm.. -_- Right.

Do you want me to add options to list independent sets with a special filter according to the number of vertices ?

I see two solutions:

  • You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea
  • add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)

3) [...] BOOOOOOOOOOOOOOOOOOOOOOOOOOOOM!!! [...] REBOOOOOOOOOOOOOOOOOOOOOOOOOOM!! [...]

Oh. I saw your ProgrammerError in the patch, but there is no segfault on my side. [...] Admittedly the error message has to be fixed, but no segfault. O_o

The empty graph produces a segfault allright. But of course the empty graph is not really a graph :-P

Your computer is better than mine ;-) What sage version are you running ? I will compile the rc1 and see what I get (I was on 5.10.beta4).

comment:15 in reply to: ↑ 14 ; follow-up: Changed 8 years ago by ncohen

Yo !

I see why now. I thought that a bitset was a "unsigned long *" but it is a struct that contains it. is nothing more than an "unsigned long *". I would prefer if you move your function and make it takes bitset_s (= bitset_t *) as input.

I really need a function that takes unsigned long * as input, because this is what the binary matrices store. Inside of a bitset structure, there is also a .bits field that is of type unsigned long *, but if I want to convert my unsigned long * to a bitset, then I need either to allocate a new array or to mess with a bitset data structure before calling the new version of the function that would take bitsets as input. And it looks like a waste of time.

I really need a function that takes unsigned long * as input.

  • You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea

I can do it. I don't mind.

  • add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)

I hope that this will not change the speed of the computations, though.

Your computer is better than mine ;-) What sage version are you running ?

Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.

I will compile the rc1 and see what I get (I was on 5.10.beta4).

Okayyyyyyy.

Still waiting for your answer before I write the additional patch.

Nathann

comment:16 in reply to: ↑ 15 ; follow-up: Changed 8 years ago by vdelecroix

Replying to ncohen:

A stupid comment I was doing myself:

The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...

  • You do not want to do the change, then mention, just next to the example, that this is not necessarily a good idea

I can do it. I don't mind.

  • add a special filter according to the number of vertices (it is straightforward to add constants min_num_verts and max_num_verts within your iterator)

I hope that this will not change the speed of the computations, though.

If you have the time to try, please do. I can not imagine that you will see a difference with two test conditions. Otherwise, the first solution is ok.

Your computer is better than mine ;-) What sage version are you running ?

Sage-5.10.rc0. But I don't see why mine would not see a segfault that occurs on yours. Usually, the one who gets the segfaults is right, and the one who does not get it is just lucky. There may be a problem somewhere.

I will compile the rc1 and see what I get (I was on 5.10.beta4).

on rc1 it works fine...

comment:17 in reply to: ↑ 16 Changed 8 years ago by ncohen

  • Status changed from needs_info to needs_review

Hellooooooooooo !!

The set of independent sets has an acylcic oriented graph structure (two ind. set are neighbors if you pass from one to the other by removing or adding a vertex). What you do in your algorithm is that, by considering an order on the vertices, you can easily build a tree out of this acyclic graph. So, in principle, it would possible to explore the graph with much less backtracking... I wonder if this is along those lines that something clever is done in networkx...

Hmmmmm... I don't get it O_o

If you do nothing but talk of the exploration tree then I do not see how you can beat this implementation : it goes through this tree and only remembers the name of the current node... I don't see it getting much cheaper O_o

I can do it. I don't mind.

Hmmm... Turns out that when looking at the code again, it required to store a variable containing the current number of elements in the set, and add +=1 and -=1 in several places... Soooooo if you don't need it I would prefer not to implement this here. I added a message saying that the iterator trick is not computationally smart.

on rc1 it works fine...

I fixed the segfault on empty graphs, and they are now supported too by this class. This way nothing bad happens when it is called from Graph.cliques_maximal.

Patch updated ! ;-)

Nathann

Changed 8 years ago by ncohen

comment:18 Changed 8 years ago by ncohen

  • Description modified (diff)
Note: See TracTickets for help on using tickets.