Opened 10 years ago
Closed 9 years ago
#5793 closed enhancement (fixed)
[with patch, positive review] New algorithm for Max Clique in Graph class using Cython
Reported by: | ncohen | Owned by: | rlm |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.1.1 |
Component: | graph theory | Keywords: | |
Cc: | rlm | Merged in: | Sage 4.1.1.rc1 |
Authors: | Nathann Cohen | Reviewers: | Robert Miller, Minh Van Nguyen |
Report Upstream: | Work issues: | ||
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
This depends on #6355.
Attachments (9)
Change History (48)
comment:1 Changed 10 years ago by
- Milestone set to sage-3.4.2
- Summary changed from [with patch, needs review] New algorithm for Max Clique in Graph class using Cython to [with patch, needs work] New algorithm for Max Clique in Graph class using Cython
comment:2 Changed 10 years ago by
A new patch using the SPKG you can find at : http://trac.sagemath.org/sage_trac/ticket/6355
comment:3 Changed 10 years ago by
- Cc rlm added
comment:4 Changed 10 years ago by
- Resolution set to duplicate
- Status changed from new to closed
Duplication of #6355.
comment:5 Changed 10 years ago by
To get this going again, here are some specific suggestions for patch 12427.patch and the associated spkg:
- copy the header files to somewhere in $SAGE_LOCAL/include (maybe $SAGE_LOCAL/include/cliquer/*.h
- Make a cliquer.pxd file that declares the necessary functions and structs from the header files. Make sure that you are only doing cdef externs from the header files, not the .c files like the patch is currently doing. Don't include the path; just do
cdef extern from "cliquer/graph.h":
since the header is in the include path. This cliquer.pxd file can go into sage/graphs.
- Make a cliquer.pyx in sage/graphs that contains the definitions of max_clique, all_max_clique, and clique_number. Document and add doctests to these functions.
- Delete the lines
from sage.graphs.graph import Graph #from distutils.core import setup #from distutils.extension import Extension from Cython.Distutils import build_ext
- In module_list.py, add a
libraries = ['cliquer']
option (see the surrounding declarations for examples of this).
- Compile the cliquer sources into a shared library, named libcliquer.so, using the instructions in the HowTo? William posted. Place this shared library into $SAGE_LOCAL/lib.
- In the graphs.py, I don't think you need to do
from sage.graphs.cliquer import clique_number
. Since the module will be in that directory, I think it would be sufficient to dofrom cliquer import clique_number
.
comment:6 Changed 10 years ago by
- Resolution duplicate deleted
- Status changed from closed to reopened
rlm: this ticket has what looks like the most current patch, whereas #6355 does not have a patch. I'm reopening this ticket so that the "active" patch is not closed (but I am not opposed to moving this patch to #6355 to consolidate things!). I feel a bit silly commenting on a patch here when the ticket is closed, but the patch clearly has not been moved/merged/etc.
comment:7 Changed 10 years ago by
Suggestions 1 and 6 in my comment above apply to the spkg, not the patch.
comment:8 Changed 10 years ago by
Whence the "../../../../local/lib/cliquer-1.2/cliquer.h" if there's no spkg. Also, I belive local/include is in the include path, so no need to have this long path. (And include files belong in local/include, not local/lib.) I would probably drop the 1.2 suffix so we don't have to update the link paths every time the version gets bumped.
comment:9 follow-up: ↓ 10 Changed 10 years ago by
that's what I meant by point 2 (just do cdef extern from "cliquer/graph.h"
) and point 1 (put the headers in $SAGE_LOCAL/include/cliquer/*.h)
comment:10 in reply to: ↑ 9 Changed 10 years ago by
Here is a new patch ( number 12428 ) using a shared library !!! I hope you will appreciate it as it took me some time to figure out the inner behavior of Sage ;-)
The new spkg is available on http://trac.sagemath.org/sage_trac/ticket/6355
Nathann
comment:11 Changed 10 years ago by
My version is working, but I am a bit lost with all the patches... How can I produce a patch containing all the modifications I made since the begining ( since I cloned the original branch, actually ) ?
Thanks !!!
comment:12 Changed 10 years ago by
One option is to clone the branch using hg_sage.clone
, but to give it the revision number corresponding to the last patch not part of your changes (the base). Then you can copy the affected files over, and it is as if you have done all the work, but you have not checked anything in.
comment:13 Changed 10 years ago by
New patch "cliquer.patch" containing all the modifications for cliquer since the beginning. And with the good directory's name ;-)
comment:14 Changed 10 years ago by
Nathann,
I have deleted the previous patches to avoid confusion.
When addressing the following issues, please post another patch to be applied on top of the first (this makes review easier).
algorithm=='networkx'
is never tested, and if it were, it would fail, due to thecliques
parameter
- Why are you using
cliquer.pxi
instead ofcliquer.pxd
? If it werepxd
instead, then other Cython files couldcimport
the same data types.
- It should go:
if algorithm=='networkx': ... elif algorithm=='cliquer': ... else: raise NotImplementedError("Only 'networkx' and 'cliquer' are supported.")
Also you need to change one instance of 'Cliquer'
to 'cliquer'
.
- In maximum_cliques, "vertex set" should be "vertex sets"
comment:15 follow-up: ↓ 17 Changed 10 years ago by
I hope all is fixed now... Though I just renamed cliquer.pxi to cliquer.pxd without really understanding the difference ^{};
By the way, I am not really sure this possibility to change the algorithm used to compute the cliquer number is that useful... Just take a look at this :
sage: g=graphs.RandomGNP(100,.5) sage: time g.clique_number(algorithm="networkx") CPU times: user 2.49 s, sys: 0.00 s, total: 2.49 s Wall time: 2.49 s 9 sage: time g.clique_number() CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s Wall time: 0.01 s 9 sage: g=graphs.RandomGNP(150,.5) sage: time g.clique_number(algorithm="networkx") CPU times: user 18.45 s, sys: 0.04 s, total: 18.49 s Wall time: 18.49 s 10 sage: time g.clique_number() CPU times: user 0.03 s, sys: 0.00 s, total: 0.03 s Wall time: 0.03 s 10 sage: g=graphs.RandomGNP(200,.5) sage: time g.clique_number(algorithm="networkx") CPU times: user 82.33 s, sys: 0.18 s, total: 82.52 s Wall time: 82.54 s 11 sage: time g.clique_number() CPU times: user 0.07 s, sys: 0.00 s, total: 0.07 s Wall time: 0.07 s 11
Anyway, from the practical point of view, it works now ;-)
Nathann
comment:16 Changed 10 years ago by
- Summary changed from [with patch, needs work] New algorithm for Max Clique in Graph class using Cython to [with patch, needs review] New algorithm for Max Clique in Graph class using Cython
comment:17 in reply to: ↑ 15 Changed 10 years ago by
- Summary changed from [with patch, needs review] New algorithm for Max Clique in Graph class using Cython to [with patch, needs work] New algorithm for Max Clique in Graph class using Cython
Replying to ncohen:
By the way, I am not really sure this possibility to change the algorithm used to compute the cliquer number is that useful...
There are plenty of reasons to have two different implementations, and one I can think of right off the top of my head is to compare results for correctness.
- You have removed the input "cliques," which is used by NetworkX. You need to put this back in, and mention that it is ignored unless the algorithm is "networkx."
- The
include '../ext/cliquer.pxd'
line at the beginning of cliquer.pyx is not necessary. "include" is a plain text include, andpxd
files are forward declarations, which get automatically included by Cython as long as the other part of the filename is the same.
- One thing I should have mentioned last round: the three functions introduced in cliquer.pyx need to be documented, including some nontrivial doctests.
After all this is done, we should be very close to finished!
comment:18 follow-up: ↓ 19 Changed 10 years ago by
Hmmmmmm.... There is a perhaps-not-so-small problem :
G = Graph({0:[1,2,3], 1:[2], 3:[0,1]}) G.maximum_clique() 2
When it is obviously 3...
What do we do in this situation ? ^{};
comment:19 in reply to: ↑ 18 Changed 10 years ago by
Replying to ncohen:
What do we do in this situation ? ^{};
We debug! I'm looking into this now, but I'm not sure what I'll be able to find out...
- Another thing is that
cliquer.pxd
needs to be in the same directory ascliquer.pyx
.
comment:20 Changed 10 years ago by
it does not come from cliquer itself, which is a relief. I ran cliquer on the command lineand on this file : p edges 4 5 e 1 2 e 1 3 e 1 4 e 2 3 e 2 4
~/cliquer-1.2$ ./cl example Reading graph from example...OK Searching for a single maximum weight clique...
2/4 (max 2) 0.00 s (0.00 s/round) 3/4 (max 3) 0.00 s (0.00 s/round) 4/4 (max 3) 0.00 s (0.00 s/round)
size=3, weight=3: 1 2 4 ~/cliquer-1.2$
comment:21 Changed 10 years ago by
It works when I replace :
buf=' e %d %d' %(u,v)
by
buf=' e %d %d' %(u+1,v+1)
in cliquer.pyx. I thought it may come from the difference between 0....n-1 and 1...n and it seems correct. I am running other tests with NetworkX to check.
comment:22 Changed 10 years ago by
It seems as if Cliquer is subtracting one from each vertex in the input, and so the input needs to have one added to it. To see this, apply trac_5793_debug_only.patch
, and you get:
sage: G = Graph({0:[1,2,3], 1:[2], 3:[0,1]}); G.maximum_clique() (0, 1, None) e 0 1 Unweighted graph has 4 vertices, 0 edges (density 0.00). 0 -> 1 -> 2 -> 3 -> (0, 2, None) e 0 2 Unweighted graph has 4 vertices, 0 edges (density 0.00). 0 -> 1 -> 2 -> 3 -> (0, 3, None) e 0 3 Unweighted graph has 4 vertices, 0 edges (density 0.00). 0 -> 1 -> 2 -> 3 -> (1, 2, None) e 1 2 Unweighted graph has 4 vertices, 1 edges (density 0.17). 0 -> 1 1 -> 0 2 -> 3 -> (1, 3, None) e 1 3 Unweighted graph has 4 vertices, 2 edges (density 0.33). 0 -> 1 2 1 -> 0 2 -> 0 3 -> [0, 2]
You should probably also expose at least this print_graph
function in the official versions.
comment:23 Changed 10 years ago by
Oops, looks like we were duplicating effort! But I'll be online for a while, so this might get positively reviewed today!
comment:24 Changed 10 years ago by
- Summary changed from [with patch, needs work] New algorithm for Max Clique in Graph class using Cython to [with patch, needs review] New algorithm for Max Clique in Graph class using Cython
New patch !!
Changed 10 years ago by
comment:25 Changed 10 years ago by
- Summary changed from [with patch, needs review] New algorithm for Max Clique in Graph class using Cython to [with patch, needs work] New algorithm for Max Clique in Graph class using Cython
OK, I have a few more requests:
- I think you should expose Cliquer's
graph_print
function incliquer.pxd
, so other people see it when they try to play around with the interface.
- In general, doctests should be indented, so e.g.
EXAMPLES:: sage: C=graphs.PetersenGraph() sage: max_clique(C) [7, 9] """
should be
EXAMPLES:: sage: C=graphs.PetersenGraph() sage: max_clique(C) [7, 9] """
- In fact, you should probably check out
http://www.sagemath.org/doc/developer/conventions.html
For example, the following belongs in an INPUT:
block, and would look much more professional there:
The parameter 'cliques' is an optional list of cliques that can be input if already computed. ONLY USED BY NetworkX !!!
- There are several functions in
graph.py
which compute with cliques, which haven't been exposed to this new interface. For example,G.cliques()
. It would be nice to take care of those functions now too, so people don't unnecessarily complain about these functions being slow later. They are all in the same part of the file, which starts with### Cliques
.
comment:26 Changed 10 years ago by
- There is one other thing to be aware of. You are assuming that the vertex set is *always* {0,...,n-1}. In fact, this will cause some trouble:
sage: C = graphs.CubeGraph(4) sage: C.maximum_clique() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ... TypeError: cannot concatenate 'str' and 'int' objects sage: C.vertices()[0] '0000'
Here is the idea to get around this, since this problem has come up many times before:
sage: C = graphs.CubeGraph(4) sage: G,d = C.relabel(inplace=False, return_map=True) sage: d_inv = {} sage: for v in d: ....: d_inv[d[v]] = v ....: sage: C.vertices() ['0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111'] sage: G.vertices() [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] sage: d {'0000': 0, '0001': 1, '0010': 2, '0011': 3, '0100': 4, '0101': 5, '0110': 6, '0111': 7, '1000': 8, '1001': 9, '1010': 10, '1011': 11, '1100': 12, '1101': 13, '1110': 14, '1111': 15} sage: d_inv {0: '0000', 1: '0001', 2: '0010', 3: '0011', 4: '0100', 5: '0101', 6: '0110', 7: '0111', 8: '1000', 9: '1001', 10: '1010', 11: '1011', 12: '1100', 13: '1101', 14: '1110', 15: '1111'}
comment:27 follow-up: ↓ 28 Changed 9 years ago by
- Summary changed from [with patch, needs work] New algorithm for Max Clique in Graph class using Cython to [with patch, needs review] New algorithm for Max Clique in Graph class using Cython
Patch number 4 :
- Cliquer now deals with graphs with vertex labels not in 0..n-1
- Some functions of Graph could use cliquer and now can. ( some functions dealing with MAXIMAL cliques instead of MAXIMUM cliques can not use cliquer, though ! )
- Some documentations have been rearranged
- Cliquer's graph_print function is added to cliquer.pxd if needed
Changed 9 years ago by
comment:28 in reply to: ↑ 27 Changed 9 years ago by
OK, you've done a ton of work on this. There are a few minor details left, but I will take care of them myself, and post a referee patch. This ticket should be ready today.
PS -- A maximum clique is just a clique of maximal size, so those functions are also applicable to Cliquer, but I'll update those myself. :)
comment:29 follow-up: ↓ 30 Changed 9 years ago by
Thank you very much !! ;-)
Concerning the cliques, I agree when you say that a "A maximum clique is just a clique of maximal size", but a "maximal clique" is a clique such that there is no bigger clique in the graph -- according to the subset inclusion order (all the maximal cliques of a graph need not have the same cardinality) -- and this I what I thought I had read in the descriptions of the others functions.
And.... now that this seems to be finished, I only have left to take care of the LP/MIP patch, which seems quite another adventure... ;-)
Thanks again !
Changed 9 years ago by
comment:30 in reply to: ↑ 29 Changed 9 years ago by
Replying to ncohen:
Concerning the cliques, I agree when you say that a "A maximum clique is just a clique of maximal size", but a "maximal clique" is a clique such that there is no bigger clique in the graph -- according to the subset inclusion order (all the maximal cliques of a graph need not have the same cardinality) -- and this I what I thought I had read in the descriptions of the others functions.
I had realized this some time after writing that comment. Please forgive me.
comment:31 Changed 9 years ago by
- Reviewers set to Robert Miller
- Summary changed from [with patch, needs review] New algorithm for Max Clique in Graph class using Cython to [with patch, positive review] New algorithm for Max Clique in Graph class using Cython
comment:32 Changed 9 years ago by
- Description modified (diff)
comment:33 Changed 9 years ago by
In the last patch:
- I made all clique related functions start with
clique_
so that you can doG.clique<tab>
and see all of them.
- Added a few doctests here and there
- Deprecated cliques, since it is ambiguous.
comment:34 Changed 9 years ago by
- Summary changed from [with patch, positive review] New algorithm for Max Clique in Graph class using Cython to [with patch, needs work] New algorithm for Max Clique in Graph class using Cython
With the SPKG at #6355 and the patch on this ticket, I got the following doctest failures:
sage -t -long devel/sage-main/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx ********************************************************************** File "/scratch/mvngu/release/sage-4.1.1.alpha0/devel/sage-main/sage/groups/perm_gps/partn_ref/refinement_graphs.pyx", line 318: sage: clqs = (HS.complement()).cliques() Expected nothing Got: doctest:1: DeprecationWarning: The function 'cliques' has been deprecated. Use 'cliques_maximal' or 'cliques_maximum'. ********************************************************************** 1 items had failures: 1 of 89 in __main__.example_2 ***Test Failed*** 1 failures. For whitespace errors, see the file /scratch/mvngu/release/sage-4.1.1.alpha0/tmp/.doctest_refinement_graphs.py [231.6 s] <SNIP> sage -t -long devel/sage-main/sage/graphs/graph_coloring.py ********************************************************************** File "/scratch/mvngu/release/sage-4.1.1.alpha0/devel/sage-main/sage/graphs/graph_coloring.py", line 208: sage: chromatic_number(G) Expected: 3 Got: doctest:224: DeprecationWarning: The function 'cliques' has been deprecated. Use 'cliques_maximal' or 'cliques_maximum'. 3 ********************************************************************** 1 items had failures: 1 of 7 in __main__.example_5 ***Test Failed*** 1 failures. For whitespace errors, see the file /scratch/mvngu/release/sage-4.1.1.alpha0/tmp/.doctest_graph_coloring.py [2.3 s]
comment:35 Changed 9 years ago by
- Summary changed from [with patch, needs work] New algorithm for Max Clique in Graph class using Cython to [with patch, needs review] New algorithm for Max Clique in Graph class using Cython
Note: making the refinement_graphs
doctest use Cliquer instead of NetworkX shaves about 21 seconds from the doctest time for the file!
comment:36 Changed 9 years ago by
I do not really know how to read patch files, but it seems to me you replaced : m = max([len(c) for c in G.cliques()]) by m = max([len(c) for c in G.cliques_maximum()])
If right, I have two remarks :
- The syntax of max([len(c) for c in G.cliques()]) makes me think that G.cliques() was meant to return the list of the maximAL cliques, and m is then meant to be the clique number of G. The expression max([len(c) for c in G.cliques_maximum()]) is syntaxically correct, thought as cliques_maximum returns only the maxiMUM cliques we may as well return the length of the first one as they all have the same size.
- In the end, and if I make no mistake, why about just setting m=G.clique_number() ?
I expect it to be way faster than an enumeration of all the cliques ! ;-)
comment:37 Changed 9 years ago by
Nathann,
Thanks for spotting that. The patch is updated!
comment:38 Changed 9 years ago by
- Reviewers changed from Robert Miller to Robert Miller, Minh Van Nguyen
- Summary changed from [with patch, needs review] New algorithm for Max Clique in Graph class using Cython to [with patch, positive review] New algorithm for Max Clique in Graph class using Cython
I applied patches in the following order:
- the SPKG at #6355
trac_5793-cliquer-flattened.patch
trac_5793-part-6.patch
All doctests passed with Sage 4.1.1.rc0. So positive review.
comment:39 Changed 9 years ago by
- Merged in set to Sage 4.1.1.rc1
- Resolution set to fixed
- Status changed from reopened to closed
Hi,
the patch as is will not go into Sage since you are putting the sources for cliquer-1.2 into the Sage library tree where they do not belong.
I have not looked at the cython interface, but my guess would be that we need to change cliquer-1.2 to create a library. Please split the cython interface work (the code you wrote minus the cliquer-1.2 source code) and the cliquer-1.2.spkg work. The Cython code should remain here while spkg should go to #5669.
Cheers,
Michael