Opened 7 years ago
Last modified 5 years ago
#14529 needs_info enhancement
Drastic performance improvement of computing the chromatic polynomial
Reported by: | azi | Owned by: | jason, ncohen, rlm |
---|---|---|---|
Priority: | major | Milestone: | sage-6.4 |
Component: | graph theory | Keywords: | |
Cc: | ncohen, stefan, slani, rbeezer | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
Hello!
I am not sure I have the energy to do that at the moment but its a nice task that we might want to address at some point. Hence I am leaving some stuff here for further reference.
The current implementation of the chromatic_polynomial uses the deletion-contraction recurrence directly. This gives us an exponential branching tree. The neat thing is that at some point some of the branches compute the chromatic polynomial for the same graphs (multiple - exponential) times!
A very dumb implementation in Python (see below) that is faster than the C code. It takes 700us to compute the chromatic polynomial of a random dense graph of order 15 and 74 minutes (!) with the C code currently used by Sage.
Of course this code needs more work and can be optimized even further (perhaps by calling the C implementation under a threshold) but for now here it is.
global cache cache = {} def chrompoly(G): global cache s = tuple(G.canonical_label().edges(labels=False)) if s in cache: return cache[s] if not G.is_connected(): return prod([chrompoly(C) for C in G.connected_components_subgraphs()]) R = ZZ['x'] x = R.gen() n = G.order() m = G.size() # Since we know that G is connected it is enough to check the number of edges # to test if it is a tree if m == n-1: return x*(x-1)**(n-1) u,v = G.edge_iterator(labels=False).next() G.delete_edge(u,v) p = chrompoly(G) G.add_edge(u,v) G2 = G.copy() G2.add_edges(((u,x) for x in G.neighbor_iterator(v))) G2.delete_vertex(v) ret = p - chrompoly(G2) cache[s] = ret return ret
Attachments (2)
Change History (23)
comment:1 Changed 7 years ago by
comment:2 Changed 7 years ago by
Looks fine!!
sage: G = graphs.RandomGNP(14,0.8) sage: %timeit G.chromatic_polynomial() 1 loops, best of 3: 50.7 s per loop sage: %timeit chrompoly(G) 1 loops, best of 3: 790 us per loop
comment:3 Changed 7 years ago by
- Description modified (diff)
comment:4 Changed 7 years ago by
Does anyone of you guys sees a reason not to remove the current Cython implementation and adding this code
comment:5 Changed 7 years ago by
Yep : the authors of the Cython code may be so impressed by your caching that they may want to adapt their code to it. Worth sending an email. Otherwise, if the results are better and the code clearer.... :-P
Nathann
comment:6 follow-up: ↓ 7 Changed 7 years ago by
Good!
How do you think we should handle the global cache if this is to be made into a proper patch?
What I would like to do is have a global ____sage_very_very_private__cache__chrompoly___dict = {}
variable outside the function and delete the cache each time the computation is over. In addition I would have a parameter keep_cache that keeps the cache alive (convenient when you're computing the polynomial in a loop)
What do you think?
Also whom would you send an email to?
comment:7 in reply to: ↑ 6 Changed 7 years ago by
Helloooooooo !
How do you think we should handle the global cache if this is to be made into a proper patch?
Hmmm... Well, there is the @cached_method decorator, that automatically manages the cache of any function. But you can't empty it automatically when the computations are done though.
Technically, what you can do is write a function A (without the decorator) which call function B (the real one, with the cached_method decorator). Before returning B's result, A would clear the cache of B.
I wouldn't go to such lengths before there is a real need, actually... If somebody computes one chromatic polynomial perhaps he will compute two, and ... well. It's up to you. To me caching the method with the cached_method decorator would do the job. and you can add in its documentation an explanation of how the cache can be cleared manually, with the clear_cache
method.
http://www.sagemath.org/doc/reference/misc/sage/misc/cachefunc.html
Also whom would you send an email to?
The file seems to have been written by Robert Miller and Gordon Royle (just reading the copyright in the file itself). Robert Miller does not work on Sage anymore but should be interested (he's the one who initially wrote all the graph-related stuff in Sage. Backends included). And Gordon Royle may be interested too, though I probably never wrote to him :-)
Nathann
comment:8 Changed 7 years ago by
Great thanks for the info! I'll read through that stuff and produce a proper patch.
As for the "compute one or two,.." I would for example like to test the Birkhoff-Lewis conjecture for all planar graphs of some order and having such a cache alive through all the computation would be very nice.
comment:9 Changed 7 years ago by
There is some similar code for caching, etc. in "tutte.sage" at #1314 .
comment:10 Changed 6 years ago by
- Cc ncohen stefan slani rbeezer added; ncohen stefan slani rbeezer removed
- Milestone changed from sage-5.11 to sage-5.12
Changed 6 years ago by
comment:11 Changed 6 years ago by
Attached it is a proposed patch for the chromatic polynomial.
The new code is short, self-explanatory and incorrect. Due to the nature of the incorrectness I suspect that the edge contraction part of the code may not be correct.
Anyone happens to see the issue?
comment:12 Changed 6 years ago by
How do you know that the graph is connected, recursively ? O_o
Nathann
Changed 6 years ago by
comment:13 Changed 6 years ago by
I don't.. It looks like I removed that check (which is present in the above code) when merging the patches.. FML... TNX for the spot on!
The code now works correctly!
comment:14 Changed 6 years ago by
- Status changed from new to needs_review
comment:15 Changed 6 years ago by
Needs some adjustments:
darij@travis-virtualbox:~/sage-5.11.beta3/devel/sage-main$ hg qpush applying trac_14592_chrompoly.patch patching file sage/graphs/chrompoly.pyx Hunk #1 FAILED at 0 1 out of 1 hunks FAILED -- saving rejects to file sage/graphs/chrompoly.pyx.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir errors during apply, please fix and refresh trac_14592_chrompoly.patch
It looks like undeleting the deleted pyx file does the trick.
comment:16 Changed 6 years ago by
The chromatic polynomial of the empty graph should be 1, not x as your code returns (nor should it throw a LookupError? as the original code did). While you might want to wait until the is_connected()
method correctly recognizes the empty graph as disconnected, please make sure that the returned value is actually the polynomial 1, not the integer 1 (so use ZZ['x'].prod
instead of prod
).
Nice speedup, though it would be better to take the best of both worlds and redo it in Cython. I'm getting slowdowns on small graphs (up to 6 vertices), which might be an issue with the combinatorics I'm doing (combinatorial Hopf algebras have me work with many many little graphs rather than few large ones); but I think the added speedup of Cython will get rid of these slowdowns easily.
comment:17 Changed 6 years ago by
is_connected()
is probably not going to handle the empty graph nicely, so it is best if you check for the empty graph explicitly.
comment:18 Changed 6 years ago by
- Status changed from needs_review to needs_info
Helloooooooooooooooo Jernej !
Would it be possible to keep the old implementation too, and have the two reachable through the same function ? Something like g.chromatic_polynomial(implementation="Python/C")
?
We would be able to check correction of each of them with the other, and if we ever notice that we can do better by caching inside of the C function.
Nathann
comment:19 Changed 6 years ago by
- Milestone changed from sage-6.1 to sage-6.2
comment:20 Changed 6 years ago by
- Milestone changed from sage-6.2 to sage-6.3
comment:21 Changed 5 years ago by
- Milestone changed from sage-6.3 to sage-6.4
Hm... Maybe I am being too fast with the time results. I should really check what happens if I delete the cache after each run