Opened 8 years ago
Closed 8 years ago
#13503 closed enhancement (fixed)
Enhancement of `is_triangle_free' addition of `triangles_count' and a minor change in `spanning_trees_count'
Reported by: | azi | Owned by: | jason, ncohen, rlm |
---|---|---|---|
Priority: | minor | Milestone: | sage-5.6 |
Component: | graph theory | Keywords: | triangles, graphs, number of triangles |
Cc: | rbeezer, ncohen | Merged in: | sage-5.6.beta1 |
Authors: | Jernej Azarija, David Coudert | Reviewers: | Jernej Azarija |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The main change of this patch is to improve the `is_triangle_free' method. The introduced algorithm uses the fact that given that A is the adjacency matrix of a simple graph G, then the number of triangles in G is tr(A3)/6.
The change has been tested resulting in the following benchmarks:
- Testing all graphs of order up to 10 for triangles: (old: 201m44.339s, new: 87m38.146s)
- Testing all triangle-free graphs of order 12 for triangles: (old: 15m56.679s, new 9m41.116s)
- Testing all graphs of order up to 10 that do contain triangles: (old: 0m10.305s, new : 0m5.798s)
I can provide the test programs if needed.
Since there are many applications in which one needs to count the number of triangles in a graph the named routine was also added.
Finally a minor change has been made to the generic_graph function spanning_trees_count() which calls abs on a determinant that is always positive.
Apply:
Attachments (1)
Change History (40)
comment:1 Changed 8 years ago by
comment:2 Changed 8 years ago by
- Status changed from new to needs_review
comment:3 Changed 8 years ago by
- Cc rbeezer added
comment:4 Changed 8 years ago by
- Status changed from needs_review to needs_work
comment:5 Changed 8 years ago by
- Cc ncohen added
- Reviewers set to David Coudert
Hello,
I added Nathann in Cc because we had a discussion about counting triangles last week ;)
The proposed method for testing if the graph is triangle free is not a good one. Indeed, the time complexity for testing if the graph is triangle free is at most Nw, where 2 <= w < 3 is the best possible exponent for computing the square of a matrix of side N. You compute the square of the adjacency matrix to get all paths of length 2, and then you test for every edge (u,v) if there is a path of length 2 from u to v. Furthermore, the basic algorithm of testing if some neighbors of a vertex are incident is fast.
Let us compare the following method with your proposition for testing if the graph is triangle free.
def my_is_triangle_free(G): for u in G.vertex_iterator(): if any(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2)): return False # return True
We get the following running times:
sage: G = graphs.GridGraph([5,2]); M = G.adjacency_matrix() # 10 nodes sage: %timeit G.is_triangle_free() 625 loops, best of 3: 603 µs per loop sage: %timeit (M^3).trace() > 0 625 loops, best of 3: 101 µs per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 376 µs per loop sage: sage: G = graphs.GridGraph([10,5]); M = G.adjacency_matrix() # 50 nodes sage: %timeit G.is_triangle_free() 25 loops, best of 3: 9.49 ms per loop sage: %timeit (M^3).trace() > 0 125 loops, best of 3: 3.92 ms per loop sage: %timeit my_is_triangle_free(G) 125 loops, best of 3: 2.67 ms per loop sage: sage: G = graphs.GridGraph([10,10]); M = G.adjacency_matrix() # 100 nodes sage: %timeit G.is_triangle_free() 25 loops, best of 3: 31.5 ms per loop sage: %timeit (M^3).trace() > 0 25 loops, best of 3: 12.3 ms per loop sage: %timeit my_is_triangle_free(G) 125 loops, best of 3: 5.71 ms per loop sage: sage: G = graphs.GridGraph([50,20]); M = G.adjacency_matrix() # 1000 nodes sage: %timeit G.is_triangle_free() 5 loops, best of 3: 1.1 s per loop sage: %timeit (M^3).trace() > 0 5 loops, best of 3: 1.04 s per loop sage: %timeit my_is_triangle_free(G) 5 loops, best of 3: 63.8 ms per loop sage: sage: G = graphs.RandomGNP(10,.1); M = G.adjacency_matrix() # 10 nodes sage: %timeit G.is_triangle_free() 625 loops, best of 3: 492 µs per loop sage: %timeit (M^3).trace() > 0 625 loops, best of 3: 90.5 µs per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 160 µs per loop sage: sage: G = graphs.RandomGNP(50,.1); M = G.adjacency_matrix() # 50 nodes sage: %timeit G.is_triangle_free() 25 loops, best of 3: 9.92 ms per loop sage: %timeit (M^3).trace() > 0 125 loops, best of 3: 3.93 ms per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 110 µs per loop sage: sage: G = graphs.RandomGNP(100,.2); M = G.adjacency_matrix() # 100 nodes, dense sage: %timeit G.is_triangle_free() 5 loops, best of 3: 40 ms per loop sage: %timeit (M^3).trace() > 0 25 loops, best of 3: 12.9 ms per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 125 µs per loop sage: sage: G = graphs.RandomGNP(100,.01); M = G.adjacency_matrix() # 100 nodes, sparse sage: %timeit G.is_triangle_free() 25 loops, best of 3: 29.7 ms per loop sage: %timeit (M^3).trace() > 0 25 loops, best of 3: 12.2 ms per loop sage: %timeit my_is_triangle_free(G) 125 loops, best of 3: 2.05 ms per loop sage: sage: G = graphs.RandomGNP(1000,.01); M = G.adjacency_matrix() # 1000 nodes sage: %timeit G.is_triangle_free() 5 loops, best of 3: 33.9 s per loop sage: %timeit (M^3).trace() > 0 5 loops, best of 3: 33.8 s per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 1.24 ms per loop
For very small graphs it is apparently faster to do (M^3).trace() > 0
, but for large graphs, basic iterations are way faster because we stop computations as soon as a triangle is found (if any). Therefore, a better solution could be an hybrid method using matrix multiplication if the graph is very small, and basic iterations otherwise.
For counting triangles, the situation is rather different and the density of the graph matters. Furthermore, it is interesting to use networkx graphs:
def my_triangles_count(G): tr = 0 for u in G.vertex_iterator(): tr += sum(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2)) return tr/3 def my_triangles_countnx(G): tr = 0 ggnx = G.networkx_graph() for u in ggnx.nodes_iter(): tr += sum(ggnx.has_edge(v,w) for v,w in combinations_iterator(ggnx.neighbors(u),2)) return tr/3
We get the following running time:
sage: G = graphs.GridGraph([5,2]) sage: %timeit G.triangles_count() 625 loops, best of 3: 608 µs per loop sage: %timeit my_triangles_count(G) 625 loops, best of 3: 444 µs per loop sage: %timeit my_triangles_countnx(G) 625 loops, best of 3: 353 µs per loop sage: sage: G = graphs.GridGraph([10,5]) sage: %timeit G.triangles_count() 25 loops, best of 3: 9.44 ms per loop sage: %timeit my_triangles_count(G) 125 loops, best of 3: 2.98 ms per loop sage: %timeit my_triangles_countnx(G) 125 loops, best of 3: 1.99 ms per loop sage: sage: G = graphs.RandomGNP(10,.1) sage: %timeit G.triangles_count() 625 loops, best of 3: 493 µs per loop sage: %timeit my_triangles_count(G) 625 loops, best of 3: 217 µs per loop sage: %timeit my_triangles_countnx(G) 625 loops, best of 3: 218 µs per loop sage: sage: G = graphs.RandomGNP(50,.1) sage: %timeit G.triangles_count() 25 loops, best of 3: 9.79 ms per loop sage: %timeit my_triangles_count(G) 125 loops, best of 3: 5.49 ms per loop sage: %timeit my_triangles_countnx(G) 125 loops, best of 3: 2.78 ms per loop sage: sage: G = graphs.RandomGNP(50,.5) sage: %timeit G.triangles_count() 25 loops, best of 3: 15 ms per loop sage: %timeit my_triangles_count(G) 5 loops, best of 3: 82.4 ms per loop sage: %timeit my_triangles_countnx(G) 25 loops, best of 3: 23.4 ms per loop sage: sage: G = graphs.RandomGNP(100,.5) sage: %timeit G.triangles_count() 5 loops, best of 3: 58.4 ms per loop sage: %timeit my_triangles_count(G) 5 loops, best of 3: 716 ms per loop sage: %timeit my_triangles_countnx(G) 5 loops, best of 3: 181 ms per loop sage: sage: G = graphs.RandomGNP(100,.1) sage: %timeit G.triangles_count() 25 loops, best of 3: 35.4 ms per loop sage: %timeit my_triangles_count(G) 25 loops, best of 3: 33.6 ms per loop sage: %timeit my_triangles_countnx(G) 25 loops, best of 3: 12.8 ms per loop sage: sage: G = graphs.RandomGNP(100,.05) sage: %timeit G.triangles_count() 25 loops, best of 3: 32.3 ms per loop sage: %timeit my_triangles_count(G) 25 loops, best of 3: 10.8 ms per loop sage: %timeit my_triangles_countnx(G) 125 loops, best of 3: 5.53 ms per loop
As you can see, when the graph is dense, the matrix multiplication method is the fastest, but for spare graphs it is better to use iterations and to convert the graph into a networkx graph. However, I don't know how to choose the threshold and I don't have better proposals in mind yet.
comment:6 Changed 8 years ago by
Hello,
I agree with what you said.
As you mentioned it is hard to deduce a threshold that is going to be good for all platforms (and times) so I suppose we could simply add an optional argument for the user to specify which algorithm to use. We could then also add the even faster triangle detection algorithms based on probability.
comment:7 Changed 8 years ago by
Yes, adding an argument to specify the algorithm is the best solution.
comment:8 Changed 8 years ago by
Also not to forget that the code for triangle detection would be much much faster if we compute A*A*A in the real field. Could you also check that? To compare with your other results?
comment:9 follow-up: ↓ 10 Changed 8 years ago by
It is true that A*A*A is a bit faster, but not in the real field where it seems very slow.
comment:10 in reply to: ↑ 9 Changed 8 years ago by
Replying to dcoudert:
It is true that A*A*A is a bit faster, but not in the real field where it seems very slow.
RR = RealField(53)
is arbitrary precision reals, so perhaps predictably slow, given the generality.
RDF
(real double field) is floating point doubles. Whatever the 53-bit precision IEEE format is. Try adjacency matrices over that ring. They might be an improvement.
comment:11 follow-up: ↓ 15 Changed 8 years ago by
Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.
comment:12 Changed 8 years ago by
HMmmm... I just read your tests, and did not get this one :
sage: %timeit G.is_triangle_free() 5 loops, best of 3: 1.1 s per loop sage: %timeit (M^3).trace() > 0 5 loops, best of 3: 1.04 s per loop sage: %timeit my_is_triangle_free(G) 5 loops, best of 3: 63.8 ms per loop
Could it be that there would be another difference in runtimes if one requires the adjacency matrix to be *dense* ? :-)
Nathann
comment:13 Changed 8 years ago by
If the graph is dense, the first visited vertex will have a triangle with high probability. So after a very small number of iterations and so operations, the algorithm returns False. But with matrix multiplications, you have to pay the full cost whatever the result.
Of course, if the graph is dense and triangle free (not so many such graphs), my algorithm will take long time, and possibly more than matrix multiplications, but in average it is faster. Do your own experiments to convince yourself.
So for testing if G is triangle free, it is good to have a parameter to choose the algorithm, and a default behavior like: if G has less than 15 nodes, then do matrix multiplications, else use other algo.
For counting triangles, the situation is different and we clearly have to take into account the density to choose the best algorithm.
comment:14 Changed 8 years ago by
Two interesting (worst case) examples:
sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.adjacency_matrix() sage: G.density() 100/199 sage: %timeit not (M*M*M).trace() 5 loops, best of 3: 59.8 ms per loop sage: %timeit my_is_triangle_free(G) 5 loops, best of 3: 5.66 s per loop sage: sage: G = graphs.RandomGNP(200,100/199); M = G.adjacency_matrix() sage: G.density() 10009/19900 sage: %timeit not (M*M*M).trace() 5 loops, best of 3: 63.8 ms per loop sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 300 µs per loop
comment:15 in reply to: ↑ 11 Changed 8 years ago by
Replying to dcoudert:
Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.
I'd guess that managing (potentially) large integers has more overhead than managing doubles:
sage: G = graphs.CompleteBipartiteGraph(100, 100) sage: M = G.am() sage: timeit("M*M*M") 25 loops, best of 3: 31 ms per loop sage: MR = M.change_ring(RDF) sage: timeit("MR*MR*MR") 125 loops, best of 3: 2.4 ms per loop sage: (MR*MR*MR).trace() 0.0
I can't recall seeing a Sage class for integers of a bounded size.
comment:16 Changed 8 years ago by
I'm certainly not used to matrix multiplications and rings in sage, and I'm surprised that computing with doubles is faster than with integer. For instance, if G has 10^6
vertices, then since M is a matrix of 0 and 1's, the largest value in M^3
is in 10^18 < 2^63
and so all computations are doable using 64bits integers.
Following your example, and after some search, I found a much faster solution: use GF(2)
sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.am(); M 200 x 200 dense matrix over Integer Ring (type 'print M.str()' to see all of the entries) sage: %timeit not (M*M*M).trace() 5 loops, best of 3: 39.5 ms per loop sage: MR = M.change_ring(RDF); MR 200 x 200 dense matrix over Real Double Field (type 'print MR.str()' to see all of the entries) sage: %timeit (MR*MR*MR).trace() == 0.0 125 loops, best of 3: 1.47 ms per loop sage: M2 = M.change_ring(GF(2)); M2 200 x 200 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries) sage: %timeit not (M2*M2*M2).trace() 625 loops, best of 3: 201 µs per loop
However, this is not always the best solution and it is extremely slow for sparse matrix O_o
sage: G = graphs.RandomGNP(1000,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2 1000 x 1000 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries) sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 424 µs per loop sage: %timeit not (M2*M2*M2).trace() 125 loops, best of 3: 4.09 ms per loop sage: MR = M.change_ring(RDF) sage: %timeit (MR*MR*MR).trace() == 0.0 5 loops, best of 3: 96.9 ms per loop sage: sage: G = graphs.RandomGNP(1000,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2 1000 x 1000 sparse matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries) sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 594 µs per loop sage: %timeit not (M2*M2*M2).trace() 5 loops, best of 3: 3.41 s per loop sage: MR = M.change_ring(RDF) 5 loops, best of 3: 8.08 s per loop sage: sage: G = graphs.RandomGNP(100,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2 100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries) sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 1.22 ms per loop sage: %timeit not (M2*M2*M2).trace() 625 loops, best of 3: 95.7 µs per loop sage: sage: G = graphs.RandomGNP(100,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2 100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries) sage: %timeit my_is_triangle_free(G) 625 loops, best of 3: 71.4 µs per loop sage: %timeit not (M2*M2*M2).trace() 625 loops, best of 3: 95.5 µs per loop
I don't know how to force a matrix to be dense. for "small" graphs, it is apparently always dense.
Clearly, the algorithm should use tradeoffs between size and density.
comment:17 Changed 8 years ago by
I don't know how to force a matrix to be dense.
mat.dense_matrix()
should do it.
comment:18 follow-up: ↓ 19 Changed 8 years ago by
@dcoudert
While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).
comment:19 in reply to: ↑ 18 Changed 8 years ago by
While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).
That's a pity :(
Any interesting alternatives? At least you can force matrices to be dense, but for small graphs (10 nodes), the extra cost might be non-negligeable.
comment:20 Changed 8 years ago by
"dense" in Sage is an implementation detail. Any matrix can be made dense or sparse, even if it is not a good idea for the algorithm at hand.
In the discussion here, the actual "density" of the adjacency matrix is relevant for which approach (matrix algebra, vertex neighborhoods) might be faster.
comment:21 follow-up: ↓ 22 Changed 8 years ago by
Right.
A reasonably fast alternative to matrix multiplication (could be improved):
def my_is_triangle_free_bitset(G): map = {} i = 0 for u in G.vertex_iterator(): map[u] = i i += 1 B = {} for u in G.vertex_iterator(): B[u] = Bitset([map[i] for i in G.neighbor_iterator(u)]) BB = {} for u in G.vertex_iterator(): BB[u] = Bitset() for v in G.vertex_iterator(): if B[u]&B[v]: BB[u].add(map[v]) return not any(B[u]&BB[u] for u in G.vertex_iterator()) def toto(G): M = G.am() return not (M*M*M).trace()
sage: G = graphs.CompleteBipartiteGraph(3,3) sage: %timeit toto(G) 625 loops, best of 3: 256 µs per loop sage: %timeit my_is_triangle_free_bitset(G) 625 loops, best of 3: 132 µs per loop sage: sage: G = graphs.CompleteBipartiteGraph(30,30) sage: %timeit toto(G) 25 loops, best of 3: 15 ms per loop sage: %timeit my_is_triangle_free_bitset(G) 125 loops, best of 3: 6.99 ms per loop
comment:22 in reply to: ↑ 21 Changed 8 years ago by
Replying to dcoudert:
A reasonably fast alternative to matrix multiplication (could be improved):
I like the looks of my_is_triangle_free_bitset
.
It is too bad more of these bit-by-bit type operations are not more easily available for graphs. I know the folks working on the matroid classes are implementing a lot of duplicate functionality to be faster in tese sorts of situations (like matrix operations).
Rob
comment:23 Changed 8 years ago by
Well, the difficulty with graph algorithms is that the efficiency depends on the data structure, and we have as many possibilities as algorithms. bit-by-bit operations are suitable for some kind of operations, but other representations are also helpful. Sometimes it is faster to turn the graph into a networkx graph (e.g. when adding/removing edges to test some property), but sometimes the extra cost of changing the graph structure is non negligible. I don't have magic solution but it's true that we could gather some sets of operations into dedicated graph structures as for instance the FastGraph? class Nathann has used for pathwidth.
This version is a bit faster
def my_is_triangle_free_bitset(G): N = G.num_verts() map = {} i = 0 B = {} for u in G.vertex_iterator(): map[u] = i i += 1 B[u] = Bitset(capacity=N) # map adjacency to bitsets for u,v in G.edge_iterator(labels=None): B[u].add(map[v]) B[v].add(map[u]) # map lengths 2 paths to bitsets BB = {} for u in G.vertex_iterator(): BB[u] = Bitset(capacity=N) for v in G.vertex_iterator(): if B[u]&B[v]: BB[u].add(map[v]) # search for triangles return not any(B[u]&BB[u] for u in G.vertex_iterator())
comment:24 Changed 8 years ago by
- Description modified (diff)
- Reviewers David Coudert deleted
- Status changed from needs_work to needs_review
I have included discussed improvements into the is_triangle_free
and the triangles_count
methods, and added optional argument for algorithm selection.
One could also add probabilistic test for triangle detection.
Could you benchmark the new functions to update the ticket description?
comment:25 follow-up: ↓ 26 Changed 8 years ago by
- Status changed from needs_review to needs_work
Hellooooooooo David !!
First, you do not know how glad I am to hear things like "the difficulty with graph algorithms is that the efficiency depends on the data structure" from the mouth of researchers. "Adjacency test ? O(1) !" is their usual answer. :-D
I have been thinking of reimplementing another version using FastGraph?, but that would be more trouble than necessary for the moment. And I could actually use the same technique to reimplement SubgraphSearch? a bit better anyway, so I guess I will go there directly.
About the patch :
- Where did you find that this "abs" was not needed ? I thought it was
O_o
- I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.
- "tests if the trace of the adjacency matrix is positive" would be counting the number of loops
:-D
Btw I guess it would be better to have "return (A * * 3 ).trace() == 0
" instead of "
return (A*A*A).trace() == 0
", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for
k=3
:-)
Thanks for that patch !!
nathann
comment:26 in reply to: ↑ 25 ; follow-up: ↓ 27 Changed 8 years ago by
- Where did you find that this "abs" was not needed ? I thought it was
O_o
It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.
- I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.
Jernej should answer that remark.
- "tests if the trace of the adjacency matrix is positive" would be counting the number of loops
:-D
Btw I guess it would be better to have "return (A * * 3 ).trace() == 0
" instead of "
return (A*A*A).trace() == 0
", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for
k=3
:-)
OK.
I have implemented some of the modifications.
Thanks for that patch !!
I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.
comment:27 in reply to: ↑ 26 ; follow-up: ↓ 28 Changed 8 years ago by
Replying to dcoudert:
- Where did you find that this "abs" was not needed ? I thought it was
O_o
It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.
abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals
det(L')*(-1)(i+j}+)
where L' is the matrix that is obtained from the Laplacian after removing row i and column j.
- I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.
Jernej should answer that remark.
I am not really sure what to do here.
- "tests if the trace of the adjacency matrix is positive" would be counting the number of loops
:-D
Btw I guess it would be better to have "return (A * * 3 ).trace() == 0
" instead of "
return (A*A*A).trace() == 0
", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for
k=3
:-)
OK.
I have implemented some of the modifications.
Thanks for that patch !!
I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.
Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?
comment:28 in reply to: ↑ 27 Changed 8 years ago by
Replying to azi:
Replying to dcoudert:
- Where did you find that this "abs" was not needed ? I thought it was
O_o
It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.
abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals
det(L')*(-1)(i+j}+)
where L' is the matrix that is obtained from the Laplacian after removing row i and column j.
yes, it is better to remove the abs in a patch dedicated to small details ;-)
I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.
Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?
It's difficult in python. The method is fast because it uses optimized data structure. It can be done in cython, but then it is certainly better to only implement the interfaces with the original c code of Latapy. We can easily obtained agreement from him for integrating is code in sage. This should be done in a dedicated patch to ease reviewing process.
comment:29 Changed 8 years ago by
- Status changed from needs_work to needs_review
comment:30 follow-up: ↓ 31 Changed 8 years ago by
Can someone review this patch?
comment:31 in reply to: ↑ 30 Changed 8 years ago by
Can someone review this patch?
Well. Why wouldn't you ? ^^;
It looks like David updated the patch last, so you can review his changes.. I mean, for as long as two persons look at the file together and agree that it is all good, we usually accept this as a review and put both names as Authors and Reviewers :-)
Nathann
comment:32 Changed 8 years ago by
Oh, didn't know I can do that too. Here we go...
I have tested the code briefly and it looks good to me as far as correction goes. I have two additional remarks though.
- Can we avoid using graphx? Why exactly are we using it? I know its because of performance but what exactly is making the algorithm tick that faster? It doesn’t seem the right thing (to me) to call a graphx object for this purpose but rather see where the problem in Sage is. Its weird that its that faster even when you take the wrapping overhead into account.
- The code in is_triangle_free currently looks as follows
BB = {} for u in self.vertex_iterator(): BB[u] = Bitset(capacity=N) for v in self.vertex_iterator(): if B[u]&B[v]: BB[u].add(map[v]) # search for triangles return not any(B[u]&BB[u] for u in self.vertex_iterator())
Am I missing something or this should be equivalent (to the slightly more optimal):
BB = {} for u in self.vertex_iterator(): BB[u] = Bitset(capacity=N) for v in self.vertex_iterator(): if B[u]&B[v]: BB[u].add(map[v]) if B[u] & BB[u] == 1: return False return True
comment:33 Changed 8 years ago by
- Status changed from needs_review to needs_work
Changed 8 years ago by
comment:34 Changed 8 years ago by
- Status changed from needs_work to needs_review
- networkx uses dictionaries to store edges. If G is a networkx graph, it is also a dictionary indexed by the vertices, and G[u] is a dictionary indexed by neighbors and containing edge data. This way, iterations are fast. I don't know exactly how are implemented sage graphs in the backends, and so I don't know how to speedup the basic iterations over the edges, the neighbors, etc. This is certainly needed.
- I have implemented the proposed modification in
is_triangle_free
.
comment:35 Changed 8 years ago by
Hello!
- I would suggest we make this into a new ticket or something - how to make graph structures more efficient , and leave the code as is for now?
What do you guys think?
comment:36 Changed 8 years ago by
I agree.
comment:37 Changed 8 years ago by
- Status changed from needs_review to positive_review
Okay. Could you (dcoudert) pleaase make another ticket that basically explains what you said under 1? We can then discuss this there.
I'll mark this ticket with a positive review now.
comment:38 Changed 8 years ago by
- Reviewers set to Jernej Azarija
comment:39 Changed 8 years ago by
- Merged in set to sage-5.6.beta1
- Resolution set to fixed
- Status changed from positive_review to closed
The patch needs a fix of the documentation before its ready for review.