Opened 9 years ago

Closed 9 years ago

#13503 closed enhancement (fixed)

Enhancement of `is_triangle_free' addition of `triangles_count' and a minor change in `spanning_trees_count'

Reported by: azi Owned by: jason, ncohen, rlm
Priority: minor Milestone: sage-5.6
Component: graph theory Keywords: triangles, graphs, number of triangles
Cc: rbeezer, ncohen Merged in: sage-5.6.beta1
Authors: Jernej Azarija, David Coudert Reviewers: Jernej Azarija
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by dcoudert)

The main change of this patch is to improve the `is_triangle_free' method. The introduced algorithm uses the fact that given that A is the adjacency matrix of a simple graph G, then the number of triangles in G is tr(A3)/6.

The change has been tested resulting in the following benchmarks:

  1. Testing all graphs of order up to 10 for triangles: (old: 201m44.339s, new: 87m38.146s)
  2. Testing all triangle-free graphs of order 12 for triangles: (old: 15m56.679s, new 9m41.116s)
  3. Testing all graphs of order up to 10 that do contain triangles: (old: 0m10.305s, new : 0m5.798s)

I can provide the test programs if needed.

Since there are many applications in which one needs to count the number of triangles in a graph the named routine was also added.

Finally a minor change has been made to the generic_graph function spanning_trees_count() which calls abs on a determinant that is always positive.

Apply:

Attachments (1)

trac_13503_triangles-v2.patch (7.6 KB) - added by dcoudert 9 years ago.

Download all attachments as: .zip

Change History (40)

comment:1 Changed 9 years ago by kini

  • Authors set to Jernej Azarija

comment:2 Changed 9 years ago by azi

  • Status changed from new to needs_review

comment:3 Changed 9 years ago by rbeezer

  • Cc rbeezer added

comment:4 Changed 9 years ago by azi

  • Status changed from needs_review to needs_work

The patch needs a fix of the documentation before its ready for review.

comment:5 Changed 9 years ago by dcoudert

  • Cc ncohen added
  • Reviewers set to David Coudert

Hello,

I added Nathann in Cc because we had a discussion about counting triangles last week ;)

The proposed method for testing if the graph is triangle free is not a good one. Indeed, the time complexity for testing if the graph is triangle free is at most Nw, where 2 <= w < 3 is the best possible exponent for computing the square of a matrix of side N. You compute the square of the adjacency matrix to get all paths of length 2, and then you test for every edge (u,v) if there is a path of length 2 from u to v. Furthermore, the basic algorithm of testing if some neighbors of a vertex are incident is fast.

Let us compare the following method with your proposition for testing if the graph is triangle free.

def my_is_triangle_free(G):
    for u in G.vertex_iterator():
        if any(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2)):
            return False
    #
    return True

We get the following running times:

sage: G = graphs.GridGraph([5,2]); M = G.adjacency_matrix() # 10 nodes
sage: %timeit G.is_triangle_free()
625 loops, best of 3: 603 µs per loop
sage: %timeit (M^3).trace() > 0
625 loops, best of 3: 101 µs per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 376 µs per loop
sage:
sage: G = graphs.GridGraph([10,5]); M = G.adjacency_matrix() # 50 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 9.49 ms per loop
sage: %timeit (M^3).trace() > 0
125 loops, best of 3: 3.92 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 2.67 ms per loop
sage:
sage: G = graphs.GridGraph([10,10]); M = G.adjacency_matrix() # 100 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 31.5 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.3 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 5.71 ms per loop
sage: 
sage: G = graphs.GridGraph([50,20]); M = G.adjacency_matrix() # 1000 nodes
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 1.1 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 1.04 s per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 63.8 ms per loop
sage:
sage: G = graphs.RandomGNP(10,.1); M = G.adjacency_matrix()  # 10 nodes
sage: %timeit G.is_triangle_free()
625 loops, best of 3: 492 µs per loop
sage: %timeit (M^3).trace() > 0
625 loops, best of 3: 90.5 µs per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 160 µs per loop
sage: 
sage: G = graphs.RandomGNP(50,.1); M = G.adjacency_matrix()  # 50 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 9.92 ms per loop
sage: %timeit (M^3).trace() > 0
125 loops, best of 3: 3.93 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 110 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.2); M = G.adjacency_matrix() # 100 nodes, dense
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 40 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.9 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 125 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.01); M = G.adjacency_matrix()  # 100 nodes, sparse
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 29.7 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.2 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 2.05 ms per loop
sage: 
sage: G = graphs.RandomGNP(1000,.01); M = G.adjacency_matrix() # 1000 nodes
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 33.9 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 33.8 s per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 1.24 ms per loop

For very small graphs it is apparently faster to do (M^3).trace() > 0, but for large graphs, basic iterations are way faster because we stop computations as soon as a triangle is found (if any). Therefore, a better solution could be an hybrid method using matrix multiplication if the graph is very small, and basic iterations otherwise.


For counting triangles, the situation is rather different and the density of the graph matters. Furthermore, it is interesting to use networkx graphs:

def my_triangles_count(G):
    tr = 0
    for u in G.vertex_iterator():
        tr += sum(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2))
    return tr/3

def my_triangles_countnx(G):
    tr = 0
    ggnx = G.networkx_graph()
    for u in ggnx.nodes_iter():
        tr += sum(ggnx.has_edge(v,w) for v,w in combinations_iterator(ggnx.neighbors(u),2))
    return tr/3

We get the following running time:

sage: G = graphs.GridGraph([5,2])
sage: %timeit G.triangles_count()
625 loops, best of 3: 608 µs per loop
sage: %timeit my_triangles_count(G)
625 loops, best of 3: 444 µs per loop
sage: %timeit my_triangles_countnx(G)
625 loops, best of 3: 353 µs per loop
sage:
sage: G = graphs.GridGraph([10,5])
sage: %timeit G.triangles_count()
25 loops, best of 3: 9.44 ms per loop
sage: %timeit my_triangles_count(G)
125 loops, best of 3: 2.98 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 1.99 ms per loop
sage:
sage: G = graphs.RandomGNP(10,.1)
sage: %timeit G.triangles_count()
625 loops, best of 3: 493 µs per loop
sage: %timeit my_triangles_count(G)
625 loops, best of 3: 217 µs per loop
sage: %timeit my_triangles_countnx(G)
625 loops, best of 3: 218 µs per loop
sage:
sage: G = graphs.RandomGNP(50,.1)
sage: %timeit G.triangles_count()
25 loops, best of 3: 9.79 ms per loop
sage: %timeit my_triangles_count(G)
125 loops, best of 3: 5.49 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 2.78 ms per loop
sage:
sage: G = graphs.RandomGNP(50,.5)
sage: %timeit G.triangles_count()
25 loops, best of 3: 15 ms per loop
sage: %timeit my_triangles_count(G)
5 loops, best of 3: 82.4 ms per loop
sage: %timeit my_triangles_countnx(G)
25 loops, best of 3: 23.4 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.5)
sage: %timeit G.triangles_count()
5 loops, best of 3: 58.4 ms per loop
sage: %timeit my_triangles_count(G)
5 loops, best of 3: 716 ms per loop
sage: %timeit my_triangles_countnx(G)
5 loops, best of 3: 181 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.1)
sage: %timeit G.triangles_count()
25 loops, best of 3: 35.4 ms per loop
sage: %timeit my_triangles_count(G)
25 loops, best of 3: 33.6 ms per loop
sage: %timeit my_triangles_countnx(G)
25 loops, best of 3: 12.8 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.05)
sage: %timeit G.triangles_count()
25 loops, best of 3: 32.3 ms per loop
sage: %timeit my_triangles_count(G)
25 loops, best of 3: 10.8 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 5.53 ms per loop

As you can see, when the graph is dense, the matrix multiplication method is the fastest, but for spare graphs it is better to use iterations and to convert the graph into a networkx graph. However, I don't know how to choose the threshold and I don't have better proposals in mind yet.

comment:6 Changed 9 years ago by azi

Hello,

I agree with what you said.

As you mentioned it is hard to deduce a threshold that is going to be good for all platforms (and times) so I suppose we could simply add an optional argument for the user to specify which algorithm to use. We could then also add the even faster triangle detection algorithms based on probability.

comment:7 Changed 9 years ago by dcoudert

Yes, adding an argument to specify the algorithm is the best solution.

comment:8 Changed 9 years ago by azi

Also not to forget that the code for triangle detection would be much much faster if we compute A*A*A in the real field. Could you also check that? To compare with your other results?

comment:9 follow-up: Changed 9 years ago by dcoudert

It is true that A*A*A is a bit faster, but not in the real field where it seems very slow.

comment:10 in reply to: ↑ 9 Changed 9 years ago by rbeezer

Replying to dcoudert:

It is true that A*A*A is a bit faster, but not in the real field where it seems very slow.

RR = RealField(53) is arbitrary precision reals, so perhaps predictably slow, given the generality.

RDF (real double field) is floating point doubles. Whatever the 53-bit precision IEEE format is. Try adjacency matrices over that ring. They might be an improvement.

comment:11 follow-up: Changed 9 years ago by dcoudert

Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.

comment:12 Changed 9 years ago by ncohen

HMmmm... I just read your tests, and did not get this one :

sage: %timeit G.is_triangle_free()
5 loops, best of 3: 1.1 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 1.04 s per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 63.8 ms per loop

Could it be that there would be another difference in runtimes if one requires the adjacency matrix to be *dense* ? :-)

Nathann

comment:13 Changed 9 years ago by dcoudert

If the graph is dense, the first visited vertex will have a triangle with high probability. So after a very small number of iterations and so operations, the algorithm returns False. But with matrix multiplications, you have to pay the full cost whatever the result.

Of course, if the graph is dense and triangle free (not so many such graphs), my algorithm will take long time, and possibly more than matrix multiplications, but in average it is faster. Do your own experiments to convince yourself.

So for testing if G is triangle free, it is good to have a parameter to choose the algorithm, and a default behavior like: if G has less than 15 nodes, then do matrix multiplications, else use other algo.

For counting triangles, the situation is different and we clearly have to take into account the density to choose the best algorithm.

comment:14 Changed 9 years ago by dcoudert

Two interesting (worst case) examples:

sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.adjacency_matrix()
sage: G.density()
100/199
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 59.8 ms per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 5.66 s per loop
sage:
sage: G = graphs.RandomGNP(200,100/199); M = G.adjacency_matrix()
sage: G.density()
10009/19900
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 63.8 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 300 µs per loop

comment:15 in reply to: ↑ 11 Changed 9 years ago by rbeezer

Replying to dcoudert:

Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.

I'd guess that managing (potentially) large integers has more overhead than managing doubles:

sage: G = graphs.CompleteBipartiteGraph(100, 100)
sage: M = G.am()
sage: timeit("M*M*M")
25 loops, best of 3: 31 ms per loop
sage: MR = M.change_ring(RDF)
sage: timeit("MR*MR*MR")     
125 loops, best of 3: 2.4 ms per loop
sage: (MR*MR*MR).trace()
0.0

I can't recall seeing a Sage class for integers of a bounded size.

comment:16 Changed 9 years ago by dcoudert

I'm certainly not used to matrix multiplications and rings in sage, and I'm surprised that computing with doubles is faster than with integer. For instance, if G has 10^6 vertices, then since M is a matrix of 0 and 1's, the largest value in M^3 is in 10^18 < 2^63 and so all computations are doable using 64bits integers.

Following your example, and after some search, I found a much faster solution: use GF(2)

sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.am(); M
200 x 200 dense matrix over Integer Ring (type 'print M.str()' to see all of the entries)
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 39.5 ms per loop
sage: MR = M.change_ring(RDF); MR
200 x 200 dense matrix over Real Double Field (type 'print MR.str()' to see all of the entries)
sage: %timeit (MR*MR*MR).trace() == 0.0
125 loops, best of 3: 1.47 ms per loop
sage: M2 = M.change_ring(GF(2)); M2
200 x 200 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 201 µs per loop

However, this is not always the best solution and it is extremely slow for sparse matrix O_o

sage: G = graphs.RandomGNP(1000,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2
1000 x 1000 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 424 µs per loop
sage: %timeit not (M2*M2*M2).trace()
125 loops, best of 3: 4.09 ms per loop
sage: MR = M.change_ring(RDF)
sage: %timeit (MR*MR*MR).trace() == 0.0
5 loops, best of 3: 96.9 ms per loop
sage:
sage: G = graphs.RandomGNP(1000,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2
1000 x 1000 sparse matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 594 µs per loop
sage: %timeit not (M2*M2*M2).trace()
5 loops, best of 3: 3.41 s per loop
sage: MR = M.change_ring(RDF)
5 loops, best of 3: 8.08 s per loop
sage:
sage: G = graphs.RandomGNP(100,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2
100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 1.22 ms per loop
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 95.7 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2
100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 71.4 µs per loop
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 95.5 µs per loop

I don't know how to force a matrix to be dense. for "small" graphs, it is apparently always dense.

Clearly, the algorithm should use tradeoffs between size and density.

comment:17 Changed 9 years ago by jhpalmieri

I don't know how to force a matrix to be dense.

mat.dense_matrix() should do it.

comment:18 follow-up: Changed 9 years ago by azi

@dcoudert

While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).

comment:19 in reply to: ↑ 18 Changed 9 years ago by dcoudert

While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).

That's a pity :(

Any interesting alternatives? At least you can force matrices to be dense, but for small graphs (10 nodes), the extra cost might be non-negligeable.

comment:20 Changed 9 years ago by rbeezer

"dense" in Sage is an implementation detail. Any matrix can be made dense or sparse, even if it is not a good idea for the algorithm at hand.

In the discussion here, the actual "density" of the adjacency matrix is relevant for which approach (matrix algebra, vertex neighborhoods) might be faster.

comment:21 follow-up: Changed 9 years ago by dcoudert

Right.

A reasonably fast alternative to matrix multiplication (could be improved):

def my_is_triangle_free_bitset(G):
    map = {}
    i = 0
    for u in G.vertex_iterator():
        map[u] = i
        i += 1
    B = {}
    for u in G.vertex_iterator():
        B[u] = Bitset([map[i] for i in G.neighbor_iterator(u)])
    BB = {}
    for u in G.vertex_iterator():
        BB[u] = Bitset()
        for v in G.vertex_iterator():
            if B[u]&B[v]:
                BB[u].add(map[v])
    return not any(B[u]&BB[u] for u in G.vertex_iterator())

def toto(G):
    M = G.am()
    return not (M*M*M).trace()
sage: G = graphs.CompleteBipartiteGraph(3,3)
sage: %timeit toto(G)                
625 loops, best of 3: 256 µs per loop
sage: %timeit my_is_triangle_free_bitset(G)
625 loops, best of 3: 132 µs per loop
sage:
sage: G = graphs.CompleteBipartiteGraph(30,30)
sage: %timeit toto(G)
25 loops, best of 3: 15 ms per loop
sage: %timeit my_is_triangle_free_bitset(G)
125 loops, best of 3: 6.99 ms per loop

comment:22 in reply to: ↑ 21 Changed 9 years ago by rbeezer

Replying to dcoudert:

A reasonably fast alternative to matrix multiplication (could be improved):

I like the looks of my_is_triangle_free_bitset.

It is too bad more of these bit-by-bit type operations are not more easily available for graphs. I know the folks working on the matroid classes are implementing a lot of duplicate functionality to be faster in tese sorts of situations (like matrix operations).

Rob

comment:23 Changed 9 years ago by dcoudert

Well, the difficulty with graph algorithms is that the efficiency depends on the data structure, and we have as many possibilities as algorithms. bit-by-bit operations are suitable for some kind of operations, but other representations are also helpful. Sometimes it is faster to turn the graph into a networkx graph (e.g. when adding/removing edges to test some property), but sometimes the extra cost of changing the graph structure is non negligible. I don't have magic solution but it's true that we could gather some sets of operations into dedicated graph structures as for instance the FastGraph? class Nathann has used for pathwidth.

This version is a bit faster

def my_is_triangle_free_bitset(G):
    N = G.num_verts()
    map = {}
    i = 0
    B = {}
    for u in G.vertex_iterator():
        map[u] = i
        i += 1
        B[u] = Bitset(capacity=N)
    # map adjacency to bitsets
    for u,v in G.edge_iterator(labels=None):
        B[u].add(map[v])
        B[v].add(map[u])
    # map lengths 2 paths to bitsets
    BB = {}
    for u in G.vertex_iterator():
        BB[u] = Bitset(capacity=N)
        for v in G.vertex_iterator():
            if B[u]&B[v]:
                BB[u].add(map[v])
    # search for triangles
    return not any(B[u]&BB[u] for u in G.vertex_iterator())

comment:24 Changed 9 years ago by dcoudert

  • Authors changed from Jernej Azarija to Jernej Azarija, David Coudert
  • Description modified (diff)
  • Reviewers David Coudert deleted
  • Status changed from needs_work to needs_review

I have included discussed improvements into the is_triangle_free and the triangles_count methods, and added optional argument for algorithm selection. One could also add probabilistic test for triangle detection.

Could you benchmark the new functions to update the ticket description?

comment:25 follow-up: Changed 9 years ago by ncohen

  • Status changed from needs_review to needs_work

Hellooooooooo David !!

First, you do not know how glad I am to hear things like "the difficulty with graph algorithms is that the efficiency depends on the data structure" from the mouth of researchers. "Adjacency test ? O(1) !" is their usual answer. :-D

I have been thinking of reimplementing another version using FastGraph?, but that would be more trouble than necessary for the moment. And I could actually use the same technique to reimplement SubgraphSearch? a bit better anyway, so I guess I will go there directly.

About the patch :

  • Where did you find that this "abs" was not needed ? I thought it was O_o
  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.
  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

Thanks for that patch !!

nathann

comment:26 in reply to: ↑ 25 ; follow-up: Changed 9 years ago by dcoudert

  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.

Jernej should answer that remark.

  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

OK.

I have implemented some of the modifications.

Thanks for that patch !!

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

comment:27 in reply to: ↑ 26 ; follow-up: Changed 9 years ago by azi

Replying to dcoudert:

  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals

det(L')*(-1)(i+j}+)

where L' is the matrix that is obtained from the Laplacian after removing row i and column j.

  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs?... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.

Jernej should answer that remark.

I am not really sure what to do here.

  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

OK.

I have implemented some of the modifications.

Thanks for that patch !!

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?

comment:28 in reply to: ↑ 27 Changed 9 years ago by dcoudert

Replying to azi:

Replying to dcoudert:

  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals

det(L')*(-1)(i+j}+)

where L' is the matrix that is obtained from the Laplacian after removing row i and column j.

yes, it is better to remove the abs in a patch dedicated to small details ;-)

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?

It's difficult in python. The method is fast because it uses optimized data structure. It can be done in cython, but then it is certainly better to only implement the interfaces with the original c code of Latapy. We can easily obtained agreement from him for integrating is code in sage. This should be done in a dedicated patch to ease reviewing process.

comment:29 Changed 9 years ago by dcoudert

  • Status changed from needs_work to needs_review

comment:30 follow-up: Changed 9 years ago by azi

Can someone review this patch?

comment:31 in reply to: ↑ 30 Changed 9 years ago by ncohen

Can someone review this patch?

Well. Why wouldn't you ? ^^;

It looks like David updated the patch last, so you can review his changes.. I mean, for as long as two persons look at the file together and agree that it is all good, we usually accept this as a review and put both names as Authors and Reviewers :-)

Nathann

comment:32 Changed 9 years ago by azi

Oh, didn't know I can do that too. Here we go...

I have tested the code briefly and it looks good to me as far as correction goes. I have two additional remarks though.

  1. Can we avoid using graphx? Why exactly are we using it? I know its because of performance but what exactly is making the algorithm tick that faster? It doesn’t seem the right thing (to me) to call a graphx object for this purpose but rather see where the problem in Sage is. Its weird that its that faster even when you take the wrapping overhead into account.
  1. The code in is_triangle_free currently looks as follows
            BB = {} 
            for u in self.vertex_iterator():
                BB[u] = Bitset(capacity=N)
                for v in self.vertex_iterator():
                    if B[u]&B[v]:
                        BB[u].add(map[v])

            # search for triangles
            return not any(B[u]&BB[u] for u in self.vertex_iterator())

Am I missing something or this should be equivalent (to the slightly more optimal):

            BB = {} 
            for u in self.vertex_iterator():
                BB[u] = Bitset(capacity=N)
                for v in self.vertex_iterator():
                    if B[u]&B[v]:
                        BB[u].add(map[v])
                if B[u] & BB[u] == 1:
                   return False

            return True

comment:33 Changed 9 years ago by azi

  • Status changed from needs_review to needs_work

Changed 9 years ago by dcoudert

comment:34 Changed 9 years ago by dcoudert

  • Status changed from needs_work to needs_review
  1. networkx uses dictionaries to store edges. If G is a networkx graph, it is also a dictionary indexed by the vertices, and G[u] is a dictionary indexed by neighbors and containing edge data. This way, iterations are fast. I don't know exactly how are implemented sage graphs in the backends, and so I don't know how to speedup the basic iterations over the edges, the neighbors, etc. This is certainly needed.
  1. I have implemented the proposed modification in is_triangle_free.

comment:35 Changed 9 years ago by azi

Hello!

  1. I would suggest we make this into a new ticket or something - how to make graph structures more efficient , and leave the code as is for now?

What do you guys think?

comment:36 Changed 9 years ago by dcoudert

I agree.

comment:37 Changed 9 years ago by azi

  • Status changed from needs_review to positive_review

Okay. Could you (dcoudert) pleaase make another ticket that basically explains what you said under 1? We can then discuss this there.

I'll mark this ticket with a positive review now.

Last edited 9 years ago by azi (previous) (diff)

comment:38 Changed 9 years ago by jdemeyer

  • Reviewers set to Jernej Azarija

comment:39 Changed 9 years ago by jdemeyer

  • Merged in set to sage-5.6.beta1
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.