Opened 5 years ago
Closed 5 years ago
#18834 closed enhancement (fixed)
Use Sage to compute clustering coefficient
Reported by:  ncohen  Owned by:  

Priority:  major  Milestone:  sage6.8 
Component:  graph theory  Keywords:  
Cc:  borassi, dcoudert  Merged in:  
Authors:  Nathann Cohen  Reviewers:  David Coudert 
Report Upstream:  N/A  Work issues:  
Branch:  8c25063 (Commits)  Commit:  8c25063de223a611c202d937d828c78cf0db8bce 
Dependencies:  #18811  Stopgaps: 
Description (last modified by )
Will have to be rebased+adapted over #18811.
sage: g = graphs.CompleteGraph(500) sage: %time _=g.clustering_coeff(implementation="boost") CPU times: user 7.62 s, sys: 8 ms, total: 7.62 s Wall time: 7.63 s sage: %time _=g.clustering_coeff() CPU times: user 340 ms, sys: 0 ns, total: 340 ms Wall time: 328 ms sage: %time _=g.clustering_coeff(implementation="dense_copy") CPU times: user 56 ms, sys: 4 ms, total: 60 ms Wall time: 61.3 ms
Change History (21)
comment:1 Changed 5 years ago by
 Branch set to u/ncohen/18834
comment:2 Changed 5 years ago by
 Commit set to 5a40d920d6a8c6585424329958b8b54a30c76275
comment:3 Changed 5 years ago by
 Status changed from new to needs_review
comment:4 Changed 5 years ago by
 Commit changed from 5a40d920d6a8c6585424329958b8b54a30c76275 to dba89c2c25a22a1dd66e53abde9ae8e619723556
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
283c61b  First draft

18bd01a  Applied Nathann's suggestions

99bd994  Few corrections in the reference manual

40df844  trac #18811: Review

6c8d49b  More reviewer comments

dba89c2  trac #18834: Use Sage to compute clustering coefficient

comment:5 Changed 5 years ago by
 Description modified (diff)
comment:6 Changed 5 years ago by
 Commit changed from dba89c2c25a22a1dd66e53abde9ae8e619723556 to e25d6a4407c2c8cb5092fceed80ab86146134327
comment:7 Changed 5 years ago by
rebased on top of #18811.
comment:8 Changed 5 years ago by
For me the patch is good to go (and the boost method is pretty fast in fact). However, the patch could be rebased on beta7 to solve the patchbot issue.
comment:9 Changed 5 years ago by
 Commit changed from e25d6a4407c2c8cb5092fceed80ab86146134327 to 6274bff716b13f4ef7b3e1745d8dcbf761336826
Branch pushed to git repo; I updated commit sha1. New commits:
6274bff  trac #18834: Merged with 6.8.beta7

comment:10 Changed 5 years ago by
Done! But what do you mean by 'boost is pretty fast'?? Sage is 100x faster O_o
comment:11 followup: ↓ 13 Changed 5 years ago by
On a clique that's for sure, but on other kinds of graphs...
sage: g = graphs.RandomBarabasiAlbert(1000,2) sage: %timeit _=g.clustering_coeff(implementation="boost") 1000 loops, best of 3: 1.9 ms per loop sage: %timeit _=g.clustering_coeff(implementation="dense_copy") 100 loops, best of 3: 7.34 ms per loop sage: %timeit _=g.clustering_coeff(implementation="sparse_copy") 100 loops, best of 3: 13.9 ms per loop sage: %timeit _=g.clustering_coeff(implementation="networkx") 100 loops, best of 3: 19.3 ms per loop sage: g = graphs.RandomBarabasiAlbert(10000,2) sage: %timeit _=g.clustering_coeff(implementation="boost") 10 loops, best of 3: 38 ms per loop sage: %timeit _=g.clustering_coeff(implementation="dense_copy") 10 loops, best of 3: 143 ms per loop sage: %timeit _=g.clustering_coeff(implementation="sparse_copy") 10 loops, best of 3: 148 ms per loop sage: %timeit _=g.clustering_coeff(implementation="networkx") 1 loops, best of 3: 248 ms per loop
comment:12 Changed 5 years ago by
Also, I have an issue:
sage: g = graphs.RandomGNP(1000,.1) sage: N = 1000 sage: g = graphs.RandomGNP(N, log(N)/N) sage: g.order(), g.size() (1000, 3567) sage: %timeit _=g.clustering_coeff(implementation="boost") 100 loops, best of 3: 2.78 ms per loop sage: %timeit _=g.clustering_coeff(implementation="dense_copy")  ZeroDivisionError Traceback (most recent call last) <ipythoninput31cd519bf674aa> in <module>() > 1 get_ipython().magic(u'timeit _=g.clustering_coeff(implementation="dense_copy")') /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.pyc in magic(self, arg_s) 2305 magic_name, _, magic_arg_s = arg_s.partition(' ') 2306 magic_name = magic_name.lstrip(prefilter.ESC_MAGIC) > 2307 return self.run_line_magic(magic_name, magic_arg_s) 2308 2309 # /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) 2226 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals 2227 with self.builtin_trap: > 2228 result = fn(*args,**kwargs) 2229 return result 2230 /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, line, cell) /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magic.pyc in <lambda>(f, *a, **k) 191 # but it's overkill for just that one bit of state. 192 def magic_deco(arg): > 193 call = lambda f, *a, **k: f(*a, **k) 194 195 if callable(arg): /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 1034 number = 1 1035 for _ in range(1, 10): > 1036 time_number = timer.timeit(number) 1037 worst_tuning = max(worst_tuning, time_number / number) 1038 if time_number >= 0.2: /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, number) 130 gc.disable() 131 try: > 132 timing = self.inner(it, self.timer) 133 finally: 134 if gcold: <magictimeit> in inner(_it, _timer) /Users/dcoudert/sage/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.pyc in clustering_coeff(self, nodes, weight, implementation, return_vertex_weights) 12437 from sage.rings.integer import Integer 12438 return {v:Integer(count)/((self.degree(v)*(self.degree(v)1))/2) > 12439 for v,count in triangles_count(self).iteritems()} 12440 12441 def cluster_transitivity(self): /Users/dcoudert/sage/local/lib/python2.7/sitepackages/sage/graphs/generic_graph.pyc in <dictcomp>((v, count)) 12437 from sage.rings.integer import Integer 12438 return {v:Integer(count)/((self.degree(v)*(self.degree(v)1))/2) > 12439 for v,count in triangles_count(self).iteritems()} 12440 12441 def cluster_transitivity(self): /Users/dcoudert/sage/src/sage/structure/element.pyx in sage.structure.element.RingElement.__div__ (/Users/dcoudert/sage/src/build/cythonized/sage/structure/element.c:17310)() 1991 if have_same_parent_c(self, right): 1992 return (<RingElement>self)._div_(<RingElement>right) > 1993 return coercion_model.bin_op(self, right, div) 1994 1995 cpdef RingElement _div_(self, RingElement right): /Users/dcoudert/sage/src/sage/structure/coerce.pyx in sage.structure.coerce.CoercionModel_cache_maps.bin_op (/Users/dcoudert/sage/src/build/cythonized/sage/structure/coerce.c:8195)() 994 try: 995 xy = self.canonical_coercion(x,y) > 996 return PyObject_CallObject(op, xy) 997 except TypeError as err: 998 if xy is not None: /Users/dcoudert/sage/src/sage/structure/element.pyx in sage.structure.element.RingElement.__div__ (/Users/dcoudert/sage/src/build/cythonized/sage/structure/element.c:17293)() 1990 """ 1991 if have_same_parent_c(self, right): > 1992 return (<RingElement>self)._div_(<RingElement>right) 1993 return coercion_model.bin_op(self, right, div) 1994 /Users/dcoudert/sage/src/sage/rings/integer.pyx in sage.rings.integer.Integer._div_ (/Users/dcoudert/sage/src/build/cythonized/sage/rings/integer.c:11621)() 1728 # This is vastly faster than doing it here, since here 1729 # we can't cimport rationals. > 1730 return the_integer_ring._div(self, right) 1731 1732 def __floordiv__(x, y): /Users/dcoudert/sage/src/sage/rings/integer_ring.pyx in sage.rings.integer_ring.IntegerRing_class._div (/Users/dcoudert/sage/src/build/cythonized/sage/rings/integer_ring.c:4894)() 420 cdef rational.Rational x = rational.Rational.__new__(rational.Rational) 421 if mpz_sgn(right.value) == 0: > 422 raise ZeroDivisionError('Rational division by zero') 423 mpz_set(mpq_numref(x.value), left.value) 424 mpz_set(mpq_denref(x.value), right.value) ZeroDivisionError: Rational division by zero
I'll send you the graph by mail if you want to investigate.
comment:13 in reply to: ↑ 11 Changed 5 years ago by
On a clique that's for sure, but on other kinds of graphs...
Oh.. Wow..
HMmmm.. It seems that those graphs are stuffed with degree2 vertices. I wonder how they do that O_o
Nathann
comment:14 Changed 5 years ago by
I assume they iterate over the edges, which is what I would do on a sparse graph.
comment:15 Changed 5 years ago by
They probably do a dichotomic search where I do a linear search. I'll try to see tomorrow evening (I will sleep in an airport) if I can do something about that.
comment:16 Changed 5 years ago by
 Commit changed from 6274bff716b13f4ef7b3e1745d8dcbf761336826 to a5b7092b804e504c0dd5487bc368cf9d54082e4e
Branch pushed to git repo; I updated commit sha1. New commits:
a5b7092  trac #18834: Division by zero

comment:17 Changed 5 years ago by
 Commit changed from a5b7092b804e504c0dd5487bc368cf9d54082e4e to 8c25063de223a611c202d937d828c78cf0db8bce
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
8c25063  trac #18834: Division by zero

comment:18 Changed 5 years ago by
A more correct fix.
About the difference in timings:
sage: g=graphs.RandomBarabasiAlbert(10000,2) sage: %timeit _ = g.clustering_coeff() 10 loops, best of 3: 96.8 ms per loop sage: from sage.graphs.base.static_sparse_graph import triangles_count sage: %timeit _ = triangles_count(g) 10 loops, best of 3: 70 ms per loop
It seems that a nontrivial part of the difference in timing is taken by the construction of the.... final dictionary. Profiling the method does not show a major cost of the actual algorithm.. So well, it seems that this graph is so easy to deal with that returning the result is not negligible :P
Nathann
comment:19 Changed 5 years ago by
 Reviewers set to David Coudert
 Status changed from needs_review to positive_review
For me the patch is now good to go!
comment:20 Changed 5 years ago by
Thanks !
Nathann
comment:21 Changed 5 years ago by
 Branch changed from u/ncohen/18834 to 8c25063de223a611c202d937d828c78cf0db8bce
 Resolution set to fixed
 Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. New commits:
trac #18834: Use Sage to compute clustering coefficient