Opened 6 years ago
Closed 6 years ago
#18564 closed enhancement (fixed)
Boost Edge Connectivity
Reported by:  borassi  Owned by:  borassi 

Priority:  major  Milestone:  sage6.8 
Component:  graph theory  Keywords:  Boost, connectivity 
Cc:  ncohen, dcoudert, jdemeyer, jpflori, slelievre, Snark, vbraun, wstein  Merged in:  
Authors:  Michele Borassi  Reviewers:  Nathann Cohen 
Report Upstream:  N/A  Work issues:  
Branch:  fe55f76 (Commits, GitHub, GitLab)  Commit:  fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f 
Dependencies:  Stopgaps: 
Description (last modified by )
This is a first step in including Boost Graph Library (http://trac.sagemath.org/ticket/17966). I would like to implement the edge connectivity algorithm by converting a given Sage graph into a Boost graph, apply the Boost edge connectivity algorithm, and convert the result.
Change History (55)
comment:1 Changed 6 years ago by
 Cc ncohen dcoudert jdemeyer jpflori slelievre Snark vbraun wstein added
 Component changed from PLEASE CHANGE to graph theory
 Description modified (diff)
 Keywords Boost connectivity added
 Owner changed from (none) to borassi
 Type changed from PLEASE CHANGE to enhancement
comment:2 Changed 6 years ago by
 Branch set to u/borassi/boost_edge_connectivity
comment:3 Changed 6 years ago by
 Commit set to fe40c540c67848ae4cb694921501185b0cd4f045
comment:4 Changed 6 years ago by
This version lets us use Boost graph library to compute edge connectivity. Probably it is not the cleanest way to do it, but at least it works. I am looking for suggestions on how to improve the code, in order to make it more readable:
 is it better to use only Boost in C++ and Cython for the interface, or write the code in C++ and use Cython to convert the results? In the first case, we might have problems with generic types.
 should we only keep translation capabilities, or should we also add a new backend? In other words, do we need Boost for lineartime or O(n log n) time algorithms?
Some benchmark:
sage: G = graphs.RandomGNM(100,1000) sage: %timeit G.edge_connectivity() 1 loops, best of 3: 10.7 s per loop sage: %timeit G.to_boost_graph().edge_connectivity() 100 loops, best of 3: 2.04 ms per loop
Thank you very much!
comment:5 Changed 6 years ago by
 Commit changed from fe40c540c67848ae4cb694921501185b0cd4f045 to 9a0e86828eadda31e99ee203a693bb65afec4762
Branch pushed to git repo; I updated commit sha1. New commits:
9a0e868  Improved code for conversion

comment:6 followup: ↓ 8 Changed 6 years ago by
Something is missing in BoostGraph
sage: H = G.to_boost_graph() sage: H <sage.graphs.base.boost_graph.BoostGraph object at 0x10f1a4ef0> sage: H. H.add_edge H.add_vertex H.num_edges H.num_vertices sage: H.num_edges <builtin method num_edges of sage.graphs.base.boost_graph.BoostGraph object at 0x10f1a4ef0> sage: H.num_edges() 1000 sage: %timeit G.to_boost_graph().edge_connectivity()  AttributeError Traceback (most recent call last) <ipythoninput786519c65f6dd> in <module>() > 1 get_ipython().magic(u'timeit G.to_boost_graph().edge_connectivity()') /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.pyc in magic(self, arg_s) 2305 magic_name, _, magic_arg_s = arg_s.partition(' ') 2306 magic_name = magic_name.lstrip(prefilter.ESC_MAGIC) > 2307 return self.run_line_magic(magic_name, magic_arg_s) 2308 2309 # /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line) 2226 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals 2227 with self.builtin_trap: > 2228 result = fn(*args,**kwargs) 2229 return result 2230 /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, line, cell) /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magic.pyc in <lambda>(f, *a, **k) 191 # but it's overkill for just that one bit of state. 192 def magic_deco(arg): > 193 call = lambda f, *a, **k: f(*a, **k) 194 195 if callable(arg): /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, line, cell) 1034 number = 1 1035 for _ in range(1, 10): > 1036 time_number = timer.timeit(number) 1037 worst_tuning = max(worst_tuning, time_number / number) 1038 if time_number >= 0.2: /Users/dcoudert/sage/local/lib/python2.7/sitepackages/IPython/core/magics/execution.pyc in timeit(self, number) 130 gc.disable() 131 try: > 132 timing = self.inner(it, self.timer) 133 finally: 134 if gcold: <magictimeit> in inner(_it, _timer) AttributeError: 'sage.graphs.base.boost_graph.BoostGraph' object has no attribute 'edge_connectivity' sage:
comment:7 Changed 6 years ago by
 Commit changed from 9a0e86828eadda31e99ee203a693bb65afec4762 to b0ca92c0674f7fe4df454566208e3f69d5faf982
Branch pushed to git repo; I updated commit sha1. New commits:
b0ca92c  Added sig_on, sig_off, and corrected edge_connectivity

comment:8 in reply to: ↑ 6 Changed 6 years ago by
Now it should work!
comment:9 Changed 6 years ago by
Hellooooo Michele,
Great work, thank you very much! If we can speedup several Sage functions the
way you did for edge connectivity, the future looks bright :P
I have been trying to avoid the creation of these two .cpp/.hpp files, but no good new for the moment. I sent a question on cythonusers [1], let's pray for their wisdom. If there is no better way to do it than write cpp code, then perhaps it will be better to 'patch' boost when it is installed, and only add the couple of lines we need? We'll see this later, after they have had the time to answer.
About your code:
 You add 'copyright' headers with a name which isn't yours: that's neither fair
to Robert Miller nor to yourself! If you want your files to have a copyright
header, please make it contain your name. Oh, and please steal a more recent
header, for this one is a bit old. I think this one is the most recent
#***************************************************************************** # Copyright (C) 2015 Michele Borassi <email_adress> # # Distributed under the terms of the GNU General Public License (GPL) # as published by the Free Software Foundation; either version 2 of # the License, or (at your option) any later version. # http://www.gnu.org/licenses/ #*****************************************************************************
 I do not think that having a
Graph.to_boost_graph
makes much sense: at userlevel, having a 'boost' graph does not have much interest as it only has very few commands to add edges/vertices, and youredge_connectivity
function. It would make more sense to keep this function inside ofboost_graph.pyx
, for internal use.
 You should modify
GenericGraph.edge_connectivity
to make it able to call your function. Beware: this function applies to both directed and undirected graphs, while apparently your function only handles undirected graphs. The ideal, of course, would be to be able to call boost in both situations.
Many graph function make several implementations available, and for instance
GenericGraph.automorphism_group
.
 Every function that you add must be documented. Besides, it would be cool to add your new file to the doc index [2].
 Every function must contain doctests.
(tmp✚3)~/sage/graphs/base$ sage coverage boost_graph.pyx  SCORE boost_graph.pyx: 0.0% (0 of 6) Missing documentation: * line 57: def __cinit__(self) * line 62: def add_vertex(self) * line 65: def add_edge(self, u, v) * line 68: def num_vertices(self) * line 71: def num_edges(self) * line 74: def edge_connectivity(self) 
Thank you very much for this improvement. That's no small speedup, and an important function.
Nathann
PS: Set the ticket's status to 'needs_review' when you want somebody to look at it.
[1] https://groups.google.com/d/topic/cythonusers/ICRgCWVZ6RQ/discussion [2] http://doc.sagemath.org/html/en/developer/sage_manuals.html#addinganewfile
comment:10 Changed 6 years ago by
Hello Michele,
You will find at public/18564 a commit that avoids the creation of .hpp and .cpp files.
Nathann
comment:11 Changed 6 years ago by
Hello Nathann,
your branch is working well and this is easier than the first solution proposed by Michele.
sage: G = graphs.RandomBarabasiAlbert(1000,2) sage: %timeit G.edge_connectivity() 1 loops, best of 3: 1.02 s per loop sage: %timeit G.to_boost_graph().edge_connectivity() 10 loops, best of 3: 46 ms per loop
I’m really impressed by what you did (both of you).
comment:12 Changed 6 years ago by
 Commit changed from b0ca92c0674f7fe4df454566208e3f69d5faf982 to 7d272ab6a82837daa1f296206c0ac63bb6203108
comment:13 Changed 6 years ago by
I have merged Nathann's work. I will continue from here!
comment:14 Changed 6 years ago by
 Commit changed from 7d272ab6a82837daa1f296206c0ac63bb6203108 to 911c3eafa2f714d32a8d4b2fa6a02793e94fe7ad
comment:15 followup: ↓ 16 Changed 6 years ago by
Just a tip: if you run 'sage cython a your_file.pyx', you will get a .html file which indicates how much C code it takes to repleace each of your python lines (click on the yellow lines you see).
With this, you can easily notice loops which involve Python though they should not. It is often because of variables of unspecified type.
If you do the following replacement
 N = G_sage.num_verts() + cdef int N = G_sage.num_verts()
you should see something happen in the loops involving N
.
Nathann
comment:16 in reply to: ↑ 15 Changed 6 years ago by
Error compiling Cython file:  ... return G def __cinit__(self): sig_on() self.graph = new UndVecGraph() ^  src/sage/graphs/base/boost_graph.pyx:80:25: Operation only allowed in c++
How can I tell it that the language is C++?
comment:17 Changed 6 years ago by
On my machine your code compiles.
comment:18 followup: ↓ 19 Changed 6 years ago by
The code also compiles on my computer
comment:19 in reply to: ↑ 18 ; followup: ↓ 20 Changed 6 years ago by
Sorry I forgot to say that this was the output to the command
./sage cython a src/sage/graphs/base/boost_graph.pyx
suggested by Nathann in comment 15
Replying to dcoudert:
The code also compiles on my computer
comment:20 in reply to: ↑ 19 ; followup: ↓ 21 Changed 6 years ago by
I forgot to say that this was the output to the command
./sage cython a src/sage/graphs/base/boost_graph.pyx
Oh, right! Can you try again after adding '#clang C++' in the first line of the file?
Nathann
comment:21 in reply to: ↑ 20 ; followup: ↓ 22 Changed 6 years ago by
No way... I have tried the following:
'#clang C++' #clang C++ '#distutils: language = c++' #distutils: language = c++
The command
./sage cython cplus a src/sage/graphs/base/boost_graph.pyx
seems to work, but I cannot find the html :(
comment:22 in reply to: ↑ 21 Changed 6 years ago by
seems to work, but I cannot find the html :(
Isn't it right next to boost_graph.pyx
? Type "git status" in Sage's folder, it should tell you something.
Nathann
comment:23 Changed 6 years ago by
 Commit changed from 911c3eafa2f714d32a8d4b2fa6a02793e94fe7ad to 75cdbb6786b2cdce82715a56bcbb6924de317cf4
comment:24 Changed 6 years ago by
 Status changed from new to needs_review
This code should be a first draft of the final code. I have cleaned the Boost part and I have inserted the Boost edge connectivity. I have also cleaned the old edge connectivity, in order to integrate the new part. The following tests show that the cleaning did not affect speed, and that the new algorithm is much faster. Still, we might improve two points:
 implement weighted graphs;
 there is a (small) bottleneck when
vertices = True
and Boost is used.
TESTS:
sage: G = graphs.CompleteGraph(40) sage: %timeit G.edge_connectivity(boost = False) 1 loops, best of 3: 8.84 s per loop (OLD: 1 loops, best of 3: 8.78 s per loop) sage: %timeit G.edge_connectivity(boost = False, vertices = True) 1 loops, best of 3: 8.82 s per loop (OLD: 1 loops, best of 3: 8.8 s per loop) sage: %timeit G.edge_connectivity() 1000 loops, best of 3: 445 µs per loop sage: %timeit G.edge_connectivity(vertices = True) 100 loops, best of 3: 1.57 ms per loop sage: G = graphs.PathGraph(100) sage: %timeit G.edge_connectivity(boost = False) 100 loops, best of 3: 2.2 ms per loop (OLD: 100 loops, best of 3: 2.26 ms per loop) sage: %timeit G.edge_connectivity(boost = False, value_only = False) 10 loops, best of 3: 59.2 ms per loop (OLD: 10 loops, best of 3: 58.9 ms per loop) sage: %timeit G.edge_connectivity(boost = False, vertices = True) 10 loops, best of 3: 59.8 ms per loop (OLD: 10 loops, best of 3: 58.2 ms per loop) sage: %timeit G.edge_connectivity() 1000 loops, best of 3: 226 µs per loop sage: %timeit G.edge_connectivity(vertices = True) 1000 loops, best of 3: 524 µs per loop
PROBLEMS
Unfortunately, the Boost edge connectivity has a bug when dealing with directed graphs, evidenced by the following C++ code. What can we do? Should we open a ticket in Boost graph library?
#include <iostream> #include <boost/graph/adjacency_list.hpp> #include <boost/graph/edge_connectivity.hpp> using namespace boost; typedef adjacency_list<vecS,vecS,bidirectionalS> MyGraph; int main() { typedef graph_traits < MyGraph >::vertex_descriptor vertex_descriptor; typedef graph_traits < MyGraph >::edge_descriptor edge_descriptor; MyGraph g; vertex_descriptor v1 = add_vertex(g); vertex_descriptor v2 = add_vertex(g); vertex_descriptor v3 = add_vertex(g); add_edge(v1, v2, g); add_edge(v2, v3, g); std::cout << "Number of vertices: " << num_vertices(g) << ".\nNumber of edges: " << num_edges(g) << ".\n"; std::vector < edge_descriptor > disconnecting_set; std::cout << "The edge connectivity is " << edge_connectivity(g, std::back_inserter(disconnecting_set)) << "." << std::endl; add_edge(v2, v1, g); std::cout << "Number of vertices: " << num_vertices(g) << ".\nNumber of edges: " << num_edges(g) << ".\n"; std::cout << "Now, the edge connectivity is " << edge_connectivity(g, std::back_inserter(disconnecting_set)) << "." << std::endl; return EXIT_SUCCESS; }
Output:
Number of vertices: 3. Number of edges: 2. The edge connectivity is 1. Number of vertices: 3. Number of edges: 3. Now, the edge connectivity is 0.
comment:25 followup: ↓ 30 Changed 6 years ago by
 Status changed from needs_review to needs_work
Hello Michele,
Your patch is rather big, and I think that it can be simplified:
 There is a weird 'pass' in boost_graph.pxd
#clang c++
is not useful in Sage's source
 You should add your new file to the doc index (http://doc.sagemath.org/html/en/developer/sage_manuals.html#addinganewfile)
Also, the header of your new module should describe what it implements
 This will make you notice (Sphinx will scream) that whatever follows a '::' should be indented.
 A module's documentation is no place for 'todo'. If you want to work on something, create a trac ticket.
 Why a static 'from_sage_graph' function instead of a constructor accepting a graph as an input?
sig_on/sig_off
 these markers are used so that the code can be interrupted while some C code is being run. Why would you put them around functions which end instantaneously, likeadd_edge/add_vertex
?
 The variable 'i' is also an integer, and you should define it as 'cdef int' so that your Cython code will be better optimised.
 Why would you define
edge_connectivity_boost_*
functions ingeneric_graph_pyx
instead of doing it in thegraph_boost
file? You can import them if needed.
 Why did you move the LP out of the current
edge_connectivity
function? Can't you leave if where it is, and at the beginning of the file call the boost functions if this is what the user wants?
 When you submit a branch, always look at the final 'diff'. You can see it by clicking on the (green) branch's name on its ticket, and it will tell you what the reviewer sees and how he will have to figure out what your code does. Everything that is not crystal clear for you is not crystal clear for him.
Good evening,
Nathann
comment:26 Changed 6 years ago by
Finally: please report the bug you found to boost. They must fix it.
comment:27 Changed 6 years ago by
 Commit changed from 75cdbb6786b2cdce82715a56bcbb6924de317cf4 to e0d81e4b8f85b0bf06e9af7c081a994971ea1a25
Branch pushed to git repo; I updated commit sha1. New commits:
e0d81e4  Cleaned the code according to Nathann's suggestions (except last point)

comment:28 Changed 6 years ago by
 Commit changed from e0d81e4b8f85b0bf06e9af7c081a994971ea1a25 to 7b62350f7dd976a7e635a455378bfa4189083cf4
Branch pushed to git repo; I updated commit sha1. New commits:
7b62350  Corrected all problems oulined by Nathann.

comment:29 Changed 6 years ago by
 Commit changed from 7b62350f7dd976a7e635a455378bfa4189083cf4 to b513cf9f6a52e1dae5805cb622da9481069defc3
Branch pushed to git repo; I updated commit sha1. New commits:
b513cf9  Removed unused import in generic_graph_pyx

comment:30 in reply to: ↑ 25 Changed 6 years ago by
 Status changed from needs_work to needs_review
Hello,
I have tried to correct your problems. More details below.
Best,
Michele
 There is a weird 'pass' in boost_graph.pxd
Removed!
#clang c++
is not useful in Sage's source
Then I am afraid I have not understood your comment 20... Where should I have put it? In any case, I removed it.
 You should add your new file to the doc index
(http://doc.sagemath.org/html/en/developer/sage_manuals.html#addinganewfile)
Sorry, you have already told me to do so, but since I had to take many things into account, I forgot. Now I have done it!
Also, the header of your new module should describe what it implements
I have added the list of methods implemented, as in generic_graph
. Is this what you meant?
 This will make you notice (Sphinx will scream) that whatever follows a '::'
should be indented.
Done!
 A module's documentation is no place for 'todo'. If you want to work on
something, create a trac ticket.
Ok!
 Why a static 'from_sage_graph' function instead of a constructor accepting a
graph as an input?
Sigh, you are right... Done!
sig_on/sig_off
 these markers are used so that the code can be interruptedwhile some C code is being run. Why would you put them around functions which end instantaneously, like
add_edge/add_vertex
?
Ups, I thought they were used to handle segmentation faults... My mistake!
 The variable 'i' is also an integer, and you should define it as 'cdef int' so
that your Cython code will be better optimised.
I didn't define it as a cdef because that was not the bottleneck of the algorithm, and I chose readability instead of efficiency. However, now I used cdef.
 Why would you define
edge_connectivity_boost_*
functions in
generic_graph_pyx
instead of doing it in thegraph_boost
file? You can import them if needed.
Because, in my opinion, it could be better to keep the Boost implementation separated from the Boost algorithms (since soon there will be many of them), and to put the algorithms in the files where they will be used. Anyway, I moved it back!
 Why did you move the LP out of the current
edge_connectivity
function? Can'tyou leave if where it is, and at the beginning of the file call the boost functions if this is what the user wants?
Because I do not really like functions that are "too long", and I thought that some code used with Boost could be reused by the LP algorithm, simplifying the function. Anyway, I have returned to the old version.
 When you submit a branch, always look at the final 'diff'. You can see it by
clicking on the (green) branch's name on its ticket, and it will tell you what the reviewer sees and how he will have to figure out what your code does. Everything that is not crystal clear for you is not crystal clear for him.
Now the diff should be cleaner. Is there any way to see the diff before pushing my changes (e.g. git diffall or something like that)?
Finally: please report the bug you found to boost. They must fix it.
I will add the bug to the Boost trac as soon as the Boost trac server (svn.boost.org/) starts to work again. If it doesn't, in a few days I will try to find other ways to contact Boost developers.
comment:31 followup: ↓ 34 Changed 6 years ago by
 Status changed from needs_review to needs_work
Helloooooooo Michele,
Thanks for this update, the code looks much better. Further remarks:
 About
clang: c++
: this was useful for you to run `sage cython a boost_graph.pyxsince this command does not read
module_list.py` (in which you specified that it was a c++ file). However, there is no need to add this line for everything to work 'in sage'. That's only useful to create the html file, though you found a better way to do it with thecplus
file.
 About
sig_on/off
: http://www.sagemath.org/documentation/html/en/developer/coding_in_cython.html#interruptandsignalhandling
 About
cdef int i
: well, you are right this is not a bottleneck at all. I just said that as a reflex, sorry ^{};
 About:
Because, in my opinion, it could be better to keep the Boost implementation separated from the Boost algorithms (since soon there will be many of them), and to put the algorithms in the files where they will be used.
Hmmmm... Well, as a result we would end up with datastructure specific code ingeneric_graph_pyx.pyx
, and that is not very pretty either. It is better to have such lowlevel functions close together in a datastructure specific file. If you think that the boost file will grow too large after a while, however, we can split it in two files: one for the data structure, one for the exposed algorithms. Several files from base/ already contain some algorithm.
 About:
Because I do not really like functions that are "too long", and I thought that some code used with Boost could be reused by the LP algorithm, simplifying the function.
First, note that moving code around has a very very bad effect on the final 'diff'. The reviewer sees a big removal and a big addition, and his job is to check that no new bug was introduced, so that's quite some work only to move code around. When you only want to *move* code around, it is much easier for the reviewer if you do it in two steps:
 Move the code *without any other modification* (easy to check with a couple of bash commands)
 Make the small changed needed in the code to 'adapt' it to its new location. This way the review is easier.
 You added 'boost_graph' first in its block in the graph documentation. That list is not sorted alphabetically, for the first item is 'overview of graph classes'. Could you move it below it? It is the most 'general' document.
 Style issue, do whatever you want: instead of
if not (G is None)
followed by a big indented block, you can writeif G is None: return
By the way,
G is not None
does what you want:sage: not False True sage: None is True False sage: None is not False True
 Your functions are "def" functions, i.e. Python functions. If you turn them into 'cpdef' functions they will be both Python+C functions, and C code will call the C function directly
 Could you write somewhere in the doc of your boost classes that edge labels are not supported (in particular when you build a Boost graph from a Sage graph)
 If multiple edges and loops are supported, can you add a doctest that checks it?
m = max(u, v)
: looks useless
self.vertices.at(u)
: this sems to check thatu
is actually within the bounds of the vector. Why notself.vertices[u]
since the function isunsafe
?
def edge_connectivity
: we try to keep the first line of a docstring to be a short sentence describing what the function does. The next paragraph can be as think as you want.
for i from 0 <= i < ec
: that's oldstyle. Nowadays, it is equivalent to `for i in range(ec)` (Cython translates it 'properly').
 Awkward question: I have to admit that I am no big fan of the huge
copy/paste... Since Cython extension classes do not support templating but
functions do, what about not having a
BoostGraph
struct instead of a class? We could have two of these struct and many functions that apply to it, each of which handles fused types. Thus, only one function instead of two, no more copy/paste?... Please tell me how you feel about that.
 About:
Now the diff should be cleaner. Is there any way to see the diff before pushing my changes (e.g. git diffall or something like that)?
You can see all the changes made by your current branch compared with the latest release with 'git diff HEAD ^{develop'. Beware: does not take uncommited changed into account. }
 Another "git cleanness" issue: having commits undo what a previous commit did
is not entirely 'free'. For example, if you run 'git blame' on
generic_graph.py
you will notice that your name on all lines ofedge_connectivity
since you are the last one who touched them (you moved them twice). While this is not a very very very bad problem, we somehow lose information with things like that.
I also believe (and nobody but me is forced to follow those rules) that the git history of a branch should be made to be read rather than be the tryandfail history of attempts to write a code.
A way to solve both problems is to *merge* commits together (a specific case of history rewriting). You can do this with 'git rebase i develop' [1] (look for the 'squash' keyword).
If you do that, you will have to add f (force) to your 'push' command since you will not be adding commit but overwriting existing ones.
from sage.graphs.digraph import DiGraph
: looks useless
if H.is_directed(): val.append(H.strongly_connected_components())
shouldn't you be using abreadth_first_search
instead of aSCC
?
 There is a
delete_edges
function.
Nathann
[1] https://www.atlassian.com/git/tutorials/rewritinghistory/gitrebasei/
comment:32 Changed 6 years ago by
 Commit changed from b513cf9f6a52e1dae5805cb622da9481069defc3 to f9beeee7fb3b29eae5e9388ff7f7d696eb241ea6
comment:33 Changed 6 years ago by
 Commit changed from f9beeee7fb3b29eae5e9388ff7f7d696eb241ea6 to 7481889e9795e22c5d7ea9b42335f14548c9c636
Branch pushed to git repo; I updated commit sha1. New commits:
7481889  Corrected SCC=>breadth_first_search

comment:34 in reply to: ↑ 31 ; followup: ↓ 35 Changed 6 years ago by
Hi (just to change ;) )!
I have applied all your remarks, except the following:
Awkward question: I have to admit that I am no big fan of the huge copy/paste... Since Cython extension classes do not support templating but functions do, what about not having a
BoostGraph
struct instead of a class? We could have two of these struct and many functions that apply to it, each of which handles fused types. Thus, only one function instead of two, no more copy/paste?... Please tell me how you feel about that.
Nice! I will try it and I will let you know the results (it might take a while).
A way to solve both problems is to *merge* commits together (a specific case of history rewriting). You can do this with 'git rebase i develop'
I have two problems with rebase:
 the work can be very long, because I didn't know it, and in several commits I also did small modifications that will be difficult to replicate.
 At the moment, git rebase does not work:
git rebase i develop error: could not apply de3440a... initial implementation of unicode art When you have resolved this problem, run "git rebase continue". If you prefer to skip this patch, run "git rebase skip" instead. To check out the original branch and stop rebasing, run "git rebase abort". Could not apply de3440ab223b0031d10e447d124c5f68cbb388cd... initial implementation of unicode art
`if H.is_directed(): val.append(H.strongly_connected_components())` shouldn't you be using a `breadth_first_search` instead of a `SCC`?
Could you check that now the SCC/BFS code is correct?
Thank you very much!
Michele
comment:35 in reply to: ↑ 34 ; followup: ↓ 37 Changed 6 years ago by
Helloooooo,
I have two problems with rebase:
I will give it a try once I answer this message. I do not understand the error that you meet. Perhaps a notuptodate 'develop'? Either way, if it takes more than 30 seconds it means that something it not right.
Could you check that now the SCC/BFS code is correct?
I do not think that it is. It fails on a very simple graph
sage: digraphs.Circuit(10).edge_connectivity(boost=True,vertices=True) The directed edge connectivity algorithm implemented in the Boost graph library is not reliable. The result could be wrong. [1, [[0, 1]], [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], []]]
You are simply doing things in the wrong direction. You should use the *sources* of the edges as the sources of your BFS.
And you can also simplify your code a lot. Something like this should do:
a = set(H.breadth_first_search([x for x,y in edges]))) b = set(H).difference(a)
I did not try it, but I am optimistic.
Two other things:
 There are many 'trailing whitespaces' [1] in your code. You can add
'(showtrailingwhitespace t)
in your~/.emacs
file (if you use emacs:P
) to see them.  You should add a doctest for the digraph+boost case. You would see that your code breaks.
Set
is a combinat object that you should not use unless you need their usual overabstraction. 'set' is a faster python set object, which does what you need in this situation.
Nathann
[1] http://programmers.stackexchange.com/questions/121555/whyistrailingwhitespaceabigdeal
comment:36 Changed 6 years ago by
A merged commit (all into one) at public/18564_merged
comment:37 in reply to: ↑ 35 ; followup: ↓ 38 Changed 6 years ago by
Hello,
{{{ sage: digraphs.Circuit(10).edge_connectivity(boost=True,vertices=True) The directed edge connectivity algorithm implemented in the Boost graph library is not reliable. The result could be wrong. [1, [[0, 1]], [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], []]] }}}
I think I have problems understanding which is the desired output. It is clear that if we have a directed cycle (0,1,...,9)
, the connectivity is 1 (if we eliminate an edge, the graph will not be strongly connected anymore). However, I have problems understanding which sets of vertices we are looking for. For instance, if we cut (0,1)
, should the two sets be {1},{2,3,...,0}
, or maybe {1,2,3},{4,5,...,9,0}
, or any division of the remaining graph in two components is fine?
comment:38 in reply to: ↑ 37 Changed 6 years ago by
I think I have problems understanding which is the desired output. It is clear that if we have a directed cycle
(0,1,...,9)
, the connectivity is 1 (if we eliminate an edge, the graph will not be strongly connected anymore). However, I have problems understanding which sets of vertices we are looking for. For instance, if we cut(0,1)
, should the two sets be{1},{2,3,...,0}
, or maybe{1,2,3},{4,5,...,9,0}
, or any division of the remaining graph in two components is fine?
Any partition of the vertex set into two disjoint sets A,B
such that there is only one edge going from A to B is fine.
Nathann
comment:39 Changed 6 years ago by
 Commit changed from 7481889e9795e22c5d7ea9b42335f14548c9c636 to 5d15475d4591b889a057ae958b9ad095fa48a8e3
comment:40 Changed 6 years ago by
 Status changed from needs_work to needs_review
Helloooooo!
I have corrected all the issues raised in previous messages. In particular:
 I have sent an email to the Boost mailing list with the edge connectivity bug (at the moment it is waiting for moderation, I will link it asap);
 I have removed trailing whitespaces;
 I have added a doctest for directed Boost connectivity;
 I have used Nathann's code from public/18564_merged.
Only issue that remains open: avoiding the huge copypaste. Unfortunately, also structs and templates do not support fused types ![1]. As far as I understood, if we do not want to write C++ code, the only solution is to write a function for each task and for each graph type, then to create a "generic" function that inputs a fused type containing all Boost graphs (for instance, a function num_vertices(BoostVecDiGraph)
, a function num_vertices(BoostVecGraph)
, and a function num_vertices(BoostFusedGraph)
wrapping the other two). With the number of vertices, this is not a big issue, but with add_edge
it might be, because we also need variable vertices
(which translates Sagemath vertices into Boost vertices), and we cannot use a struct (it is not supported). This might result in a huge mess, which increases even more if we have to use constructs for the specific algorithm we are "stealing" from Boost. Is it really worth it, if in some time Cython might start supporting fused types in templates?
![1] https://groups.google.com/forum/#!topic/cythonusers/qQpMo3hGQqI
comment:41 Changed 6 years ago by
 Commit changed from 5d15475d4591b889a057ae958b9ad095fa48a8e3 to f3a1c07f9d5c4a8b7830d2b624561f8ed5fc27f9
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
f3a1c07  Edge connectivity through Boost

comment:42 Changed 6 years ago by
 Description modified (diff)
 Status changed from needs_review to needs_work
Hellooooooo!
I summarize a discussion we had via email, dealing with the huge copypaste. We found a way to avoid it: we created a C++ class boost_interface that interfaces Cython with Boost, and we used fused types to make generic functions accessing the C++ class.
The drawback is that we cannot access Boost graphs from Python anymore: Boost graphs must be defined and deleted in the same cdef function.
The advantage is that we can fully use the genericity of Boost, by calling any graph implementation we want.
The new code is available in this ticket's branch, while the old code is still available at u/borassi/boost_edge_connectivity_not_generic
.
I changed the status to needs_work, because I would like to test the code and the doctest a bit more before asking for a review (in particular, I would like to check if all previous comments are still applied).
comment:43 Changed 6 years ago by
I like this new implementation. I assume it will be relativity easy to call other boost method the same way.
comment:44 Changed 6 years ago by
This comment needs to be expanded:
class BoostGraph /* * This generic class wraps a Boost graph, in order to make it Cythonfriendly.
From reading the code, I have no idea what is Cythonunfriendly.
comment:45 Changed 6 years ago by
 Commit changed from f3a1c07f9d5c4a8b7830d2b624561f8ed5fc27f9 to 51a183a065af09d203be1f01857b4045a1afee4d
Branch pushed to git repo; I updated commit sha1. New commits:
51a183a  Expanded the Cythonfriendly comment, linked Boost ticket, removed trailing whitespaces

comment:46 Changed 6 years ago by
 Status changed from needs_work to needs_review
Hello!
I have opened a Boost ticket on edge connectivity (https://svn.boost.org/trac/boost/ticket/11406), I have added this link in the documentation of our edge connectivity, I have expanded the comment as asked by jdemeyer, and I have corrected some small issues.
I hope that now the code is correct!
comment:47 Changed 6 years ago by
Hello,
Install OK, tests OK, docbuild OK, page boost_graph.html
looks good (it will certainly be enriched in the future), and it is fast.
sage: G = graphs.Grid2dGraph(20,20) sage: %timeit G.edge_connectivity(boost=0) 1 loops, best of 3: 724 ms per loop sage: %timeit G.edge_connectivity(boost=1) 100 loops, best of 3: 5.86 ms per loop sage: G = graphs.RandomBarabasiAlbert(1000,2) sage: %timeit G.edge_connectivity(boost=0) 1 loops, best of 3: 1.14 s per loop sage: %timeit G.edge_connectivity(boost=1) 10 loops, best of 3: 41.4 ms per loop
For me the patch is good to go. However, it would be better if an expert in C++/boost/etc. could check the patch before setting it to positive review.
Nice work anyway!
David.
comment:48 Changed 6 years ago by
 Reviewers set to Nathann Cohen
Helloooooo Michele,
I made several modifications, available as a new commit at public/18564
. All
of them are, of course, open to discussion.
 You don't need two 'pass' in the .pxd file
 About the paragraphs at the top of
boost_graph.pyx
: I find them a bit unclear. Could you be more specific in those sentences? For example (but not only), I do not get the meaning ofand they cannot be shared by different Python functions, because they cannot be converted to Python objects.
 I opened a ticket at #18753 (you are its author) to indicate that the boost bug had been reported upstream. That's part of a 'procedure' called 'stopgap', precisely meant for those situations (see http://doc.sagemath.org/html/en/developer/trac.html#stopgaps)
I also turned your 'print warning' into a stopgap message.
 I reversed the names of
edge_connectivity
andboost_edge_connectivity
: now theboost_edge_connectivity
function is the one which takes a boost graph as argument.
 The link toward
edge_connectivity
that you had added at the top of the document was broken, foredge_connectivity
was a cdef function in your patch, and you cannot link (in Sphinx) toward cdef functions. When yo uwork on the doc, you can compile it withwarnlinks
to spot broken links.
 I shortened a bit the doc of some boost functions: some of the things you say in the doc would be more appropriate as comments in the code.
 in simple cases, it is easier to use
sig_check
thansig_on/off
. Asig_check
is a way to check at a specific time whether an exception should be raised. See http://www.sagemath.org/documentation/html/en/developer/coding_in_cython.html#interruptandsignalhandling
 I renamed 'boost=True' to 'implementation="boost"', which is a bit more 'standard'. You may not know what 'boost' is, but you understand the meaning of a 'implementation' selector.
Again, don't hesitate to oppose and discuss any of the changes in this commit, as you are the reviewer of them. You are supposed to be as picky as we are when we review your code.
In my mind, the branch with this commit added is good to go, though perhaps the
comments to the top of boost_graph.pyx
could be made clearer. That's up to
you.
Have fun, and thank you very much again for this addition. Now that this is
written, it will be *much* easier to steal more boost code ;)
Nathann
comment:49 Changed 6 years ago by
 Commit changed from 51a183a065af09d203be1f01857b4045a1afee4d to f0709aa5df62b6c266524099b9b6071b7c860bd0
comment:50 followup: ↓ 52 Changed 6 years ago by
Hello! I have checked your code and included it into the current version. I have made a few modifications:
 I have tried to clarify a bit more the comment at the beginning of boost_graph.pyx.
 Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (
Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))
). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable.
" Thank you, Martin!  I have added some spaces at the bottom of boost_graph.pxd, so that different graph implementations are aligned.
 Currently, our Boost code does not support edge labels. With the previous code, if we have used
"implementation='boost',use_edge_labels=True"
the edge labels would simply have been ignored without a warning. Now, the algorithm raises an error.  If
num_edges()==0
andvertices=True
, I have modified the output to return[0,[],[[],[]]]
, that is[edge_connectivity, edges, [first set of vertices, second set of vertices]]
.
Also for me, now the patch is good to go!
comment:51 Changed 6 years ago by
 Commit changed from f0709aa5df62b6c266524099b9b6071b7c860bd0 to fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f
Branch pushed to git repo; I updated commit sha1. New commits:
fe55f76  Few more modifications

comment:52 in reply to: ↑ 50 ; followup: ↓ 53 Changed 6 years ago by
Hello Michele,
 I have tried to clarify a bit more the comment at the beginning of boost_graph.pyx.
Thanks! That's better.
 Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (
Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))
). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable.
" Thank you, Martin!
Oh. Does that mean that there is "no bug" in boost because they never claimed that it worked? If so, several things need to be done: 1) Close the bug report on boost's trac, as it is invalid 2) Change the milestone of our 'stopgap' ticket #18753 to invalid/wontfix (and status to positive review) 3) Remove the stopgap warning from this branch's code 4) Remove any claim that boost's algorithm is unreliable for directed graphs, and claim instead that it is only avilable for undirected graphs 5) Raise an exception when 'boost' is requested to run on a directed graph.
 Currently, our Boost code does not support edge labels. With the previous code, if we have used
"implementation='boost',use_edge_labels=True"
the edge labels would simply have been ignored without a warning. Now, the algorithm raises an error.
Good. I missed that :/
 If
num_edges()==0
andvertices=True
, I have modified the output to return[0,[],[[],[]]]
, that is[edge_connectivity, edges, [first set of vertices, second set of vertices]]
.
OKayyyyyyyyyyy.
Nathann
comment:53 in reply to: ↑ 52 ; followup: ↓ 54 Changed 6 years ago by
 Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (
Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))
). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable.
" Thank you, Martin!Oh. Does that mean that there is "no bug" in boost because they never claimed that it worked? If so, several things need to be done: 1) Close the bug report on boost's trac, as it is invalid 2) Change the milestone of our 'stopgap' ticket #18753 to invalid/wontfix (and status to positive review) 3) Remove the stopgap warning from this branch's code 4) Remove any claim that boost's algorithm is unreliable for directed graphs, and claim instead that it is only avilable for undirected graphs 5) Raise an exception when 'boost' is requested to run on a directed graph.
Well, not so fast! Martin told me that the edge connectivity is a work in progress, and that they might make it work on directed graph soon. Furthermore, I still think that the Boost code *has* a bug: the comment I mentioned is a line comment in the middle of the code, and a user does not necessarily read the whole code before using it! Furthermore, the comment does not say explicitly that they do not claim it works. For this reason, I have added a comment saying this in the Boost ticket, but I think we should leave the patch as it is, waiting for the final Boost edge connectivity!
What do you think?
Thank you very much,
Michele
comment:54 in reply to: ↑ 53 Changed 6 years ago by
 Status changed from needs_review to positive_review
Well, not so fast! Martin told me that the edge connectivity is a work in progress, and that they might make it work on directed graph soon. Furthermore, I still think that the Boost code *has* a bug: the comment I mentioned is a line comment in the middle of the code, and a user does not necessarily read the whole code before using it! Furthermore, the comment does not say explicitly that they do not claim it works. For this reason, I have added a comment saying this in the Boost ticket, but I think we should leave the patch as it is, waiting for the final Boost edge connectivity!
What do you think?
That I trust your advice on this matter.
Good to go! (and thank you very much for this first 'boost' addition)
Nathann
comment:55 Changed 6 years ago by
 Branch changed from u/borassi/boost_edge_connectivity to fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f
 Resolution set to fixed
 Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. New commits:
Edge connectivity working