Opened 6 years ago

Closed 6 years ago

#18564 closed enhancement (fixed)

Boost Edge Connectivity

Reported by: borassi Owned by: borassi
Priority: major Milestone: sage-6.8
Component: graph theory Keywords: Boost, connectivity
Cc: ncohen, dcoudert, jdemeyer, jpflori, slelievre, Snark, vbraun, wstein Merged in:
Authors: Michele Borassi Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: fe55f76 (Commits, GitHub, GitLab) Commit: fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f
Dependencies: Stopgaps:

Status badges

Description (last modified by borassi)

This is a first step in including Boost Graph Library (http://trac.sagemath.org/ticket/17966). I would like to implement the edge connectivity algorithm by converting a given Sage graph into a Boost graph, apply the Boost edge connectivity algorithm, and convert the result.

Change History (55)

comment:1 Changed 6 years ago by borassi

  • Authors set to Michele Borassi
  • Cc ncohen dcoudert jdemeyer jpflori slelievre Snark vbraun wstein added
  • Component changed from PLEASE CHANGE to graph theory
  • Description modified (diff)
  • Keywords Boost connectivity added
  • Owner changed from (none) to borassi
  • Type changed from PLEASE CHANGE to enhancement

comment:2 Changed 6 years ago by borassi

  • Branch set to u/borassi/boost_edge_connectivity

comment:3 Changed 6 years ago by git

  • Commit set to fe40c540c67848ae4cb694921501185b0cd4f045

Branch pushed to git repo; I updated commit sha1. New commits:

fe40c54Edge connectivity working

comment:4 Changed 6 years ago by borassi

This version lets us use Boost graph library to compute edge connectivity. Probably it is not the cleanest way to do it, but at least it works. I am looking for suggestions on how to improve the code, in order to make it more readable:

  • is it better to use only Boost in C++ and Cython for the interface, or write the code in C++ and use Cython to convert the results? In the first case, we might have problems with generic types.
  • should we only keep translation capabilities, or should we also add a new backend? In other words, do we need Boost for linear-time or O(n log n) time algorithms?

Some benchmark:

sage: G = graphs.RandomGNM(100,1000)
sage: %timeit G.edge_connectivity()
1 loops, best of 3: 10.7 s per loop
sage: %timeit G.to_boost_graph().edge_connectivity()
100 loops, best of 3: 2.04 ms per loop

Thank you very much!

comment:5 Changed 6 years ago by git

  • Commit changed from fe40c540c67848ae4cb694921501185b0cd4f045 to 9a0e86828eadda31e99ee203a693bb65afec4762

Branch pushed to git repo; I updated commit sha1. New commits:

9a0e868Improved code for conversion

comment:6 follow-up: Changed 6 years ago by dcoudert

Something is missing in BoostGraph

sage: H = G.to_boost_graph()
sage: H
<sage.graphs.base.boost_graph.BoostGraph object at 0x10f1a4ef0>
sage: H.
H.add_edge      H.add_vertex    H.num_edges     H.num_vertices  
sage: H.num_edges
<built-in method num_edges of sage.graphs.base.boost_graph.BoostGraph object at 0x10f1a4ef0>
sage: H.num_edges()
1000
sage: %timeit G.to_boost_graph().edge_connectivity()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-86519c65f6dd> in <module>()
----> 1 get_ipython().magic(u'timeit G.to_boost_graph().edge_connectivity()')

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in magic(self, arg_s)
   2305         magic_name, _, magic_arg_s = arg_s.partition(' ')
   2306         magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2307         return self.run_line_magic(magic_name, magic_arg_s)
   2308 
   2309     #-------------------------------------------------------------------------

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line)
   2226                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2227             with self.builtin_trap:
-> 2228                 result = fn(*args,**kwargs)
   2229             return result
   2230 

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell)

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, line, cell)
   1034             number = 1
   1035             for _ in range(1, 10):
-> 1036                 time_number = timer.timeit(number)
   1037                 worst_tuning = max(worst_tuning, time_number / number)
   1038                 if time_number >= 0.2:

/Users/dcoudert/sage/local/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in timeit(self, number)
    130         gc.disable()
    131         try:
--> 132             timing = self.inner(it, self.timer)
    133         finally:
    134             if gcold:

<magic-timeit> in inner(_it, _timer)

AttributeError: 'sage.graphs.base.boost_graph.BoostGraph' object has no attribute 'edge_connectivity'
sage: 

comment:7 Changed 6 years ago by git

  • Commit changed from 9a0e86828eadda31e99ee203a693bb65afec4762 to b0ca92c0674f7fe4df454566208e3f69d5faf982

Branch pushed to git repo; I updated commit sha1. New commits:

b0ca92cAdded sig_on, sig_off, and corrected edge_connectivity

comment:8 in reply to: ↑ 6 Changed 6 years ago by borassi

Now it should work!

comment:9 Changed 6 years ago by ncohen

Hellooooo Michele,

Great work, thank you very much! If we can speedup several Sage functions the way you did for edge connectivity, the future looks bright :-P

I have been trying to avoid the creation of these two .cpp/.hpp files, but no good new for the moment. I sent a question on cython-users [1], let's pray for their wisdom. If there is no better way to do it than write cpp code, then perhaps it will be better to 'patch' boost when it is installed, and only add the couple of lines we need? We'll see this later, after they have had the time to answer.

About your code:

  • You add 'copyright' headers with a name which isn't yours: that's neither fair to Robert Miller nor to yourself! If you want your files to have a copyright header, please make it contain your name. Oh, and please steal a more recent header, for this one is a bit old. I think this one is the most recent
    #*****************************************************************************
    #       Copyright (C) 2015 Michele Borassi <email_adress>
    #
    #  Distributed under the terms of the GNU General Public License (GPL)
    #  as published by the Free Software Foundation; either version 2 of
    #  the License, or (at your option) any later version.
    #                  http://www.gnu.org/licenses/
    #*****************************************************************************
    
  • I do not think that having a Graph.to_boost_graph makes much sense: at user-level, having a 'boost' graph does not have much interest as it only has very few commands to add edges/vertices, and your edge_connectivity function. It would make more sense to keep this function inside of boost_graph.pyx, for internal use.
  • You should modify GenericGraph.edge_connectivity to make it able to call your function. Beware: this function applies to both directed and undirected graphs, while apparently your function only handles undirected graphs. The ideal, of course, would be to be able to call boost in both situations.

Many graph function make several implementations available, and for instance GenericGraph.automorphism_group.

  • Every function that you add must be documented. Besides, it would be cool to add your new file to the doc index [2].
  • Every function must contain doctests.
    (tmp|✚3)~/sage/graphs/base$ sage -coverage boost_graph.pyx
    ------------------------------------------------------------------------
    SCORE boost_graph.pyx: 0.0% (0 of 6)
    
    Missing documentation:
        * line 57: def __cinit__(self)
        * line 62: def add_vertex(self)
        * line 65: def add_edge(self, u, v)
        * line 68: def num_vertices(self)
        * line 71: def num_edges(self)
        * line 74: def edge_connectivity(self)
    ------------------------------------------------------------------------
    

Thank you very much for this improvement. That's no small speedup, and an important function.

Nathann

PS: Set the ticket's status to 'needs_review' when you want somebody to look at it.

[1] https://groups.google.com/d/topic/cython-users/ICRgCWVZ6RQ/discussion [2] http://doc.sagemath.org/html/en/developer/sage_manuals.html#adding-a-new-file

comment:10 Changed 6 years ago by ncohen

Hello Michele,

You will find at public/18564 a commit that avoids the creation of .hpp and .cpp files.

Nathann

comment:11 Changed 6 years ago by dcoudert

Hello Nathann,

your branch is working well and this is easier than the first solution proposed by Michele.

sage: G = graphs.RandomBarabasiAlbert(1000,2)
sage: %timeit G.edge_connectivity()
1 loops, best of 3: 1.02 s per loop
sage: %timeit G.to_boost_graph().edge_connectivity()
10 loops, best of 3: 46 ms per loop

I’m really impressed by what you did (both of you).

comment:12 Changed 6 years ago by git

  • Commit changed from b0ca92c0674f7fe4df454566208e3f69d5faf982 to 7d272ab6a82837daa1f296206c0ac63bb6203108

Branch pushed to git repo; I updated commit sha1. New commits:

2b85ec1trac #18564: Merged with 6.8.beta3
7d272abAvoid .hpp/.cpp files

comment:13 Changed 6 years ago by borassi

I have merged Nathann's work. I will continue from here!

comment:14 Changed 6 years ago by git

  • Commit changed from 7d272ab6a82837daa1f296206c0ac63bb6203108 to 911c3eafa2f714d32a8d4b2fa6a02793e94fe7ad

Branch pushed to git repo; I updated commit sha1. New commits:

9fc1be6Moved conversion to file boost_graph
911c3eaMultiple vertex/edge implementations working

comment:15 follow-up: Changed 6 years ago by ncohen

Just a tip: if you run 'sage -cython -a your_file.pyx', you will get a .html file which indicates how much C code it takes to repleace each of your python lines (click on the yellow lines you see).

With this, you can easily notice loops which involve Python though they should not. It is often because of variables of unspecified type.

If you do the following replacement

- N = G_sage.num_verts()
+ cdef int N = G_sage.num_verts()

you should see something happen in the loops involving N.

Nathann

comment:16 in reply to: ↑ 15 Changed 6 years ago by borassi

Error compiling Cython file:
------------------------------------------------------------
...
                
        return G
    
    def __cinit__(self):
        sig_on()
        self.graph = new UndVecGraph()
                        ^
------------------------------------------------------------

src/sage/graphs/base/boost_graph.pyx:80:25: Operation only allowed in c++

How can I tell it that the language is C++?

comment:17 Changed 6 years ago by ncohen

On my machine your code compiles.

comment:18 follow-up: Changed 6 years ago by dcoudert

The code also compiles on my computer

comment:19 in reply to: ↑ 18 ; follow-up: Changed 6 years ago by borassi

Sorry I forgot to say that this was the output to the command

./sage -cython -a src/sage/graphs/base/boost_graph.pyx

suggested by Nathann in comment 15

Replying to dcoudert:

The code also compiles on my computer

comment:20 in reply to: ↑ 19 ; follow-up: Changed 6 years ago by ncohen

I forgot to say that this was the output to the command

./sage -cython -a src/sage/graphs/base/boost_graph.pyx

Oh, right! Can you try again after adding '#clang C++' in the first line of the file?

Nathann

comment:21 in reply to: ↑ 20 ; follow-up: Changed 6 years ago by borassi

No way... I have tried the following:

'#clang C++'
#clang C++
'#distutils: language = c++'
#distutils: language = c++

The command

./sage -cython --cplus -a src/sage/graphs/base/boost_graph.pyx

seems to work, but I cannot find the html :(

comment:22 in reply to: ↑ 21 Changed 6 years ago by ncohen

seems to work, but I cannot find the html :(

Isn't it right next to boost_graph.pyx? Type "git status" in Sage's folder, it should tell you something.

Nathann

comment:23 Changed 6 years ago by git

  • Commit changed from 911c3eafa2f714d32a8d4b2fa6a02793e94fe7ad to 75cdbb6786b2cdce82715a56bcbb6924de317cf4

Branch pushed to git repo; I updated commit sha1. New commits:

a756070Boost graph/digraph working. Need to move edge connectivity algorithm
75cdbb6Edge connectivity: working and clean.

comment:24 Changed 6 years ago by borassi

  • Status changed from new to needs_review

This code should be a first draft of the final code. I have cleaned the Boost part and I have inserted the Boost edge connectivity. I have also cleaned the old edge connectivity, in order to integrate the new part. The following tests show that the cleaning did not affect speed, and that the new algorithm is much faster. Still, we might improve two points:

  • implement weighted graphs;
  • there is a (small) bottleneck when vertices = True and Boost is used.

TESTS:

sage: G = graphs.CompleteGraph(40)

sage: %timeit G.edge_connectivity(boost = False)
1 loops, best of 3: 8.84 s per loop
(OLD: 1 loops, best of 3: 8.78 s per loop)

sage: %timeit G.edge_connectivity(boost = False, vertices = True)
1 loops, best of 3: 8.82 s per loop
(OLD: 1 loops, best of 3: 8.8 s per loop)

sage: %timeit G.edge_connectivity()
1000 loops, best of 3: 445 µs per loop

sage: %timeit G.edge_connectivity(vertices = True)
100 loops, best of 3: 1.57 ms per loop

sage: G = graphs.PathGraph(100)

sage: %timeit G.edge_connectivity(boost = False)
100 loops, best of 3: 2.2 ms per loop
(OLD: 100 loops, best of 3: 2.26 ms per loop)

sage: %timeit G.edge_connectivity(boost = False, value_only = False)
10 loops, best of 3: 59.2 ms per loop
(OLD: 10 loops, best of 3: 58.9 ms per loop)

sage: %timeit G.edge_connectivity(boost = False, vertices = True)
10 loops, best of 3: 59.8 ms per loop
(OLD: 10 loops, best of 3: 58.2 ms per loop)

sage: %timeit G.edge_connectivity()
1000 loops, best of 3: 226 µs per loop

sage: %timeit G.edge_connectivity(vertices = True)
1000 loops, best of 3: 524 µs per loop

PROBLEMS

Unfortunately, the Boost edge connectivity has a bug when dealing with directed graphs, evidenced by the following C++ code. What can we do? Should we open a ticket in Boost graph library?

#include <iostream>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/edge_connectivity.hpp>


using namespace boost;
typedef adjacency_list<vecS,vecS,bidirectionalS> MyGraph;

int main()
{
    typedef graph_traits < MyGraph >::vertex_descriptor vertex_descriptor;
    typedef graph_traits < MyGraph >::edge_descriptor edge_descriptor;

    MyGraph g;

    vertex_descriptor v1 = add_vertex(g);
    vertex_descriptor v2 = add_vertex(g);
    vertex_descriptor v3 = add_vertex(g);

    add_edge(v1, v2, g);
    add_edge(v2, v3, g);

    std::cout << "Number of vertices: " << num_vertices(g) << ".\nNumber of edges: " << num_edges(g) << ".\n";
    std::vector < edge_descriptor > disconnecting_set;

    std::cout << "The edge connectivity is "
              << edge_connectivity(g, std::back_inserter(disconnecting_set))
              << "." << std::endl;

    add_edge(v2, v1, g);
    std::cout << "Number of vertices: " << num_vertices(g) << ".\nNumber of edges: " << num_edges(g) << ".\n";

    std::cout << "Now, the edge connectivity is "
              << edge_connectivity(g, std::back_inserter(disconnecting_set))
              << "." << std::endl;
    return EXIT_SUCCESS;
}

Output:

Number of vertices: 3.
Number of edges: 2.
The edge connectivity is 1.
Number of vertices: 3.
Number of edges: 3.
Now, the edge connectivity is 0.

comment:25 follow-up: Changed 6 years ago by ncohen

  • Status changed from needs_review to needs_work

Hello Michele,

Your patch is rather big, and I think that it can be simplified:

  • There is a weird 'pass' in boost_graph.pxd
  • #clang c++ is not useful in Sage's source

Also, the header of your new module should describe what it implements

  • This will make you notice (Sphinx will scream) that whatever follows a '::' should be indented.
  • A module's documentation is no place for 'todo'. If you want to work on something, create a trac ticket.
  • Why a static 'from_sage_graph' function instead of a constructor accepting a graph as an input?
  • sig_on/sig_off -- these markers are used so that the code can be interrupted while some C code is being run. Why would you put them around functions which end instantaneously, like add_edge/add_vertex?
  • The variable 'i' is also an integer, and you should define it as 'cdef int' so that your Cython code will be better optimised.
  • Why would you define edge_connectivity_boost_* functions in generic_graph_pyx instead of doing it in the graph_boost file? You can import them if needed.
  • Why did you move the LP out of the current edge_connectivity function? Can't you leave if where it is, and at the beginning of the file call the boost functions if this is what the user wants?
  • When you submit a branch, always look at the final 'diff'. You can see it by clicking on the (green) branch's name on its ticket, and it will tell you what the reviewer sees and how he will have to figure out what your code does. Everything that is not crystal clear for you is not crystal clear for him.

Good evening,

Nathann

comment:26 Changed 6 years ago by ncohen

Finally: please report the bug you found to boost. They must fix it.

comment:27 Changed 6 years ago by git

  • Commit changed from 75cdbb6786b2cdce82715a56bcbb6924de317cf4 to e0d81e4b8f85b0bf06e9af7c081a994971ea1a25

Branch pushed to git repo; I updated commit sha1. New commits:

e0d81e4Cleaned the code according to Nathann's suggestions (except last point)

comment:28 Changed 6 years ago by git

  • Commit changed from e0d81e4b8f85b0bf06e9af7c081a994971ea1a25 to 7b62350f7dd976a7e635a455378bfa4189083cf4

Branch pushed to git repo; I updated commit sha1. New commits:

7b62350Corrected all problems oulined by Nathann.

comment:29 Changed 6 years ago by git

  • Commit changed from 7b62350f7dd976a7e635a455378bfa4189083cf4 to b513cf9f6a52e1dae5805cb622da9481069defc3

Branch pushed to git repo; I updated commit sha1. New commits:

b513cf9Removed unused import in generic_graph_pyx

comment:30 in reply to: ↑ 25 Changed 6 years ago by borassi

  • Status changed from needs_work to needs_review

Hello,

I have tried to correct your problems. More details below.

Best,

Michele

  • There is a weird 'pass' in boost_graph.pxd

Removed!

  • #clang c++ is not useful in Sage's source

Then I am afraid I have not understood your comment 20... Where should I have put it? In any case, I removed it.

  • You should add your new file to the doc index

(http://doc.sagemath.org/html/en/developer/sage_manuals.html#adding-a-new-file)

Sorry, you have already told me to do so, but since I had to take many things into account, I forgot. Now I have done it!

Also, the header of your new module should describe what it implements

I have added the list of methods implemented, as in generic_graph. Is this what you meant?

  • This will make you notice (Sphinx will scream) that whatever follows a '::'

should be indented.

Done!

  • A module's documentation is no place for 'todo'. If you want to work on

something, create a trac ticket.

Ok!

  • Why a static 'from_sage_graph' function instead of a constructor accepting a

graph as an input?

Sigh, you are right... Done!

  • sig_on/sig_off -- these markers are used so that the code can be interrupted

while some C code is being run. Why would you put them around functions which end instantaneously, like add_edge/add_vertex?

Ups, I thought they were used to handle segmentation faults... My mistake!

  • The variable 'i' is also an integer, and you should define it as 'cdef int' so

that your Cython code will be better optimised.

I didn't define it as a cdef because that was not the bottleneck of the algorithm, and I chose readability instead of efficiency. However, now I used cdef.

  • Why would you define edge_connectivity_boost_* functions in

generic_graph_pyx instead of doing it in the graph_boost file? You can import them if needed.

Because, in my opinion, it could be better to keep the Boost implementation separated from the Boost algorithms (since soon there will be many of them), and to put the algorithms in the files where they will be used. Anyway, I moved it back!

  • Why did you move the LP out of the current edge_connectivity function? Can't

you leave if where it is, and at the beginning of the file call the boost functions if this is what the user wants?

Because I do not really like functions that are "too long", and I thought that some code used with Boost could be re-used by the LP algorithm, simplifying the function. Anyway, I have returned to the old version.

  • When you submit a branch, always look at the final 'diff'. You can see it by

clicking on the (green) branch's name on its ticket, and it will tell you what the reviewer sees and how he will have to figure out what your code does. Everything that is not crystal clear for you is not crystal clear for him.

Now the diff should be cleaner. Is there any way to see the diff before pushing my changes (e.g. git diffall or something like that)?

Finally: please report the bug you found to boost. They must fix it.

I will add the bug to the Boost trac as soon as the Boost trac server (svn.boost.org/) starts to work again. If it doesn't, in a few days I will try to find other ways to contact Boost developers.

comment:31 follow-up: Changed 6 years ago by ncohen

  • Status changed from needs_review to needs_work

Helloooooooo Michele,

Thanks for this update, the code looks much better. Further remarks:

  • About clang: c++: this was useful for you to run `sage -cython -a boost_graph.pyx since this command does not read module_list.py` (in which you specified that it was a c++ file). However, there is no need to add this line for everything to work 'in sage'. That's only useful to create the html file, though you found a better way to do it with the --cplus file.
  • About cdef int i: well, you are right this is not a bottleneck at all. I just said that as a reflex, sorry ;
  • About:
    Because, in my opinion, it could be better to keep the Boost implementation
    separated from the Boost algorithms (since soon there will be many of them),
    and to put the algorithms in the files where they will be used.
    
    Hmmmm... Well, as a result we would end up with data-structure specific code in generic_graph_pyx.pyx, and that is not very pretty either. It is better to have such low-level functions close together in a data-structure specific file. If you think that the boost file will grow too large after a while, however, we can split it in two files: one for the data structure, one for the exposed algorithms. Several files from base/ already contain some algorithm.
  • About:
    Because I do not really like functions that are "too long", and I thought
    that some code used with Boost could be re-used by the LP algorithm,
    simplifying the function.
    

First, note that moving code around has a very very bad effect on the final 'diff'. The reviewer sees a big removal and a big addition, and his job is to check that no new bug was introduced, so that's quite some work only to move code around. When you only want to *move* code around, it is much easier for the reviewer if you do it in two steps:

  • Move the code *without any other modification* (easy to check with a couple of bash commands)
  • Make the small changed needed in the code to 'adapt' it to its new location. This way the review is easier.
  • You added 'boost_graph' first in its block in the graph documentation. That list is not sorted alphabetically, for the first item is 'overview of graph classes'. Could you move it below it? It is the most 'general' document.
  • Style issue, do whatever you want: instead of if not (G is None) followed by a big indented block, you can write if G is None: return

By the way, G is not None does what you want:

sage: not False
True
sage: None is True
False
sage: None is not False
True
  • Your functions are "def" functions, i.e. Python functions. If you turn them into 'cpdef' functions they will be both Python+C functions, and C code will call the C function directly
  • Could you write somewhere in the doc of your boost classes that edge labels are not supported (in particular when you build a Boost graph from a Sage graph)
  • If multiple edges and loops are supported, can you add a doctest that checks it?
  • m = max(u, v): looks useless
  • self.vertices.at(u): this sems to check that u is actually within the bounds of the vector. Why not self.vertices[u] since the function is unsafe?
  • def edge_connectivity: we try to keep the first line of a docstring to be a short sentence describing what the function does. The next paragraph can be as think as you want.
  • for i from 0 <= i < ec: that's old-style. Nowadays, it is equivalent to `for i in range(ec)` (Cython translates it 'properly').
  • Awkward question: I have to admit that I am no big fan of the huge copy/paste... Since Cython extension classes do not support templating but functions do, what about not having a BoostGraph struct instead of a class? We could have two of these struct and many functions that apply to it, each of which handles fused types. Thus, only one function instead of two, no more copy/paste?... Please tell me how you feel about that.
  • About:
    Now the diff should be cleaner. Is there any way to see the diff before
    pushing my changes (e.g. git diffall or something like that)?
    
    You can see all the changes made by your current branch compared with the latest release with 'git diff HEAD develop'. Beware: does not take uncommited changed into account.
  • Another "git cleanness" issue: having commits undo what a previous commit did is not entirely 'free'. For example, if you run 'git blame' on generic_graph.py you will notice that your name on all lines of edge_connectivity since you are the last one who touched them (you moved them twice). While this is not a very very very bad problem, we somehow lose information with things like that.

I also believe (and nobody but me is forced to follow those rules) that the git history of a branch should be made to be read rather than be the try-and-fail history of attempts to write a code.

A way to solve both problems is to *merge* commits together (a specific case of history rewriting). You can do this with 'git rebase -i develop' [1] (look for the 'squash' keyword).

If you do that, you will have to add -f (force) to your 'push' command since you will not be adding commit but overwriting existing ones.

  • from sage.graphs.digraph import DiGraph: looks useless
  • if H.is_directed(): val.append(H.strongly_connected_components()) shouldn't you be using a breadth_first_search instead of a SCC?
  • There is a delete_edges function.

Nathann

[1] https://www.atlassian.com/git/tutorials/rewriting-history/git-rebase-i/

comment:32 Changed 6 years ago by git

  • Commit changed from b513cf9f6a52e1dae5805cb622da9481069defc3 to f9beeee7fb3b29eae5e9388ff7f7d696eb241ea6

Branch pushed to git repo; I updated commit sha1. New commits:

6e9738dAfter second round of corrections (missing struct, SCC).
f9beeeeCorrected a wrong parenthesis in the doc.

comment:33 Changed 6 years ago by git

  • Commit changed from f9beeee7fb3b29eae5e9388ff7f7d696eb241ea6 to 7481889e9795e22c5d7ea9b42335f14548c9c636

Branch pushed to git repo; I updated commit sha1. New commits:

7481889Corrected SCC=>breadth_first_search

comment:34 in reply to: ↑ 31 ; follow-up: Changed 6 years ago by borassi

Hi (just to change ;-) )!

I have applied all your remarks, except the following:

Awkward question: I have to admit that I am no big fan of the huge copy/paste... Since Cython extension classes do not support templating but functions do, what about not having a BoostGraph struct instead of a class? We could have two of these struct and many functions that apply to it, each of which handles fused types. Thus, only one function instead of two, no more copy/paste?... Please tell me how you feel about that.

Nice! I will try it and I will let you know the results (it might take a while).

A way to solve both problems is to *merge* commits together (a specific case of history rewriting). You can do this with 'git rebase -i develop'

I have two problems with rebase:

  • the work can be very long, because I didn't know it, and in several commits I also did small modifications that will be difficult to replicate.
  • At the moment, git rebase does not work:
git rebase -i develop 
error: could not apply de3440a... initial implementation of unicode art 

When you have resolved this problem, run "git rebase --continue". 
If you prefer to skip this patch, run "git rebase --skip" instead. 
To check out the original branch and stop rebasing, run "git rebase --abort". 
Could not apply de3440ab223b0031d10e447d124c5f68cbb388cd... initial implementation of unicode art

`if H.is_directed(): val.append(H.strongly_connected_components())` shouldn't you be using a `breadth_first_search` instead of a `SCC`?

Could you check that now the SCC/BFS code is correct?

Thank you very much!

Michele

comment:35 in reply to: ↑ 34 ; follow-up: Changed 6 years ago by ncohen

Helloooooo,

I have two problems with rebase:

I will give it a try once I answer this message. I do not understand the error that you meet. Perhaps a not-up-to-date 'develop'? Either way, if it takes more than 30 seconds it means that something it not right.

Could you check that now the SCC/BFS code is correct?

I do not think that it is. It fails on a very simple graph

sage: digraphs.Circuit(10).edge_connectivity(boost=True,vertices=True)
The directed edge connectivity algorithm implemented in the Boost graph library is not reliable. The result could be wrong.
[1, [[0, 1]], [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], []]]

You are simply doing things in the wrong direction. You should use the *sources* of the edges as the sources of your BFS.

And you can also simplify your code a lot. Something like this should do:

a = set(H.breadth_first_search([x for x,y in edges])))
b = set(H).difference(a)

I did not try it, but I am optimistic.

Two other things:

  • There are many 'trailing whitespaces' [1] in your code. You can add '(show-trailing-whitespace t) in your ~/.emacs file (if you use emacs :-P) to see them.
  • You should add a doctest for the digraph+boost case. You would see that your code breaks.
  • Set is a combinat object that you should not use unless you need their usual over-abstraction. 'set' is a faster python set object, which does what you need in this situation.

Nathann

[1] http://programmers.stackexchange.com/questions/121555/why-is-trailing-whitespace-a-big-deal

comment:36 Changed 6 years ago by ncohen

A merged commit (all into one) at public/18564_merged

comment:37 in reply to: ↑ 35 ; follow-up: Changed 6 years ago by borassi

Hello,

{{{

sage: digraphs.Circuit(10).edge_connectivity(boost=True,vertices=True)

The directed edge connectivity algorithm implemented in the Boost graph library is not reliable. The result could be wrong.

[1, [[0, 1]], [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], []]]

}}}

I think I have problems understanding which is the desired output. It is clear that if we have a directed cycle (0,1,...,9), the connectivity is 1 (if we eliminate an edge, the graph will not be strongly connected anymore). However, I have problems understanding which sets of vertices we are looking for. For instance, if we cut (0,1), should the two sets be {1},{2,3,...,0}, or maybe {1,2,3},{4,5,...,9,0}, or any division of the remaining graph in two components is fine?

comment:38 in reply to: ↑ 37 Changed 6 years ago by ncohen

I think I have problems understanding which is the desired output. It is clear that if we have a directed cycle (0,1,...,9), the connectivity is 1 (if we eliminate an edge, the graph will not be strongly connected anymore). However, I have problems understanding which sets of vertices we are looking for. For instance, if we cut (0,1), should the two sets be {1},{2,3,...,0}, or maybe {1,2,3},{4,5,...,9,0}, or any division of the remaining graph in two components is fine?

Any partition of the vertex set into two disjoint sets A,B such that there is only one edge going from A to B is fine.

Nathann

comment:39 Changed 6 years ago by git

  • Commit changed from 7481889e9795e22c5d7ea9b42335f14548c9c636 to 5d15475d4591b889a057ae958b9ad095fa48a8e3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

178fa09Edge connectivity through boost
5d15475More corrections of Boost edge connectivity.

comment:40 Changed 6 years ago by borassi

  • Status changed from needs_work to needs_review

Helloooooo!

I have corrected all the issues raised in previous messages. In particular:

  • I have sent an email to the Boost mailing list with the edge connectivity bug (at the moment it is waiting for moderation, I will link it asap);
  • I have removed trailing whitespaces;
  • I have added a doctest for directed Boost connectivity;
  • I have used Nathann's code from public/18564_merged.

Only issue that remains open: avoiding the huge copy-paste. Unfortunately, also structs and templates do not support fused types ![1]. As far as I understood, if we do not want to write C++ code, the only solution is to write a function for each task and for each graph type, then to create a "generic" function that inputs a fused type containing all Boost graphs (for instance, a function num_vertices(BoostVecDiGraph), a function num_vertices(BoostVecGraph), and a function num_vertices(BoostFusedGraph) wrapping the other two). With the number of vertices, this is not a big issue, but with add_edge it might be, because we also need variable vertices (which translates Sagemath vertices into Boost vertices), and we cannot use a struct (it is not supported). This might result in a huge mess, which increases even more if we have to use constructs for the specific algorithm we are "stealing" from Boost. Is it really worth it, if in some time Cython might start supporting fused types in templates?

![1] https://groups.google.com/forum/#!topic/cython-users/qQpMo3hGQqI

comment:41 Changed 6 years ago by git

  • Commit changed from 5d15475d4591b889a057ae958b9ad095fa48a8e3 to f3a1c07f9d5c4a8b7830d2b624561f8ed5fc27f9

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

f3a1c07Edge connectivity through Boost

comment:42 Changed 6 years ago by borassi

  • Description modified (diff)
  • Status changed from needs_review to needs_work

Hellooooooo!

I summarize a discussion we had via email, dealing with the huge copy-paste. We found a way to avoid it: we created a C++ class boost_interface that interfaces Cython with Boost, and we used fused types to make generic functions accessing the C++ class.

The drawback is that we cannot access Boost graphs from Python anymore: Boost graphs must be defined and deleted in the same cdef function.

The advantage is that we can fully use the genericity of Boost, by calling any graph implementation we want.

The new code is available in this ticket's branch, while the old code is still available at u/borassi/boost_edge_connectivity_not_generic.

I changed the status to needs_work, because I would like to test the code and the doctest a bit more before asking for a review (in particular, I would like to check if all previous comments are still applied).

comment:43 Changed 6 years ago by dcoudert

I like this new implementation. I assume it will be relativity easy to call other boost method the same way.

comment:44 Changed 6 years ago by jdemeyer

This comment needs to be expanded:

class BoostGraph
/*
 * This generic class wraps a Boost graph, in order to make it Cython-friendly.

From reading the code, I have no idea what is Cython-unfriendly.

comment:45 Changed 6 years ago by git

  • Commit changed from f3a1c07f9d5c4a8b7830d2b624561f8ed5fc27f9 to 51a183a065af09d203be1f01857b4045a1afee4d

Branch pushed to git repo; I updated commit sha1. New commits:

51a183aExpanded the Cython-friendly comment, linked Boost ticket, removed trailing whitespaces

comment:46 Changed 6 years ago by borassi

  • Status changed from needs_work to needs_review

Hello!

I have opened a Boost ticket on edge connectivity (https://svn.boost.org/trac/boost/ticket/11406), I have added this link in the documentation of our edge connectivity, I have expanded the comment as asked by jdemeyer, and I have corrected some small issues.

I hope that now the code is correct!

comment:47 Changed 6 years ago by dcoudert

Hello,

Install OK, tests OK, docbuild OK, page boost_graph.html looks good (it will certainly be enriched in the future), and it is fast.

sage: G = graphs.Grid2dGraph(20,20)
sage: %timeit G.edge_connectivity(boost=0)
1 loops, best of 3: 724 ms per loop
sage: %timeit G.edge_connectivity(boost=1)
100 loops, best of 3: 5.86 ms per loop

sage: G = graphs.RandomBarabasiAlbert(1000,2)
sage: %timeit G.edge_connectivity(boost=0)
1 loops, best of 3: 1.14 s per loop
sage: %timeit G.edge_connectivity(boost=1)
10 loops, best of 3: 41.4 ms per loop

For me the patch is good to go. However, it would be better if an expert in C++/boost/etc. could check the patch before setting it to positive review.

Nice work anyway!

David.

comment:48 Changed 6 years ago by ncohen

  • Reviewers set to Nathann Cohen

Helloooooo Michele,

I made several modifications, available as a new commit at public/18564. All of them are, of course, open to discussion.

  • You don't need two 'pass' in the .pxd file
  • About the paragraphs at the top of boost_graph.pyx: I find them a bit unclear. Could you be more specific in those sentences? For example (but not only), I do not get the meaning of
    and they cannot be shared by different Python functions, because they cannot
    be converted to Python objects.
    

I also turned your 'print warning' into a stopgap message.

  • I reversed the names of edge_connectivity and boost_edge_connectivity: now the boost_edge_connectivity function is the one which takes a boost graph as argument.
  • The link toward edge_connectivity that you had added at the top of the document was broken, for edge_connectivity was a cdef function in your patch, and you cannot link (in Sphinx) toward cdef functions. When yo uwork on the doc, you can compile it with --warn-links to spot broken links.
  • I shortened a bit the doc of some boost functions: some of the things you say in the doc would be more appropriate as comments in the code.
  • I renamed 'boost=True' to 'implementation="boost"', which is a bit more 'standard'. You may not know what 'boost' is, but you understand the meaning of a 'implementation' selector.

Again, don't hesitate to oppose and discuss any of the changes in this commit, as you are the reviewer of them. You are supposed to be as picky as we are when we review your code.

In my mind, the branch with this commit added is good to go, though perhaps the comments to the top of boost_graph.pyx could be made clearer. That's up to you.

Have fun, and thank you very much again for this addition. Now that this is written, it will be *much* easier to steal more boost code ;-)

Nathann

comment:49 Changed 6 years ago by git

  • Commit changed from 51a183a065af09d203be1f01857b4045a1afee4d to f0709aa5df62b6c266524099b9b6071b7c860bd0

Branch pushed to git repo; I updated commit sha1. New commits:

9063cdctrac #18564: Merged with 6.8.beta5
f0709aatrac #18564: Reviewer's commit

comment:50 follow-up: Changed 6 years ago by borassi

Hello! I have checked your code and included it into the current version. I have made a few modifications:

  • I have tried to clarify a bit more the comment at the beginning of boost_graph.pyx.
  • Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable." Thank you, Martin!
  • I have added some spaces at the bottom of boost_graph.pxd, so that different graph implementations are aligned.
  • Currently, our Boost code does not support edge labels. With the previous code, if we have used "implementation='boost',use_edge_labels=True" the edge labels would simply have been ignored without a warning. Now, the algorithm raises an error.
  • If num_edges()==0 and vertices=True, I have modified the output to return [0,[],[[],[]]], that is [edge_connectivity, edges, [first set of vertices, second set of vertices]].

Also for me, now the patch is good to go!

comment:51 Changed 6 years ago by git

  • Commit changed from f0709aa5df62b6c266524099b9b6071b7c860bd0 to fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f

Branch pushed to git repo; I updated commit sha1. New commits:

fe55f76Few more modifications

comment:52 in reply to: ↑ 50 ; follow-up: Changed 6 years ago by ncohen

Hello Michele,

  • I have tried to clarify a bit more the comment at the beginning of boost_graph.pyx.

Thanks! That's better.

  • Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable." Thank you, Martin!

Oh. Does that mean that there is "no bug" in boost because they never claimed that it worked? If so, several things need to be done: 1) Close the bug report on boost's trac, as it is invalid 2) Change the milestone of our 'stopgap' ticket #18753 to invalid/wontfix (and status to positive review) 3) Remove the stopgap warning from this branch's code 4) Remove any claim that boost's algorithm is unreliable for directed graphs, and claim instead that it is only avilable for undirected graphs 5) Raise an exception when 'boost' is requested to run on a directed graph.

  • Currently, our Boost code does not support edge labels. With the previous code, if we have used "implementation='boost',use_edge_labels=True" the edge labels would simply have been ignored without a warning. Now, the algorithm raises an error.

Good. I missed that :-/

  • If num_edges()==0 and vertices=True, I have modified the output to return [0,[],[[],[]]], that is [edge_connectivity, edges, [first set of vertices, second set of vertices]].

OKayyyyyyyyyyy.

Nathann

comment:53 in reply to: ↑ 52 ; follow-up: Changed 6 years ago by borassi

  • Martin showed me that the Boost edge connectivity code actually tells that the input graph must be undirected (Create a network flow graph out of the undirected graph FlowGraph flow_g(num_vertices(g))). Hence, I changed the stopgap sentence to "The edge connectivity of directed graphs is not implemented in Boost. The result may be mathematically unreliable." Thank you, Martin!

Oh. Does that mean that there is "no bug" in boost because they never claimed that it worked? If so, several things need to be done: 1) Close the bug report on boost's trac, as it is invalid 2) Change the milestone of our 'stopgap' ticket #18753 to invalid/wontfix (and status to positive review) 3) Remove the stopgap warning from this branch's code 4) Remove any claim that boost's algorithm is unreliable for directed graphs, and claim instead that it is only avilable for undirected graphs 5) Raise an exception when 'boost' is requested to run on a directed graph.

Well, not so fast! Martin told me that the edge connectivity is a work in progress, and that they might make it work on directed graph soon. Furthermore, I still think that the Boost code *has* a bug: the comment I mentioned is a line comment in the middle of the code, and a user does not necessarily read the whole code before using it! Furthermore, the comment does not say explicitly that they do not claim it works. For this reason, I have added a comment saying this in the Boost ticket, but I think we should leave the patch as it is, waiting for the final Boost edge connectivity!

What do you think?

Thank you very much,

Michele

comment:54 in reply to: ↑ 53 Changed 6 years ago by ncohen

  • Status changed from needs_review to positive_review

Well, not so fast! Martin told me that the edge connectivity is a work in progress, and that they might make it work on directed graph soon. Furthermore, I still think that the Boost code *has* a bug: the comment I mentioned is a line comment in the middle of the code, and a user does not necessarily read the whole code before using it! Furthermore, the comment does not say explicitly that they do not claim it works. For this reason, I have added a comment saying this in the Boost ticket, but I think we should leave the patch as it is, waiting for the final Boost edge connectivity!

What do you think?

That I trust your advice on this matter.

Good to go! (and thank you very much for this first 'boost' addition)

Nathann

comment:55 Changed 6 years ago by vbraun

  • Branch changed from u/borassi/boost_edge_connectivity to fe55f76ef07c3d38b52f7b361a7a9a6fd512be9f
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.