Opened 5 years ago
Closed 5 years ago
#17647 closed enhancement (fixed)
Branch and Bound for vertex separation
Reported by:  dcoudert  Owned by:  

Priority:  major  Milestone:  sage-6.5 
Component:  graph theory  Keywords:  
Cc:  ncohen  Merged in:  
Authors:  David Coudert  Reviewers:  Nathann Cohen 
Report Upstream:  N/A  Work issues:  
Branch:  dd12f60 (Commits)  Commit:  dd12f60ede78325f58b166096778249820307e5d 
Dependencies:  #17665  Stopgaps: 
Description
This patch implements a branch and bound algorithm for computing the vertex separation of directed graphs. It can be significantly faster than other methods already implemented. Furthermore, it can handle graphs with more than 31 vertices, but of course the computation time could be long.
sage: from sage.graphs.graph_decompositions import vertex_separation as VS
sage: D = digraphs.RandomDirectedGNP(30, .1)
sage: %time w,l = VS.vertex_separation(D); print w
4
CPU times: user 229 ms, sys: 345 ms, total: 574 ms
Wall time: 575 ms
sage: %time w,l = VS.vertex_separation_BAB(D); print w
4
CPU times: user 1.34 ms, sys: 32 µs, total: 1.37 ms
Wall time: 1.35 ms
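The branch and bound idea can be sketched in plain Python. This is only an illustrative toy, not the ticket's Cython code: `succ` maps each vertex to its set of out-neighbors, and the cost of a prefix is taken here as the number of its vertices that still have an out-neighbor outside the prefix (one common formulation of vertex separation).

```python
def vertex_separation_bab(succ):
    """Toy branch and bound for vertex separation.

    succ: dict mapping each vertex to the set of its out-neighbors.
    """
    vertices = list(succ)
    n = len(vertices)
    best = [n]  # the width of any ordering is at most n

    def cost(prefix):
        # vertices of the prefix with an out-neighbor outside of it
        return sum(1 for v in prefix if succ[v] - prefix)

    def rec(prefix, width):
        if width >= best[0]:
            return              # bound: cannot improve the best ordering
        if len(prefix) == n:
            best[0] = width     # complete ordering found
            return
        for v in vertices:
            if v not in prefix:
                prefix.add(v)   # branch: try v as the next vertex
                rec(prefix, max(width, cost(prefix)))
                prefix.remove(v)

    rec(set(), 0)
    return best[0]

# a directed 4-cycle has vertex separation 1 under this cost function
assert vertex_separation_bab({0: {1}, 1: {2}, 2: {3}, 3: {0}}) == 1
```

The real implementation adds greedy steps, bitset-encoded prefixes, and a pool of preallocated bitsets, but the pruning structure is the same.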
Change History (49)
comment:1 Changed 5 years ago by
 Branch set to u/dcoudert/17647
 Type changed from PLEASE CHANGE to enhancement
comment:2 Changed 5 years ago by
 Branch changed from u/dcoudert/17647 to public/17647
comment:3 Changed 5 years ago by
 Commit set to 9b5f33e7a67218dda91bed304b5f3e2f276b389b
comment:4 Changed 5 years ago by
 Cc ncohen added
 Description modified (diff)
 Status changed from new to needs_review
comment:5 Changed 5 years ago by
A first question: you implement a graph data structure with bitsets... but what about using this instead?
http://www.sagemath.org/doc/reference/graphs/sage/graphs/base/static_dense_graph.html
Nathann
comment:6 Changed 5 years ago by
The binary_matrix_t data structure encodes the neighborhood of a vertex using a specific bitset, that is, without using the type bitset_t. Consequently, it is not easy to use the operations already implemented for bitset_t (union, intersection, difference, etc.) when using binary_matrix_t. I would have to encapsulate the rows into an instance of bitset_t before each operation. This is certainly doable. However, I don't know how safe it is to cast an unsigned long * into a mp_limb_t *. This is the main problem.
comment:7 Changed 5 years ago by
HMmmm... Agreed, but we can't keep on adding graph data structures in this graph_decomposition/ folder when we have a base/ directory where everything else is already.
Let's not think about this now. I'll do a review first.
comment:8 Changed 5 years ago by
Actually, static dense graph is used only in independent_sets.pyx, right? And in this code, we can see many casts to mp_limb_t *.
I think we could, in another patch, change the static dense graphs to use my data structure. This way we could clean the independent set code (remove all cast operations) and enable direct access to bitset operations.
comment:9 Changed 5 years ago by
 Commit changed from 9b5f33e7a67218dda91bed304b5f3e2f276b389b to 9aaf80ef0d8dea032d3cda04537fb8a42e1e3311
Branch pushed to git repo; I updated commit sha1. New commits:
9aaf80e  trac #17647: Reviewer's commit

comment:10 followup: ↓ 11 Changed 5 years ago by
Hello David,
Here are some superficial changes. It's faster to implement them than to write a long comment. Fix anything that you don't agree with.
I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. They can cost a lot, these things. Do you see a clean way to avoid that ?
Also, I did some profiling on your code and it seems that the most time-consuming operation is the call to popcount. I do not know where it is needed yet, but if you know how to work around that you would probably gain some performance.
Tell me what you think,
Nathann
comment:11 in reply to: ↑ 10 ; followup: ↓ 13 Changed 5 years ago by
Here are some superficial changes. It's faster to implement them than to write a long comment. Fix anything that you don't agree with.
Nice. I have just fixed a small typo.
I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. They can cost a lot, these things. Do you see a clean way to avoid that ?
I already thought about it and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each and access the bitsets of level *level*, since the BAB has depth at most n. This way we avoid multiple allocations. If you think it is OK, I can implement it.
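The "allocate once, index by recursion depth" layout proposed here can be mimicked in plain Python, with sets standing in for bitsets; the helper names are hypothetical:

```python
def make_pool(n, kinds=5):
    # allocate all 5 * n scratch "bitsets" once, before the recursion starts
    return [set() for _ in range(kinds * n)]

def bitsets_at_level(pool, n, level, kinds=5):
    # the 5 scratch bitsets reserved for recursion depth `level`:
    # no allocation happens inside the recursive calls themselves
    return [pool[k * n + level] for k in range(kinds)]

pool = make_pool(4)
a, b, c, d, e = bitsets_at_level(pool, 4, 0)
a.add(3)
assert 3 in pool[0]  # `a` aliases the pool entry for kind 0, level 0
```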
Also, I did some profiling on your code and it seems that the most time-consuming operation is the call to popcount. I do not know where it is needed yet, but if you know how to work around that you would probably gain some performance.
What's the magic command to profile this code? %prun is not working here.
The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which is expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)
Thanks, David.
comment:12 Changed 5 years ago by
 Commit changed from 9aaf80ef0d8dea032d3cda04537fb8a42e1e3311 to 0504e8f1946309d9784c4837275c919b047f9829
Branch pushed to git repo; I updated commit sha1. New commits:
0504e8f  trac #17647: small typo

comment:13 in reply to: ↑ 11 ; followup: ↓ 14 Changed 5 years ago by
Nice. I have just fixed a small typo.
Thanks
I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. They can cost a lot, these things. Do you see a clean way to avoid that ?
I already thought about it and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each and access the bitsets of level *level*, since the BAB has depth at most n. This way we avoid multiple allocations. If you think it is OK, I can implement it.
Using bitsets, that would mean dozens of allocation lines again. What is it that you use in those fast digraphs of yours that is not in the 'binary matrix' type ? Possibly the best is to implement it there and use the binary matrices?
What's the magic command to profile this code? %prun is not working here.
Run a long command, then type 'perf top' in a console. You'll love it.
The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which is expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)
I will probably write a "profiling tutorial" soon. It would be cool if you were willing to review it. That will be helpful, along with the "interface C code with Sage" tutorial.
Nathann
comment:14 in reply to: ↑ 13 ; followup: ↓ 15 Changed 5 years ago by
I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. They can cost a lot, these things. Do you see a clean way to avoid that ?
I already thought about it and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each and access the bitsets of level *level*, since the BAB has depth at most n. This way we avoid multiple allocations. If you think it is OK, I can implement it.
Using bitsets, that would mean dozens of allocation lines again.
It should be something like that:

cdef bitset_t * bitset_pool = <bitset_t *>sage_malloc(5 * n * sizeof(bitset_t))
for i from 0 <= i < 5*n:
    bitset_init(bitset_pool[i], n)
...
...
for i from 0 <= i < 5*n:
    bitset_free(bitset_pool[i])
sage_free(bitset_pool)
What is it that you use in those fast digraphs of yours that are not in the 'binary matrix' type ? Possibly the best is to implement it there and use the binary matrices?
I'm using extensively bitset_union, bitset_difference, bitset_discard, bitset_add, bitset_len, bitset_first.
What you propose is to recode what has already been properly implemented for bitsets into the binary matrix type. For instance, in module static_dense_graph.pyx, we have code like that:

# The intersection of the common neighbors of i and j is a AND of
# their respective rows. A popcount then returns its cardinality.
inter = 0
for l in range(m.width):
    inter += __builtin_popcountl(m.rows[i][l] & m.rows[j][l])

Seriously, it would be much cleaner to use bitsets for rows and to call bitset_len.
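With Python integers standing in for rows/bitsets, the AND-then-popcount computation discussed here reads as follows (the row values are illustrative only):

```python
# rows of an adjacency matrix as integers: bit k set = k is a neighbor
row_i = 0b101101   # neighbors 0, 2, 3, 5
row_j = 0b100111   # neighbors 0, 1, 2, 5

# intersection of the two rows is a bitwise AND; popcount gives its size
inter = bin(row_i & row_j).count("1")
assert inter == 3  # common neighbors 0, 2 and 5

# the bitset operations mentioned in this thread map just as directly
union = row_i | row_j     # bitset_union
diff = row_i & ~row_j     # bitset_difference
assert bin(diff).count("1") == 1  # neighbor 3 is only in row_i
```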
As I said previously, my suggestion is more to change the static_dense_graph data structure to use directly bitset_t for rows.
What's the magic command to profile this code? %prun is not working here.
Run a long command, then type 'perf top' in a console. You'll love it.
Don't have it on OSX :(
The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which is expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)

I will probably write a "profiling tutorial" soon. It would be cool if you were willing to review it. That will be helpful, along with the "interface C code with Sage" tutorial.
According to some discussions on the mailing lists, we have some profiling experts who could be of more help than me. I can have a look anyway to tell if it is at dummy's level ;)
David.
comment:15 in reply to: ↑ 14 Changed 5 years ago by
should be something like that

cdef bitset_t * bitset_pool = <bitset_t *>sage_malloc(5 * n * sizeof(bitset_t))
for i from 0 <= i < 5*n:
    bitset_init(bitset_pool[i], n)
...
...
for i from 0 <= i < 5*n:
    bitset_free(bitset_pool[i])
sage_free(bitset_pool)
In a design like that there is a problem if some bitset_init fails. But well. Yeah, then it would probably be cleaner anyway. Good idea.
# The intersection of the common neighbors of i and j is a AND of
# their respective rows. A popcount then returns its cardinality.
inter = 0
for l in range(m.width):
    inter += __builtin_popcountl(m.rows[i][l] & m.rows[j][l])

Seriously, it would be much cleaner to use bitsets for rows and to call bitset_len.
With bitsets you cannot do this very computation without allocating a third bitset.
Okay, let's say that it is not important: could you turn/extend your fast_digraph into a bitset matrix ? You can overwrite the current one if you like, nobody but me uses it anyway. This way it would work with bitsets, and you could allocate this matrix of bitsets in one instruction without having to handle the memory allocation problems in your branch and bound code.
As I said previously, my suggestion is more to change the static_dense_graph data structure to use directly bitset_t for rows.
We agree if you actually talk of changing the 'binary_matrix' type, which is then used by the static dense graph stuff.
Run a long command, then type 'perf top' in a console. You'll love it.
Don't have it on OSX :(
O_o
Try it on a real machine. That will give you a reason to install some real OS :P
According to some discussions on the mailing lists, we have some profiling experts who could be of more help than me. I can have a look anyway to tell if it is at dummy's level ;)
The problem with 'profiling experts' is that they don't review those patches, and if nobody does it everybody loses.
Nathann
comment:16 Changed 5 years ago by
 Commit changed from 0504e8f1946309d9784c4837275c919b047f9829 to fdef6749dd23929f9903937a82c11e52ea7ddb28
Branch pushed to git repo; I updated commit sha1. New commits:
fdef674  trac 17647: use a pool of betsets to avoid mallocs in recursive function

comment:17 followup: ↓ 18 Changed 5 years ago by
Hello,
I'm now using a pool of bitsets. I had to use type bitset_s *, but it's working and indeed a bit faster (try with G = graphs.MycielskiGraph(5)).
About changing the static_dense_graph structure to use directly bitsets, should I do it in this patch or in another patch?
It's a pity I don't have this perf top command on mac. In fact it is not installed on my linux desktop :(
Best, David.
comment:18 in reply to: ↑ 17 ; followup: ↓ 19 Changed 5 years ago by
Hello,
I'm now using a pool of bitsets. I had to use type bitset_s *, but it's working and indeed a bit faster (try with G = graphs.MycielskiGraph(5)).
Good news !
About changing the static_dense_graph structure to use directly bitsets, should I do it in this patch or in another patch?
Could you do it in another patch, then rebase this ticket on top of it ? This way we will not have to discuss code that will be changed afterwards.
It's a pity I don't have this perf top command on mac. In fact it is not installed on my linux desktop :(
Did you try it on a linux machine ? It really tells you all you want to know.
Nathann
comment:19 in reply to: ↑ 18 ; followup: ↓ 20 Changed 5 years ago by
About changing the static_dense_graph structure to use directly bitsets, should I do it in this patch or in another patch?

Could you do it in another patch, then rebase this ticket on top of it ? This way we will not have to discuss code that will be changed afterwards.
OK, I will work on it. I will certainly need your help to understand the procedure to "rebase this ticket on top of the other one" ;)
It's a pity I don't have this perf top command on mac. In fact it is not installed on my linux desktop :(
Did you try it on a linux machine ? It really tells you all you want to know.
I don't have it installed on my linux desktop(s), and since I don't know the package name, yum returns hundreds of possible packages containing the keyword "perf".
D.
comment:20 in reply to: ↑ 19 Changed 5 years ago by
Yo!
OK, I will work on it. I will certainly need your help to understand the procedure to "rebase this ticket on top of the other one" ;)
Easy. Let us say that this ticket is a branch A. Once the other branch will be written, name it B.
git checkout A
git rebase -i B
After this your branch A will be above B.
I don't have it installed on my linux desktop(s), and since I don't know the package name, yum returns hundreds of possible packages containing the keyword "perf".
In Debian it is in the package linux-tools.
Nathann
comment:21 Changed 5 years ago by
 Commit changed from fdef6749dd23929f9903937a82c11e52ea7ddb28 to 2ad04bc49191f533041f095284a8a6ae29ab8f69
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
283e45e  trac #17665: use type Py_ssize_t and change loops

da02c12  trac #17665: fix reviewer's comments

ab26af5  17647: add FastDiGraph_bitset data structure

6316afc  17647: implement the branch and bound

621af44  17647: add documentation to the module

f6d075b  trac #17647: Reviewer's commit

c9f43e5  trac #17647: small typo

99c2cac  trac 17647: use a pool of betsets to avoid mallocs in recursive function

30f80f1  trac #17647: rebase on 17665 to use binary_matrix

2ad04bc  trac #17647: fix merge problems

comment:22 followup: ↓ 23 Changed 5 years ago by
Nathann,
I tried to do what you proposed. I don't know if the result is correct or not with regard to git. If it is not the case, please help me go back to the state before the rebase.
I should certainly add a dependency with ticket #17665, right?
David.
comment:23 in reply to: ↑ 22 Changed 5 years ago by
Hello,
I tried to do what you proposed. I don't know if the result is correct or not with regards git.
It is not. If you click on 'commits' right next to the branch, you will see that some commits appear twice (like 'trac #17647: small typo')
If it is not the case, please help me going back to the state before the rebase.
It is very difficult for me to tell you what to do now, considering that I have no idea how you ended up with this branch in the first place. It seems that all you did wrong was to merge two branches together in the end (you merged a branch with a copy of that branch, apparently). So I would say that all you need to do is to go back before that happened, i.e.:
git checkout 30f80f1c4158b512310cde25b32d311a92220e2e  # go back to 'trac #17647: rebase on 17665 to use binary_matrix'
git push trac HEAD:public/17647 -f                     # update the distant branch
Please double-check that, for it will erase commits from the trac server.
I should certainly add a dependency with ticket #17665, right?
Yes.
Nathann
comment:24 Changed 5 years ago by
 Commit changed from 2ad04bc49191f533041f095284a8a6ae29ab8f69 to 30f80f1c4158b512310cde25b32d311a92220e2e
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
comment:26 Changed 5 years ago by
 Dependencies changed from 17665 to #17665
comment:27 Changed 5 years ago by
Given that this branch contains many commits (some of which undo what is done by the previous ones), I merged them all into one, available at public/17647b.
Nathann
comment:28 followup: ↓ 29 Changed 5 years ago by
Shouldn't you change the branch name in the ticket?
comment:29 in reply to: ↑ 28 Changed 5 years ago by
 Branch changed from public/17647 to public/17647b
 Commit changed from 30f80f1c4158b512310cde25b32d311a92220e2e to fdf3c001ccf9deb6fba3653532f762f4c7c39788
Shouldn't you change the branch name in the ticket?
Well, I usually let the author do that if he likes the idea ^^;
Nathann
New commits:
fdf3c00  trac #17647: Branch and Bound for vertex separation

comment:30 Changed 5 years ago by
If it's more "clean", I like the idea ;) Thanks.
comment:31 followup: ↓ 32 Changed 5 years ago by
 Status changed from needs_review to needs_work
Hello!
A first pass:
- Some modifications that you did to the header of fast_digraph.pyx should be undone now that you use binary_matrix.
- I do not know whether positions is actually needed in this code, but I still do not understand it totally. It seems that you enumerate all permutations, cutting branches from time to time: the code is in principle very similar to the bandwidth code that was recently included. That code does not need such an array: do you think that it could also be removed from here?
- In _my_invert_positions: you cannot in C, but in Cython you can do a,b = b,a.
- best_seq has no reason to be a C array. If you turn it into a Python list, you will remove several lines of alloc/dealloc code.
- Not worth changing it now, but try to use :class:`class_name` instead of ``class_name`` in the doc.
- Can you document the parameters taken by vertex_separation_BAB_C? bm_pool is not that self-explaining (what are its dimensions? its role?)
- Do you really need both current_prefix and current_other? They seem to be the complement of each other at all times.
- What about removing "current" from loc_b_current_prefix, loc_b_current_neighborhood, loc_b_current_other, b_current_prefix, b_current_neighborhood, b_current_other, current_prefix? It seems to me that b_prefix, b_prefix_neighborhood, b_nonprefix would be equally meaningful and 'lighter' to read?
- It seems that current_cost is the cost of best_seq. What about best_cost?
- The list comprehension

    order = [int_to_vertex[best_seq[i]] for i in range(n)]

  is better than

    cdef list order = list()
    for 0 <= i < n:
        order.append(int_to_vertex[best_seq[i]])

- Instead of somemore, use a break if select_it is False.
Nathann
comment:32 in reply to: ↑ 31 ; followup: ↓ 35 Changed 5 years ago by
Replying to ncohen:
Hello!
A first pass:
Some modifications that you did to the header of fast_digraph.pyx should be undone now that you use binary_matrix.
done
I do not know whether positions is actually needed in this code, but I still do not understand it totally. It seems that you enumerate all permutations, cutting branches from time to time: the code is in principle very similar to the bandwidth code that was recently included. That code does not need such an array: do you think that it could also be removed from here?
I don't see how.
This code is different from the bandwidth one because I have the greedy steps.
Using the positions array, I know exactly where the vertices are in the array current_prefix (now prefix). This way I don't have to undo the permutations when I backtrack or after a recursive call.
So yes, it is possible to get rid of this array, but the price is to undo all permutations, which is particularly painful with the greedy steps.
I definitely prefer my solution.
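For illustration only (names hypothetical), here is the bookkeeping described above in plain Python: prefix holds the current ordering and positions its inverse permutation, so a vertex can be moved with a single swap and nothing has to be undone afterwards beyond the mirror swap.

```python
def swap_to_position(prefix, positions, v, target_pos):
    """Put vertex v at index target_pos of `prefix` with one swap,
    keeping `positions` (the inverse permutation) consistent."""
    source_pos = positions[v]
    w = prefix[target_pos]
    prefix[target_pos], prefix[source_pos] = prefix[source_pos], prefix[target_pos]
    positions[v], positions[w] = positions[w], positions[v]

prefix = [0, 1, 2, 3]
positions = [0, 1, 2, 3]
swap_to_position(prefix, positions, 3, 1)
assert prefix == [0, 3, 2, 1]
# positions still inverts prefix at all times
assert all(prefix[positions[v]] == v for v in range(4))
```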
In _my_invert_positions: you cannot in C, but in Cython you can do a,b = b,a.
I don't see how to use it here.
best_seq has no reason to be a C array. If you turn it into a Python list, you will remove several lines of alloc/dealloc code.

But if it is a Python list, method vertex_separation_BAB_C has to return both the value and that list. How would that be more efficient than passing a pointer and updating the content of the array when needed ?
Not worth changing it now, but try to use :class:`class_name` instead of ``class_name`` in the doc.

Can you document the parameters taken by vertex_separation_BAB_C? bm_pool is not that self-explaining (what are its dimensions? its role?)
I added some documentation. Let me know if it is enough.
Do you really need both current_prefix and current_other? They seem to be the complement of each other at all times.

Right, current_other is not used at all. Removed.
What about removing "current" from loc_b_current_prefix, loc_b_current_neighborhood, loc_b_current_other, b_current_prefix, b_current_neighborhood, b_current_other, current_prefix? It seems to me that b_prefix, b_prefix_neighborhood, b_nonprefix would be equally meaningful and 'lighter' to read?

I have removed current_ from many variables.
It seems that current_cost is the cost of best_seq. What about best_cost?

current_cost is the cost of the current prefix. Therefore, I think the name is appropriate.
order = [int_to_vertex[best_seq[i]] for i in range(n)]

is better than

cdef list order = list()
for 0 <= i < n:
    order.append(int_to_vertex[best_seq[i]])
Changed
Instead of somemore, use a break if select_it is False.

I had to find another way since I had a for loop inside a while loop. Furthermore, the condition "if select_it is False then break" is incorrect in this case.
Now I have removed the for loop and I only have the while loop, without break.
Thanks, David.
comment:33 Changed 5 years ago by
 Commit changed from fdf3c001ccf9deb6fba3653532f762f4c7c39788 to fdfe120bc438b041a7ff480a288f29d6c73b17e5
Branch pushed to git repo; I updated commit sha1. New commits:
a267cea  trac #17647: undo changes on fast_digraph.pyx

e1096d2  trac #17647: simplify loop of greedy steps

07da3cd  trac #17647: documents input parameters of vertex_separation_BAB_C

bd2000f  trac #17647: remove useless variable current_other and update the code accordingly

fdfe120  trac #17647: simplification of the names of some variables

comment:34 Changed 5 years ago by
 Status changed from needs_work to needs_review
comment:35 in reply to: ↑ 32 Changed 5 years ago by
Hello,
In _my_invert_positions: you cannot in C, but in Cython you can do a,b = b,a.

I don't see how to use it here.
source_pos = positions[i]
j = prefix[target_pos]
prefix[target_pos], prefix[source_pos] = prefix[source_pos], prefix[target_pos]
positions[i], positions[j] = positions[j], positions[i]

It could be made simpler if the code of _my_invert_positions was taking two positions (or two vertices) as argument instead of 'one of each type'.
But if it is a Python list, method vertex_separation_BAB_C has to return both the value and that list.

e = [1,2,3]
def invert_two_entries(l):
    l[0],l[1] = l[1],l[0]
invert_two_entries(e)
print e  # prints [2, 1, 3]
How would that be more efficient than passing a pointer and updating the content of the array when needed ?
It is not about efficiency, it is about not allocating/deallocating something if it can be avoided. This table is updated very rarely, so it does not have to be a C array.
Now I have removed the for loop and I only have the while without break.
Great! I'll resume the review.
Nathann
comment:36 followup: ↓ 37 Changed 5 years ago by
 Status changed from needs_review to needs_work
Helloooooo again,
- What is the purpose of lower_bound? The documentation says that the code is trying to find an ordering whose cost is *lower* than lower_bound? On what exactly is it a lower bound?
Do you mean that the code will "return any ordering whose cost is lower than lower_bound", even if it has no proof that it is optimal?
- Do you think that you actually need loc_b_neighborhood? It feels like all you need is what you store in bits_tmp, i.e. the union of the prefix and the neighborhood. And you could drop one of your two tmp bitsets.
- Are those two lines needed?

    delta_i = max(current_cost, delta_i)
    if delta_i < upper_bound:

It seems that you can already suppose that current_cost < upper_bound, and you checked several lines above that delta_i < upper_bound.
- I do not agree with this comment

    if upper_bound <= lower_bound:
        # Either we have optimal solution, or we are satisfied with
        # current solution.

it seems that this code only deals with the case when 'we are satisfied with the current solution'.
- Is there any reason why you did not expose your function in the main vertex_separation function? If you do, is there any reason why it should not be the default one? (if it is faster?..) Finally, note that if you do make it the default, some doctests of your functions will have to be updated, as you would be comparing the BAB function with vertex_separation (which would also call the BAB).
With those comments and those from my previous message, I would say that my review is finished. We only have to settle those last remarks.
Thanks for your work!
Nathann
comment:37 in reply to: ↑ 36 ; followup: ↓ 40 Changed 5 years ago by
Replying to ncohen:
Hello,
In _my_invert_positions: you cannot in C, but in Cython you can do a,b = b,a.

I don't see how to use it here.

source_pos = positions[i]
j = prefix[target_pos]
prefix[target_pos], prefix[source_pos] = prefix[source_pos], prefix[target_pos]
positions[i], positions[j] = positions[j], positions[i]

It could be made simpler if the code of _my_invert_positions was taking two positions (or two vertices) as argument instead of 'one of each type'.
I have changed the method and updated the calls to it.
But if it is a Python list, method vertex_separation_BAB_C has to return both the value and that list.

e = [1,2,3]
def invert_two_entries(l):
    l[0],l[1] = l[1],l[0]
invert_two_entries(e)
print e  # prints [2, 1, 3]

How would that be more efficient than passing a pointer and updating the content of the array when needed ?

It is not about efficiency, it is about not allocating/deallocating something if it can be avoided. This table is updated very rarely, so it does not have to be a C array.
Apparently you care a lot about reducing mallocs.
I changed it.

Replying to ncohen:
Helloooooo again,
What is the purpose of lower_bound? The documentation says that the code is trying to find an ordering whose cost is *lower* than lower_bound? On what exactly is it a lower bound?
Do you mean that the code will "return any ordering whose cost is lower than lower_bound", even if it has no proof that it is optimal?
Exactly.
With parameter upper_bound, you ask for a solution with width less than upper_bound, if such a solution exists.
With parameter lower_bound, you say that you are satisfied with any solution with width less than this bound. This is useful in particular when you split the input digraph into strongly connected components and solve the problem on each of them. If one component has width k, you don't care whether another component has width k-1, since you know that the vertex separation of the entire digraph is at least k.
This will be useful when we will start adding preprocessing.
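This cut-off logic can be sketched as follows. Everything here is hypothetical scaffolding: solve_component stands in for the per-component branch and bound, and components are represented only by their true widths. With a cut_off, the solver is allowed to stop at the first solution whose width does not exceed it, which cannot change the overall maximum.

```python
def solve_component(true_width, cut_off):
    # stand-in for the branch and bound on one strongly connected
    # component: with a cut_off it may stop early and return any width
    # w with true_width <= w <= cut_off; we model the worst case here
    return cut_off if true_width <= cut_off else true_width

def vertex_separation_by_scc(component_widths):
    # the vertex separation of a digraph is the maximum over its
    # strongly connected components, so a component whose width is at
    # most the current best never needs to be solved exactly
    best = 0
    for w in component_widths:
        best = max(best, solve_component(w, cut_off=best))
    return best

assert vertex_separation_by_scc([2, 1, 4, 3]) == 4
```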
Do you think that you actually need loc_b_neighborhood? It feels like all you need is what you store in bits_tmp, i.e. the union of the prefix and the neighborhood. And you could drop one of your two tmp bitsets.

I succeeded in removing it. I have changed the names of some variables and updated the size of bm_pool accordingly.
Are those two lines needed?

    delta_i = max(current_cost, delta_i)
    if delta_i < upper_bound:

It seems that you can already suppose that current_cost < upper_bound, and you checked several lines above that delta_i < upper_bound.

If we reduce the upper bound during the recursive call on the k-th element of the list, we want to prune (cut) branches that are no longer useful. So yes, these lines are needed.
I do not agree with this comment

    if upper_bound <= lower_bound:
        # Either we have optimal solution, or we are satisfied with
        # current solution.

it seems that this code only deals with the case when 'we are satisfied with the current solution'.

I changed the sentence to "We are satisfied with the current solution".
Is there any reason why you did not expose your function in the main vertex_separation function? If you do, is there any reason why it should not be the default one? (if it is faster?..) Finally, note that if you do make it the default, some doctests of your functions will have to be updated, as you would be comparing the BAB function with vertex_separation (which would also call the BAB).
Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.
- This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
- I plan to add preprocessing rules. The first one is to decompose the input digraph into strongly connected components, but I have others, more complex. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.
With those comments and those from my previous message, I would say that my review is finished. We only have to settle those last remarks.
Thanks for your work!
Nathann
I hope I have answered all your comments ;)
David.
comment:38 Changed 5 years ago by
 Commit changed from fdfe120bc438b041a7ff480a288f29d6c73b17e5 to 2f0bf7a386eff5ff28a9ce87106ab50ab5c5488a
comment:39 Changed 5 years ago by
 Reviewers set to Nathann Cohen
 Status changed from needs_work to needs_review
comment:40 in reply to: ↑ 37 ; followup: ↓ 41 Changed 5 years ago by
Hello,
Apparently you care a lot about reducing mallocs.

Having short code that only does what is needed helps a lot when you try to understand somebody else's code.
Do you mean that the code will "return any ordering whose cost is lower than lower_bound", even if it has no proof that it is optimal?

Exactly.

Okay, I understand. That helps when you split your first graph into subproblems. This being said, the term lower_bound seems to indicate that the user can provide a lower bound on the optimal value of the graph, which confused me when I read code that dealt with returning any "solution below the lower bound". What would you think of renaming it to something like "accept_any_below=4"? That's the best keyword I found, but if you have in mind something which could represent a bit better what it does... I find this name confusing.
Are those two lines needed?

    delta_i = max(current_cost, delta_i)
    if delta_i < upper_bound:

If we reduce the upper bound during the recursive call on the k-th element of the list, we want to prune (cut) branches that are no longer useful. So yes, these lines are needed.

Oh, I see. I had not noticed that upper_bound was updated at every loop. Consequently, wouldn't it be better to do something like that instead?

    if delta_i >= upper_bound:
        break
    <continue exploring>
Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.
- This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
- I plan to add preprocessing rules. The first one is to decompose the input digraph into strongly connected components, but I have others, more complex. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.
Okay, as you like. As long as you plan to do it in the short future, you can manage the patches as you like.
Nathann_from_Nantes
comment:41 in reply to: ↑ 40 ; followup: ↓ 43 Changed 5 years ago by
Do you mean that the code will "return any ordering whose cost is lower than lower_bound", even if it has no proof that it is optimal?

Exactly.

Okay, I understand. That helps when you split your first graph into subproblems. This being said, the term lower_bound seems to indicate that the user can provide a lower bound on the optimal value of the graph, which confused me when I read code that dealt with returning any "solution below the lower bound". What would you think of renaming it to something like "accept_any_below=4"? That's the best keyword I found, but if you have in mind something which could represent a bit better what it does... I find this name confusing.
It is used for both purposes: if a lower bound is known, the user can provide it to the BAB, which stops as soon as a solution reaches this bound. It can also be used to say "I'm happy with any solution with cost less than k".
I propose to rename it cut_off, as for shortest path algorithms.
I have updated the doc and tests accordingly.
Are those two lines needed?

    delta_i = max(current_cost, delta_i)
    if delta_i < upper_bound:

If we reduce the upper bound during the recursive call on the k-th element of the list, we want to prune (cut) branches that are no longer useful. So yes, these lines are needed.

Oh, I see. I had not noticed that upper_bound was updated at every loop. Consequently, wouldn't it be better to do something like that instead?

    if delta_i >= upper_bound:
        break
    <continue exploring>
Right, done.
Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.
- This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
- I plan to add preprocessing rules. The first one is to decompose the input digraph into strongly connected components, but I have others, more complex. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.
Okay, as you like. As long as you plan to do it in the short future, you can manage the patches as you like.
Nathann_from_Nantes
Let us know if you meet Lulu ;)
Thank you again for the review.
David.
comment:42 Changed 5 years ago by
 Commit changed from 2f0bf7a386eff5ff28a9ce87106ab50ab5c5488a to 0bd12bf4ce6c006d60aecaaad042878052d9d395
Branch pushed to git repo; I updated commit sha1. New commits:
0bd12bf  trac #17647: review commit

comment:43 in reply to: ↑ 41 Changed 5 years ago by
 Status changed from needs_review to positive_review
Yoooooooooooo !
Thank you again for the review.
And thank you for the code ;)
Nathann
comment:44 Changed 5 years ago by
Youhou!
comment:45 Changed 5 years ago by
 Status changed from positive_review to needs_work
PDF docs don't build
comment:46 Changed 5 years ago by
 Commit changed from 0bd12bf4ce6c006d60aecaaad042878052d9d395 to dd12f60ede78325f58b166096778249820307e5d
Branch pushed to git repo; I updated commit sha1. New commits:
dd12f60  trac #17647: resolve latex compilation error

comment:48 Changed 5 years ago by
 Status changed from needs_review to positive_review
comment:49 Changed 5 years ago by
 Branch changed from public/17647b to dd12f60ede78325f58b166096778249820307e5d
 Resolution set to fixed
 Status changed from positive_review to closed