Opened 9 years ago

Closed 9 years ago

#12743 closed enhancement (fixed)

Addition of reduction rules as pre-processing of the vertex cover function

Reported by: dcoudert Owned by: jason, ncohen, rlm
Priority: major Milestone: sage-5.1
Component: graph theory Keywords:
Cc: Merged in: sage-5.1.beta0
Authors: David Coudert Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by dcoudert)

This patch does the following

  1. Moves the vertex_cover function from generic_graph.py to graph.py. The function is valid only for graphs and the independent_set function was already in graph.py.
  2. Simplifies the independent_set function so that it calls the vertex_cover function.
  3. Adds basic reduction rules as a pre-processing of the vertex cover function. These reduction rules are the 4 basic rules described for instance in http://people.scs.carleton.ca/~dehne/projects/clustalxp/ACFLSS.pdf, that is:
    1. Vertices of degree 0 are not part of the cover
    2. The neighbor of a vertex of degree 1 is in the cover
    3. If the neighbors of a degree 2 vertex are incident, they are in the cover
    4. If the two neighbors v,w of a degree 2 vertex u are not incident, then either u is in the cover or v and w are in the cover.

This is particularly useful for sparse graphs.

More advanced reduction rules can also be considered (crown decompositions, etc.) but are not part of this patch.

APPLY:

Attachments (5)

trac_12743_reduction_rules_for_vertex_cover.patch (20.9 KB) - added by dcoudert 9 years ago.
trac_12743-review2.patch (2.7 KB) - added by dcoudert 9 years ago.
trac_12743-review.patch (9.1 KB) - added by ncohen 9 years ago.
trac_12743-doctest.patch (2.8 KB) - added by ncohen 9 years ago.
trac_12743-combined.patch (22.2 KB) - added by dcoudert 9 years ago.

Download all attachments as: .zip

Change History (34)

comment:1 Changed 9 years ago by dcoudert

  • Description modified (diff)

comment:2 Changed 9 years ago by dcoudert

  • Description modified (diff)
  • Status changed from new to needs_review

comment:3 Changed 9 years ago by dcoudert

Some running time examples:

sage: G = graphs.RandomTree(100)
sage: %time G.vertex_cover()
CPU times: user 0.21 s, sys: 0.00 s, total: 0.21 s
Wall time: 0.21 s
set([0, 1, 2, 5, 6, 9, 11, 12, 14, 17, 18, 19, 22, 23, 28, 29, 31, 41, 44, 46, 47, 51, 54, 56, 57, 60, 61, 63, 64, 66, 67, 70, 72, 74, 75, 76, 79, 80, 81, 82, 85, 88, 92, 96, 98])
sage: %time G.vertex_cover(reduction_rules = True)
CPU times: user 0.04 s, sys: 0.00 s, total: 0.04 s
Wall time: 0.04 s
set([1, 2, 5, 6, 7, 8, 9, 11, 12, 13, 14, 17, 18, 19, 21, 22, 27, 39, 41, 44, 47, 51, 53, 54, 55, 56, 60, 61, 62, 63, 65, 67, 71, 72, 74, 81, 82, 85, 88, 89, 92, 93, 95, 96, 98])
sage: G = graphs.GridGraph([10,10])
sage: %time G.vertex_cover()
CPU times: user 0.19 s, sys: 0.00 s, total: 0.19 s
Wall time: 0.19 s
set([(7, 3), (1, 3), (9, 1), (4, 8), (2, 8), (8, 0), (7, 7), (4, 6), (2, 6), (5, 1), (6, 2), (3, 7), (4, 0), (3, 1), (5, 5), (0, 6), (6, 6), (4, 4), (1, 5), (2, 2), (8, 6), (5, 3), (1, 1), (9, 7), (6, 4), (0, 0), (8, 2), (7, 1), (9, 3), (6, 0), (3, 5), (5, 9), (7, 5), (1, 9), (0, 4), (0, 8), (3, 3), (6, 8), (5, 7), (9, 9), (4, 2), (0, 2), (2, 0), (8, 8), (7, 9), (1, 7), (9, 5), (3, 9), (2, 4), (8, 4)])
sage: %time G.vertex_cover(reduction_rules = True)
CPU times: user 0.17 s, sys: 0.00 s, total: 0.17 s
Wall time: 0.17 s
set([(7, 8), (6, 9), (3, 0), (9, 8), (3, 2), (0, 7), (8, 9), (1, 6), (9, 4), (2, 5), (8, 5), (0, 3), (7, 2), (1, 2), (7, 4), (9, 0), (6, 7), (2, 9), (8, 1), (7, 6), (6, 3), (5, 6), (5, 8), (3, 6), (4, 1), (0, 1), (5, 4), (5, 0), (4, 5), (1, 4), (0, 5), (2, 1), (8, 7), (4, 9), (1, 0), (9, 6), (6, 5), (2, 7), (8, 3), (7, 0), (4, 7), (9, 2), (5, 2), (6, 1), (3, 8), (1, 8), (4, 3), (0, 9), (2, 3), (3, 4)])
sage: G = graphs.RandomGNP(200,0.01)
sage: %time G.vertex_cover(value_only = True)
CPU times: user 0.75 s, sys: 0.00 s, total: 0.75 s
Wall time: 0.76 s
72
sage: %time G.vertex_cover(reduction_rules = True, value_only=True)
CPU times: user 0.10 s, sys: 0.00 s, total: 0.10 s
Wall time: 0.10 s
72

comment:4 Changed 9 years ago by dcoudert

  • Description modified (diff)

comment:5 Changed 9 years ago by dcoudert

Removal of some trailing white spaces identified by patchbot.

comment:6 follow-up: Changed 9 years ago by ncohen

  • Status changed from needs_review to needs_work

Hellooooooooooo !!!

A few comments about this patch :

  • Why do you check in independent_set whether the value given by "algorithm" is good ? vertex_cover is already dealing with that, so if you forward something else the function already raises an exception
  • I hesitated myself between returning lists or sets but now that all graph functions return lists I guess it is best to stick with it.. Wrapping the return value with a list() looks best :-p
  • Gosh your docstrings are now totally spotless.. This, or we make the same mistakes, but they look totally clean to me :-)
  • Variable rules_are_used initialized to True. What about using reduction_rules instead, as it alrady exists ?
  • As it is, if the given graph is a tree you compute n/2 times the list of vertices of degree 1. That's a bit too much. I think that
    V = [ u for u in g.vertices() if g.degree(u) == 1 ]
    
    Should become
    V = self.cores(2)[0]
    
    The advantage is that this core decomposition is recursive, so after the for loop there can be no vertex of degree 1 left in the graph
    sage: g = graphs.RandomTree(50)
    sage: g.cores(2)
    ([0, 1, 6, 14, 16, 17, 20, 23, 24, 25, 26, 29, 31, 32, 33, 34, 36, 42, 44, 46, 49, 43, 48, 5, 38, 22, 3, 7, 12, 8, 21, 10, 18, 39, 47, 19, 11, 45, 28, 35, 15, 9, 13, 30, 27, 2, 40, 41, 37, 4], [])
    
    Then, instead of removing only one vertex from V you could write "for v n V" and apply you rule on each of them iteratively (after checking that they have not been removed already, or that their neighbor has. With a bit of luck it should also be faster (though perhaps this difference is actually hard to measure...).
  • if v in g.neighbors(w): . This is True mathematically speaking, but a sin from the programming point of view. You are actually building the list of neighbors of w, then checking whether v belongs to the list when you mean "g.has_edge(v,w)" :-p
  • Could you also add in a "TESTS::" section a small "for" loop ensuring that that the algorithms outputs the same cardinalities when reduction_rules is enabled or not ? Something like what appears at the bottom of GenericGraph?.traveling_salesman_problem.

All in all, a good patch :-)

Nathann

comment:7 in reply to: ↑ 6 ; follow-up: Changed 9 years ago by dcoudert

  • Status changed from needs_work to needs_review

Replying to ncohen:

  • Why do you check in independent_set whether the value given by "algorithm" is good ? vertex_cover is already dealing with that, so if you forward something else the function already raises an exception

I have removed the test.

  • I hesitated myself between returning lists or sets but now that all graph functions return lists I guess it is best to stick with it.. Wrapping the return value with a list() looks best :-p

Changed to list

  • Variable rules_are_used initialized to True. What about using reduction_rules instead, as it alrady exists ?

I don't like the idea since in future improvements of the function one may need the original value of `reduction_rules` later in the algorithm.

  • As it is, if the given graph is a tree you compute n/2 times the list of vertices of degree 1. That's a bit too much. ...... Then, instead of removing only one vertex from V you could write "for v n V" and apply you rule on each of them iteratively (after checking that they have not been removed already, or that their neighbor has. With a bit of luck it should also be faster (though perhaps this difference is actually hard to measure...).

I'm now using the cores.

Note however that what you said is not exact. When removing the neighbor of a degree one vertex from the graph, it is possible reduce the degree of some vertices to 0 or 1. These vertices are not identified by the cores function. They will thus be considered only at next iteration of the main while loop.

  • if v in g.neighbors(w): . This is True mathematically speaking, but a sin from the programming point of view. You are actually building the list of neighbors of w, then checking whether v belongs to the list when you mean "g.has_edge(v,w)" :-p

I was not sure of the best solution to choose when writing the function because I don't know how the neighbors function is written. I did proposed modification.

  • Could you also add in a "TESTS::" section a small "for" loop ensuring that that the algorithms outputs the same cardinalities when reduction_rules is enabled or not ? Something like what appears at the bottom of GenericGraph? .traveling_salesman_problem.

Done. I have also move some test from independent_set to vertex_cover.

I have also updated the tests in the tsp function, translating sentences from french to english ;-)

Thanks for your help Nathann, and congratulation for the CNRS ranking !!!

comment:8 in reply to: ↑ 7 Changed 9 years ago by ncohen

Hellooooooooo !!!

I don't like the idea since in future improvements of the function one may need the original value of `reduction_rules` later in the algorithm.

Totally right :-)

I'm now using the cores.

+1

Note however that what you said is not exact. When removing the neighbor of a degree one vertex from the graph, it is possible reduce the degree of some vertices to 0 or 1. These vertices are not identified by the cores function. They will thus be considered only at next iteration of the main while loop.

Oh right, you also remove neighbors... Well, at least for trees everything is removed in only one run :-D

But actually, that was the whole point of my first remark --copy/paste the code from "cores" and modify it so that this problem is solved. Its principle is actually very simple : "first compute the degree of all the vertices in your graph and keep them in a table for quick access. For as long as there is a vertex of degree 1, remove it and update the degree of its neighbors in the table." You would only have to say instead "for as long as there is a vertex of degree 1, delete both vertices and update the degree of its distance-2 neighbors." This "cores" function is not "so very smart", but it is very small and efficient with memory.

By the way there is a set method to remove an element regardless of whether it is already inside or not : set.discard

I have also updated the tests in the tsp function, translating sentences from french to english ;-)

Oh my god, some tests were written in french ? :-D

Thanks for your help Nathann, and congratulation for the CNRS ranking !!!

Thaaaaaaaaaaaaaaaaaaaaaaaaaanks ! :-D

Nathann

comment:9 Changed 9 years ago by dcoudert

This new version should better fit your expectations.

comment:10 Changed 9 years ago by dcoudert

Removal of some trailing white spaces identified by patchbot.

comment:11 Changed 9 years ago by ncohen

Helloooooooooooo !!!

Here is what this patch does :

  • avoid copying G for nothing when the reduction rules are not used (which is the case by default)
  • uses the graph's structure instead of doing the same job with dictionaries
  • several caches or small optimizations, or shortenings.
  • a small amount of comments in the code
  • moved a whole block of code back to the original indentation level (the main code was indented because of a test "if g.order() == 0"...)

This being said, since your last version the "good sides" of the "cores" algorithm are used ! The list of vertices of small degree is not rebuilt n times but generated on the fly, so that only vertices whose degree has changed are checked :-)

Nathann

Changed 9 years ago by dcoudert

comment:12 Changed 9 years ago by dcoudert

Thanks for the review.

Note that the file name of your review patch is incorrect. Should be trac_12743-review.patch and not trac_12734-review.patch. I don't know how to change the file name.

I have corrected some mistakes in the reduction rules (some forgotten vertices).
For me the patch is OK.

Changed 9 years ago by ncohen

comment:13 Changed 9 years ago by ncohen

  • Description modified (diff)

Patch updated !

You were right to fix this line... In order to avoid it, I added a line to the doctests in this patch. If all is fine by you, you can set the ticket to positive review ! :-)

Nathann

Changed 9 years ago by ncohen

comment:14 Changed 9 years ago by dcoudert

  • Status changed from needs_review to positive_review

All is fine for me.

Thanks!

comment:15 Changed 9 years ago by dcoudert

  • Status changed from positive_review to needs_work

Arghh!!! I found a small bug: with the contraction of some vertices, we may create multi-edges, thus leading to an error since the degree is different from the size of the neighborhood. This is a side effect of the merge_vertices method because my graph has weighted edges.

So far, my first option is instead to use a copy of the graph to make my own copy without edge labels. This is certainly not the best solution...

A smarter solution is more than welcome.

Last edited 9 years ago by dcoudert (previous) (diff)

comment:16 Changed 9 years ago by dcoudert

  • Description modified (diff)
  • Status changed from needs_work to needs_review

I did the following:

  • Change the way the copy of the graph is performed to work with a copy without labels (if any)
  • Merge all patch into a single one to ease manipulations.

This patch is now working properly but needs an extra round of review.

I did some tests on large graphs (maps of the Internet by CAIDA) and the MILP algorithm returns the vertex cover in less than a minute (the graph has 36.000 nodes), but the Cliquer algorithm failed. On smaller instances (20.000 nodes), the running time of Cliquer is significantly reduced when using the reduction rules, but the improvement is unclear with the MILP. For small graphs, Cliquer is generally faster than MILP.

comment:17 follow-up: Changed 9 years ago by dcoudert

update of the patch after lots of email exchange with Nathann. I hope it is better now.

Last edited 9 years ago by dcoudert (previous) (diff)

comment:18 in reply to: ↑ 17 Changed 9 years ago by ncohen

update of the patch after lots of email exchange with Nathann. I hope it is better now.

Well.. Sounds close to perfection :-)

I noticed only one detail : the documentation says that reduction_rules are disabled by default while it is set to True by default :-)

Nathann

comment:19 Changed 9 years ago by dcoudert

I did the change.

I also did an extra round of tests. It appears that reduction rules are especially useful when using "Cliquer" or the ILP solver "GLPK". However, it is not so clear when using CPLEX (it is sometimes even slower...).

I can either let the patch as it is, or set default to False, or add a sentence like: "In some cases, it is faster to disallow reduction rules, in particular when using cplex.".

comment:20 Changed 9 years ago by ncohen

A sentence to explain that is may be wise or not to use the reduction rules depending on the instance would be nice indeed (both in vertex_cover and in independent_set). And of course make them enabled by default if you find that it is faster this way :-)

Nathann

comment:21 Changed 9 years ago by dcoudert

Is this OK ?

comment:22 Changed 9 years ago by ncohen

Hellooooooooo David !!

Well, the patch seems good to go, but there is one thing that should be changed : one of the two tests deals with randomGNP graphs, and tests the algorithms on instances of graphs.RandomGNP(20,.3).

Now the problem is that this actually does not test the reductions rules :

sage: all(min(graphs.RandomGNP(50,.3).degree())>2 for i in range(100))
True

Could you change the .3 to something like 4/50 ? It looks better this way :

sage: graphs.RandomGNP(50,4/50).cores(k=2)                            
([33, 0, 1, 18, 28, 42, 43], [23, 27, 38, 40, 44, 2, 4, 7, 15, 22, 24, 29, 35, 5, 8, 9, 10, 12, 16, 17, 19, 26, 30, 31, 39, 48, 3, 20, 37, 41, 45, 47, 6, 46, 32, 21, 14, 49, 36, 34, 13, 11, 25])
sage: graphs.RandomGNP(50,4/50).cores(k=2)
([16, 30, 0, 2, 4, 38, 6], [5, 32, 37, 41, 31, 49, 10, 13, 14, 17, 19, 21, 25, 28, 7, 33, 35, 36, 42, 45, 9, 20, 3, 11, 15, 1, 23, 26, 27, 39, 46, 8, 12, 29, 34, 44, 47, 48, 43, 18, 24, 40, 22])
sage: graphs.RandomGNP(50,4/50).cores(k=2)
([18, 42, 4, 31, 32], [5, 11, 12, 22, 24, 37, 14, 13, 3, 27, 28, 34, 35, 36, 39, 1, 8, 10, 15, 17, 23, 25, 38, 41, 46, 49, 6, 21, 29, 33, 40, 45, 47, 30, 2, 26, 0, 43, 7, 9, 16, 19, 20, 44, 48])
sage: graphs.RandomGNP(50,4/50).cores(k=2)
([0, 11, 22, 32, 3, 15, 49, 45, 29], [18, 26, 28, 35, 40, 43, 44, 5, 13, 27, 20, 24, 25, 8, 2, 30, 34, 36, 39, 41, 46, 47, 9, 7, 10, 12, 17, 19, 21, 42, 48, 1, 16, 23, 31, 37, 14, 33, 4, 6, 38])

So the rules are actually being tested :-)

You can set the ticket to "positive review" afterwards, for the ticket is good to go :-)

Thank for this patch !

Nathann

Changed 9 years ago by dcoudert

comment:23 follow-up: Changed 9 years ago by dcoudert

  • Status changed from needs_review to positive_review

Actually reduction rules were already tested with trees ;-)

I have modified the RandomGNP test and all tests pass on my computer.

Thank for the review.

comment:24 in reply to: ↑ 23 ; follow-up: Changed 9 years ago by ncohen

Actually reduction rules were already tested with trees ;-)

Yeah yeah, but only the rule with vertices of degree 1, so... :-)

I have modified the RandomGNP test and all tests pass on my computer.

:-)

Nathann

comment:25 in reply to: ↑ 24 ; follow-up: Changed 9 years ago by dcoudert

Replying to ncohen:

Actually reduction rules were already tested with trees ;-)

Yeah yeah, but only the rule with vertices of degree 1, so... :-)

Not necessarily since there is no correlation between the vertices in the set of vertices of degree at most 2 and the degree of the vertices. So all rules can be used.

comment:26 in reply to: ↑ 25 Changed 9 years ago by ncohen

Not necessarily since there is no correlation between the vertices in the set of vertices of degree at most 2 and the degree of the vertices. So all rules can be used.

Oh right ! I was thinking of the ordering given by "cores" again. Well, at least the rule that removes a vertex of degree two whose neighbors are adjacent certainly was not tested there :-D

Nathann

comment:27 Changed 9 years ago by jdemeyer

  • Reviewers set to Nathann Cohen

comment:28 Changed 9 years ago by jdemeyer

  • Milestone changed from sage-5.0 to sage-5.1

comment:29 Changed 9 years ago by jdemeyer

  • Merged in set to sage-5.1.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.