Opened 4 years ago
Last modified 6 days ago
#23798 needs_work defect
Fractional Chromatic Index test fails with GLPK
Reported by:  jdemeyer  Owned by:  

Priority:  major  Milestone:  sage9.4 
Component:  graph theory  Keywords:  
Cc:  dcoudert  Merged in:  
Authors:  David Coudert  Reviewers:  Dima Pasechnik 
Report Upstream:  N/A  Work issues:  
Branch:  public/graphs/23798_fractional_chromatic_index (Commits, GitHub, GitLab)  Commit:  ebcde7c37ea3a8377fbcebe2246aae890e9df305 
Dependencies:  Stopgaps: 
Description (last modified by )
Change History (37)
comment:1 Changed 4 years ago by
 Description modified (diff)
comment:2 Changed 4 years ago by
 Description modified (diff)
 Summary changed from Fractional Chromatic Index Infinite Loop fails with GLPK to Fractional Chromatic Index test fails with GLPK
comment:3 followup: ↓ 23 Changed 4 years ago by
comment:4 Changed 4 years ago by
You could also forbid using a nonexact solver for this problem.
comment:5 Changed 4 years ago by
Sure, we can force PPL
, but it is way slower (can sometimes be faster on small graphs).
sage: G = graphs.Grid2dGraph(6,6) sage: %time G.fractional_chromatic_index(solver='GLPK') CPU times: user 43.4 ms, sys: 4.9 ms, total: 48.3 ms Wall time: 52.1 ms 4.0 sage: %time G.fractional_chromatic_index(solver='PPL') CPU times: user 1min 11s, sys: 256 ms, total: 1min 11s Wall time: 1min 12s 4
I agree that using a tolerance gap is not a nice solution either.
comment:6 Changed 4 years ago by
 Branch set to u/dcoudert/23798
 Commit set to 74850077e305024907037e4094f5956ef5a59e11
 Status changed from new to needs_review
I don't see better solution than making PPL
the default solver here.
New commits:
7485007  trac #23798: set PPL has default solver

comment:7 Changed 4 years ago by
 Status changed from needs_review to needs_work
"Be aware that this method may loop endlessly when using some non exact solvers on 32bits". I doubt that this is problem specific to 32 bits. The wording seems to imply that it's safe to use nonexact solvers on 64bit machines.
comment:8 Changed 4 years ago by
Also, this isn't quite correct:
Tickets :trac:`23658` and :trac:`23798` are fixed::
followed by a test with GLPK.
comment:9 Changed 4 years ago by
 Commit changed from 74850077e305024907037e4094f5956ef5a59e11 to 910fb839eb4612f42bf61858d0f0725fb1f2559c
Branch pushed to git repo; I updated commit sha1. New commits:
910fb83  trac #23798: reviewers comments

comment:10 Changed 4 years ago by
 Status changed from needs_work to needs_review
Is this more appropriate ?
comment:11 Changed 4 years ago by
Well, it depends. Do you consider the code here to be a fix or a workaround? I am asking because you need to decide what to do with
sage: g.fractional_chromatic_index(solver='GLPK') # known bug (#23798)
You cannot say that this ticket is a known bug while at the same time fixing this ticket.
comment:12 Changed 4 years ago by
 Status changed from needs_review to needs_work
comment:13 followup: ↓ 15 Changed 4 years ago by
The problem is not fixed. That's why I changed the text to Issue reported in :trac:`23658` and :trac:`23798` with non exact solvers::
. What else can I write to be more correct/specific?
comment:14 Changed 4 years ago by
 Branch u/dcoudert/23798 deleted
 Commit 910fb839eb4612f42bf61858d0f0725fb1f2559c deleted
 Description modified (diff)
comment:15 in reply to: ↑ 13 Changed 4 years ago by
comment:16 Changed 4 years ago by
OK, thanks.
comment:17 followup: ↓ 21 Changed 11 months ago by
 Milestone changed from sage8.1 to sage9.3
Since #24824, we use GLPK 4.65. Does anyone with access to a 32bit machine still see the bug ?
comment:18 Changed 11 months ago by
 Milestone changed from sage9.3 to sage9.2
comment:19 Changed 9 months ago by
 Milestone changed from sage9.2 to sage9.3
comment:20 Changed 5 months ago by
 Milestone changed from sage9.3 to sage9.4
Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
comment:21 in reply to: ↑ 17 Changed 2 weeks ago by
comment:22 Changed 13 days ago by
This is unfortunate.
The only solutions I see are:
 Force to use
PPL
, but this is not nice for users with a 64 bits machine (most of the users I guess)  Raise an error when the solver is
glpk
on a 32 bits machine
and none of them are satisfactory.
comment:23 in reply to: ↑ 3 Changed 13 days ago by
Replying to dcoudert:
I suspect that we need to change
if M.solve(log = verbose) <= 1:
toif M.solve(log = verbose) <= 1 + tol:
, wheretol = 0 if solver=='PPL' else 1e6
. I don't like this solution, but I don't know what else we can do.
Using a tolerance is exactly the right solution. The test for exact <= 1
and == 1
is meaningless with a numerical LP solver. LP solvers use perturbations systematically. It is not a bug if the result is not an exact integer.
comment:24 Changed 13 days ago by
See also my explanations in https://trac.sagemath.org/ticket/30635#comment:20 and following.
comment:25 followup: ↓ 27 Changed 12 days ago by
there are two LPs involved, one of them for a maximum weight matching, something that can be instead done by a combinatorial algorithm, see e.g. Blossom V in http://pub.ist.ac.at/~vnk/software.html
comment:26 Changed 12 days ago by
If I force PPL on the inner (matching) LP:

src/sage/graphs/graph_coloring.pyx
a b def fractional_chromatic_index(G, solver="PPL", verbose_constraints=False, verbo 825 825 frozen_edges = [frozenset(e) for e in G.edges(labels=False, sort=False)] 826 826 827 827 # Initialize LP for maximum weight matching 828 M = MixedIntegerLinearProgram(solver= solver, constraint_generation=True)828 M = MixedIntegerLinearProgram(solver="PPL", constraint_generation=True) 829 829 830 830 # One variable per edge 831 831 b = M.new_variable(binary=True, nonnegative=True)
then on a 32bit system it's all fine (GLPK from the system, unpatched, so these extra messages)
sage: G=graphs.PetersenGraph() sage: G.fractional_chromatic_index(solver="GLPK") Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used 3.0
comment:27 in reply to: ↑ 25 ; followup: ↓ 34 Changed 12 days ago by
 Branch set to public/graphs/23798_fractional_chromatic_index
 Commit set to ebcde7c37ea3a8377fbcebe2246aae890e9df305
 Status changed from needs_work to needs_review
Following above discussion, I added a tolerance gap for numerical LP solvers.
Note that we can use the networkx
implementation of the blossom algorithm via the matching
method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...
New commits:
ebcde7c  trac #23798: add tolerance gap for numerical LP solvers

comment:28 Changed 12 days ago by
I don’t like this approach. Without explicit guarantees that these tolerances are correct, it is replacing correct algorithms with heuristics.
comment:29 Changed 12 days ago by
matching = [fe for fe in frozen_edges if M.get_values(b[fe]) == 1]
This line also needs changing because the test "== 1" is not robust.
comment:30 Changed 12 days ago by
I don’t see how one can make the oracle (the inner LP) inexact, without potentially returning a very wrong answer.
The oracle checks that there is no maximum weight matching of weight >1. Say, we let it error by epsilon, i.e we terminate with oracle returning 1+epsilon. Potentially, there could be K maximum matchings with this weight, if they are disjoint this means that the final error is K times epsilon, oops…
comment:31 Changed 12 days ago by
 Reviewers set to Dima Pasechnik
 Status changed from needs_review to needs_work
comment:32 followup: ↓ 33 Changed 11 days ago by
I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.
comment:33 in reply to: ↑ 32 Changed 11 days ago by
Replying to dcoudert:
I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.
No, my point is that without a special analysis it's not possible to argue that solving the oracle problem (with noninteger objective function) inexactly provides a correct result, even if you "correctly" round 0.9999... to 1. It's because a small oracle error may get amplified a lot in the main LP. Welcome to floating point hell :)
comment:34 in reply to: ↑ 27 Changed 10 days ago by
Replying to dcoudert:
Following above discussion, I added a tolerance gap for numerical LP solvers.
Note that we can use the
networkx
implementation of the blossom algorithm via thematching
method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...
The oracle implementation here is naive, and bound to get very slow; it's integer LP without Edmonds' constraints, instead of a "normal" LP over the matching polytope with Edmonds' constraints (aka blossom inequalities). So this would need yet another oracle (as there are too exponentially many inequalities there), but well, it's polynomial time then. The generated constraints can stay, so this should be fast.
New commits:
ebcde7c trac #23798: add tolerance gap for numerical LP solvers
comment:35 Changed 6 days ago by
I took a quick look at the function now. I would suggest the following changes:
 Before adding a new constraint to the master problem, verify that
matching
is indeed a matching. In this way, the master problem will always be a correct relaxation, even if an inexact oracle is used.
 When the numerical solver that is used for solving the separation problem does not find a matching of value greater than 1 + epsilon, you can switch to PPL  then, with a bit of luck, it can prove the bound <= 1.
 It will make sense to have separate parameters for the solver used for the master problem and the one(s) used for the separation problem.
comment:36 Changed 6 days ago by
Actually, it seems that even with PPL, the code is just wrong, as PPL does not do MILP, it only does LP, right?
comment:37 Changed 6 days ago by
The PPL does have a (very limited) MIP solver.
I suspect that we need to change
if M.solve(log = verbose) <= 1:
toif M.solve(log = verbose) <= 1 + tol:
, wheretol = 0 if solver=='PPL' else 1e6
. I don't like this solution, but I don't know what else we can do.I don't have access to a 32bit machine and so cannot test.