Opened 5 years ago
Last modified 3 months ago
#23798 needs_work defect
Fractional Chromatic Index test fails with GLPK
Reported by:  Jeroen Demeyer  Owned by:  

Priority:  major  Milestone:  sage9.8 
Component:  graph theory  Keywords:  
Cc:  David Coudert  Merged in:  
Authors:  David Coudert  Reviewers:  Dima Pasechnik 
Report Upstream:  N/A  Work issues:  
Branch:  public/graphs/23798_fractional_chromatic_index (Commits, GitHub, GitLab)  Commit:  43e8873f6ac619c93fe8f8b5b79b7c8060cfeb9b 
Dependencies:  Stopgaps: 
Description (last modified by )
Change History (43)
comment:1 Changed 5 years ago by
Description:  modified (diff) 

comment:2 Changed 5 years ago by
Description:  modified (diff) 

Summary:  Fractional Chromatic Index Infinite Loop fails with GLPK → Fractional Chromatic Index test fails with GLPK 
comment:3 followup: 23 Changed 5 years ago by
comment:5 Changed 5 years ago by
Sure, we can force PPL
, but it is way slower (can sometimes be faster on small graphs).
sage: G = graphs.Grid2dGraph(6,6) sage: %time G.fractional_chromatic_index(solver='GLPK') CPU times: user 43.4 ms, sys: 4.9 ms, total: 48.3 ms Wall time: 52.1 ms 4.0 sage: %time G.fractional_chromatic_index(solver='PPL') CPU times: user 1min 11s, sys: 256 ms, total: 1min 11s Wall time: 1min 12s 4
I agree that using a tolerance gap is not a nice solution either.
comment:6 Changed 5 years ago by
Authors:  → David Coudert 

Branch:  → u/dcoudert/23798 
Commit:  → 74850077e305024907037e4094f5956ef5a59e11 
Status:  new → needs_review 
I don't see better solution than making PPL
the default solver here.
New commits:
7485007  trac #23798: set PPL has default solver

comment:7 Changed 5 years ago by
Status:  needs_review → needs_work 

"Be aware that this method may loop endlessly when using some non exact solvers on 32bits". I doubt that this is problem specific to 32 bits. The wording seems to imply that it's safe to use nonexact solvers on 64bit machines.
comment:8 Changed 5 years ago by
Also, this isn't quite correct:
Tickets :trac:`23658` and :trac:`23798` are fixed::
followed by a test with GLPK.
comment:9 Changed 5 years ago by
Commit:  74850077e305024907037e4094f5956ef5a59e11 → 910fb839eb4612f42bf61858d0f0725fb1f2559c 

Branch pushed to git repo; I updated commit sha1. New commits:
910fb83  trac #23798: reviewers comments

comment:11 Changed 5 years ago by
Well, it depends. Do you consider the code here to be a fix or a workaround? I am asking because you need to decide what to do with
sage: g.fractional_chromatic_index(solver='GLPK') # known bug (#23798)
You cannot say that this ticket is a known bug while at the same time fixing this ticket.
comment:12 Changed 5 years ago by
Status:  needs_review → needs_work 

comment:13 followup: 15 Changed 5 years ago by
The problem is not fixed. That's why I changed the text to Issue reported in :trac:`23658` and :trac:`23798` with non exact solvers::
. What else can I write to be more correct/specific?
comment:14 Changed 5 years ago by
Authors:  David Coudert 

Branch:  u/dcoudert/23798 
Commit:  910fb839eb4612f42bf61858d0f0725fb1f2559c 
Description:  modified (diff) 
comment:15 Changed 5 years ago by
comment:17 followup: 21 Changed 2 years ago by
Milestone:  sage8.1 → sage9.3 

Since #24824, we use GLPK 4.65. Does anyone with access to a 32bit machine still see the bug ?
comment:18 Changed 2 years ago by
Milestone:  sage9.3 → sage9.2 

comment:19 Changed 2 years ago by
Milestone:  sage9.2 → sage9.3 

comment:20 Changed 22 months ago by
Milestone:  sage9.3 → sage9.4 

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
comment:21 Changed 17 months ago by
comment:22 Changed 17 months ago by
This is unfortunate.
The only solutions I see are:
 Force to use
PPL
, but this is not nice for users with a 64 bits machine (most of the users I guess)  Raise an error when the solver is
glpk
on a 32 bits machine
and none of them are satisfactory.
comment:23 Changed 17 months ago by
Replying to dcoudert:
I suspect that we need to change
if M.solve(log = verbose) <= 1:
toif M.solve(log = verbose) <= 1 + tol:
, wheretol = 0 if solver=='PPL' else 1e6
. I don't like this solution, but I don't know what else we can do.
Using a tolerance is exactly the right solution. The test for exact <= 1
and == 1
is meaningless with a numerical LP solver. LP solvers use perturbations systematically. It is not a bug if the result is not an exact integer.
comment:24 Changed 17 months ago by
See also my explanations in https://trac.sagemath.org/ticket/30635#comment:20 and following.
comment:25 followup: 27 Changed 17 months ago by
there are two LPs involved, one of them for a maximum weight matching, something that can be instead done by a combinatorial algorithm, see e.g. Blossom V in http://pub.ist.ac.at/~vnk/software.html
comment:26 Changed 17 months ago by
If I force PPL on the inner (matching) LP:

src/sage/graphs/graph_coloring.pyx
a b def fractional_chromatic_index(G, solver="PPL", verbose_constraints=False, verbo 825 825 frozen_edges = [frozenset(e) for e in G.edges(labels=False, sort=False)] 826 826 827 827 # Initialize LP for maximum weight matching 828 M = MixedIntegerLinearProgram(solver= solver, constraint_generation=True)828 M = MixedIntegerLinearProgram(solver="PPL", constraint_generation=True) 829 829 830 830 # One variable per edge 831 831 b = M.new_variable(binary=True, nonnegative=True)
then on a 32bit system it's all fine (GLPK from the system, unpatched, so these extra messages)
sage: G=graphs.PetersenGraph() sage: G.fractional_chromatic_index(solver="GLPK") Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used Longstep dual simplex will be used 3.0
comment:27 followup: 34 Changed 17 months ago by
Authors:  → David Coudert 

Branch:  → public/graphs/23798_fractional_chromatic_index 
Commit:  → ebcde7c37ea3a8377fbcebe2246aae890e9df305 
Status:  needs_work → needs_review 
Following above discussion, I added a tolerance gap for numerical LP solvers.
Note that we can use the networkx
implementation of the blossom algorithm via the matching
method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...
New commits:
ebcde7c  trac #23798: add tolerance gap for numerical LP solvers

comment:28 Changed 17 months ago by
I don’t like this approach. Without explicit guarantees that these tolerances are correct, it is replacing correct algorithms with heuristics.
comment:29 Changed 17 months ago by
matching = [fe for fe in frozen_edges if M.get_values(b[fe]) == 1]
This line also needs changing because the test "== 1" is not robust.
comment:30 Changed 17 months ago by
I don’t see how one can make the oracle (the inner LP) inexact, without potentially returning a very wrong answer.
The oracle checks that there is no maximum weight matching of weight >1. Say, we let it error by epsilon, i.e we terminate with oracle returning 1+epsilon. Potentially, there could be K maximum matchings with this weight, if they are disjoint this means that the final error is K times epsilon, oops…
comment:31 Changed 17 months ago by
Reviewers:  → Dima Pasechnik 

Status:  needs_review → needs_work 
comment:32 followup: 33 Changed 17 months ago by
I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.
comment:33 Changed 17 months ago by
Replying to dcoudert:
I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.
No, my point is that without a special analysis it's not possible to argue that solving the oracle problem (with noninteger objective function) inexactly provides a correct result, even if you "correctly" round 0.9999... to 1. It's because a small oracle error may get amplified a lot in the main LP. Welcome to floating point hell :)
comment:34 Changed 17 months ago by
Replying to dcoudert:
Following above discussion, I added a tolerance gap for numerical LP solvers.
Note that we can use the
networkx
implementation of the blossom algorithm via thematching
method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...
The oracle implementation here is naive, and bound to get very slow; it's integer LP without Edmonds' constraints, instead of a "normal" LP over the matching polytope with Edmonds' constraints (aka blossom inequalities). So this would need yet another oracle (as there are too exponentially many inequalities there), but well, it's polynomial time then. The generated constraints can stay, so this should be fast.
New commits:
ebcde7c trac #23798: add tolerance gap for numerical LP solvers
comment:35 Changed 17 months ago by
I took a quick look at the function now. I would suggest the following changes:
 Before adding a new constraint to the master problem, verify that
matching
is indeed a matching. In this way, the master problem will always be a correct relaxation, even if an inexact oracle is used.
 When the numerical solver that is used for solving the separation problem does not find a matching of value greater than 1 + epsilon, you can switch to PPL  then, with a bit of luck, it can prove the bound <= 1.
 It will make sense to have separate parameters for the solver used for the master problem and the one(s) used for the separation problem.
comment:36 Changed 17 months ago by
Actually, it seems that even with PPL, the code is just wrong, as PPL does not do MILP, it only does LP, right?
comment:38 Changed 16 months ago by
Milestone:  sage9.4 → sage9.5 

comment:39 Changed 13 months ago by
Commit:  ebcde7c37ea3a8377fbcebe2246aae890e9df305 → 43e8873f6ac619c93fe8f8b5b79b7c8060cfeb9b 

comment:40 Changed 13 months ago by
I tried the ideas from #comment:35. I have let some code for debugging as the code may loop forever when using GLPK
for both master and separation problems. The patchbot will complain...
We should search for another method not relying on LP solvers, if any...
comment:41 Changed 11 months ago by
Milestone:  sage9.5 → sage9.6 

comment:42 Changed 8 months ago by
Milestone:  sage9.6 → sage9.7 

comment:43 Changed 3 months ago by
Milestone:  sage9.7 → sage9.8 

I suspect that we need to change
if M.solve(log = verbose) <= 1:
toif M.solve(log = verbose) <= 1 + tol:
, wheretol = 0 if solver=='PPL' else 1e6
. I don't like this solution, but I don't know what else we can do.I don't have access to a 32bit machine and so cannot test.