Opened 5 years ago

Last modified 3 months ago

#23798 needs_work defect

Fractional Chromatic Index test fails with GLPK

Reported by: Jeroen Demeyer
Owned by:
Priority: major
Milestone: sage-9.8
Component: graph theory
Keywords:
Cc: David Coudert
Merged in:
Authors: David Coudert
Reviewers: Dima Pasechnik
Report Upstream: N/A
Work issues:
Branch: public/graphs/23798_fractional_chromatic_index
Commit: 43e8873f6ac619c93fe8f8b5b79b7c8060cfeb9b
Dependencies:
Stopgaps:

Description (last modified by Jeroen Demeyer)

The test

            sage: g = graphs.PetersenGraph()
            sage: g.fractional_chromatic_index(solver='GLPK')
            3.0

added in src/sage/graphs/graph.py by #23658 fails with GLPK-4.63 on 32-bit.

As a workaround, we use PPL by default in #24099.

Change History (43)

comment:1 Changed 5 years ago by Jeroen Demeyer

Description: modified (diff)

comment:2 Changed 5 years ago by Jeroen Demeyer

Description: modified (diff)
Summary: Fractional Chromatic Index Infinite Loop fails with GLPK → Fractional Chromatic Index test fails with GLPK

comment:3 Changed 5 years ago by David Coudert

I suspect that we need to change `if M.solve(log = verbose) <= 1:` to `if M.solve(log = verbose) <= 1 + tol:`, where `tol = 0 if solver=='PPL' else 1e-6`. I don't like this solution, but I don't know what else we can do.
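
A minimal sketch of the comparison proposed above, in plain Python rather than Sage. The helper name `oracle_accepts` and its arguments are hypothetical; only the `<= 1 + tol` test and the two tolerance values come from the comment.

```python
def oracle_accepts(matching_weight, solver, tol=1e-6):
    """Return True when the matching weight counts as "at most 1".

    Hypothetical helper: `matching_weight` stands for the value returned by
    M.solve(log=verbose). The exact solver 'PPL' gets a tolerance of 0,
    any other solver is treated as numerical and gets `tol`.
    """
    effective_tol = 0 if solver == 'PPL' else tol
    return matching_weight <= 1 + effective_tol


# A numerical solver may report a value slightly above 1 for an optimum of 1.
print(oracle_accepts(1.0000000003, 'GLPK'))  # True: within the tolerance
print(oracle_accepts(1.0000000003, 'PPL'))   # False: exact test, no tolerance
```

Whether 1e-6 is a safe value for every solver and input is exactly the open question of this ticket; the sketch only shows the shape of the test.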

I don't have access to a 32-bit machine and so cannot test.

comment:4 Changed 5 years ago by Jeroen Demeyer

You could also forbid using a non-exact solver for this problem.

comment:5 Changed 5 years ago by David Coudert

Sure, we can force PPL, but it is much slower in general (although it can sometimes be faster on small graphs).

sage: G = graphs.Grid2dGraph(6,6)
sage: %time G.fractional_chromatic_index(solver='GLPK')
CPU times: user 43.4 ms, sys: 4.9 ms, total: 48.3 ms
Wall time: 52.1 ms
4.0
sage: %time G.fractional_chromatic_index(solver='PPL')
CPU times: user 1min 11s, sys: 256 ms, total: 1min 11s
Wall time: 1min 12s
4

I agree that using a tolerance gap is not a nice solution either.

comment:6 Changed 5 years ago by David Coudert

Authors: David Coudert
Branch: u/dcoudert/23798
Commit: 74850077e305024907037e4094f5956ef5a59e11
Status: new → needs_review

I don't see a better solution than making PPL the default solver here.


New commits:

7485007 trac #23798: set PPL has default solver

comment:7 Changed 5 years ago by Jeroen Demeyer

Status: needs_review → needs_work

"Be aware that this method may loop endlessly when using some non exact solvers on 32-bits". I doubt that this is a problem specific to 32 bits. The wording seems to imply that it's safe to use non-exact solvers on 64-bit machines.

comment:8 Changed 5 years ago by Jeroen Demeyer

Also, this isn't quite correct:

Tickets :trac:`23658` and :trac:`23798` are fixed::

followed by a test with GLPK.

comment:9 Changed 5 years ago by git

Commit: 74850077e305024907037e4094f5956ef5a59e11 → 910fb839eb4612f42bf61858d0f0725fb1f2559c

Branch pushed to git repo; I updated commit sha1. New commits:

910fb83 trac #23798: reviewers comments

comment:10 Changed 5 years ago by David Coudert

Status: needs_work → needs_review

Is this more appropriate?

comment:11 Changed 5 years ago by Jeroen Demeyer

Well, it depends. Do you consider the code here to be a fix or a workaround? I am asking because you need to decide what to do with

sage: g.fractional_chromatic_index(solver='GLPK') # known bug (#23798)

You cannot say that this ticket is a known bug while at the same time fixing this ticket.

comment:12 Changed 5 years ago by Jeroen Demeyer

Status: needs_review → needs_work

comment:13 Changed 5 years ago by David Coudert

The problem is not fixed. That's why I changed the text to Issue reported in :trac:`23658` and :trac:`23798` with non exact solvers::. What else can I write to be more correct/specific?

comment:14 Changed 5 years ago by Jeroen Demeyer

Authors: David Coudert
Branch: u/dcoudert/23798
Commit: 910fb839eb4612f42bf61858d0f0725fb1f2559c
Description: modified (diff)

comment:15 in reply to:  13 Changed 5 years ago by Jeroen Demeyer

Replying to dcoudert:

The problem is not fixed.

Then I'm moving your branch to a new ticket: #24099.

comment:16 Changed 5 years ago by David Coudert

OK, thanks.

comment:17 Changed 2 years ago by David Coudert

Milestone: sage-8.1 → sage-9.3

Since #24824, we use GLPK 4.65. Does anyone with access to a 32-bit machine still see the bug?

comment:18 Changed 2 years ago by David Coudert

Milestone: sage-9.3 → sage-9.2

comment:19 Changed 2 years ago by Matthias Köppe

Milestone: sage-9.2 → sage-9.3

comment:20 Changed 22 months ago by Matthias Köppe

Milestone: sage-9.3 → sage-9.4

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

comment:21 in reply to:  17 Changed 17 months ago by Dave Morris

Replying to dcoudert:

Since #24824, we use GLPK 4.65. Does anyone with access to a 32-bit machine still see the bug?

I still see the bug (on a 32-bit debian virtual machine). The default solver seems instantaneous, but I let solver='GLPK' run for about 15 minutes and did not get an answer.

comment:22 Changed 17 months ago by David Coudert

This is unfortunate.

The only solutions I see are:

  • Force the use of PPL, but this is not nice for users with a 64-bit machine (most users, I guess)
  • Raise an error when the solver is GLPK on a 32-bit machine

and neither of them is satisfactory.

comment:23 in reply to:  3 Changed 17 months ago by Matthias Köppe

Replying to dcoudert:

I suspect that we need to change `if M.solve(log = verbose) <= 1:` to `if M.solve(log = verbose) <= 1 + tol:`, where `tol = 0 if solver=='PPL' else 1e-6`. I don't like this solution, but I don't know what else we can do.

Using a tolerance is exactly the right solution. The test for exact <= 1 and == 1 is meaningless with a numerical LP solver. LP solvers use perturbations systematically. It is not a bug if the result is not an exact integer.
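
To illustrate why an exact comparison is fragile with floating-point results, here is a small plain-Python example (not taken from the Sage code). It accumulates ten copies of 0.1 in double precision and in exact rational arithmetic; the 1e-6 tolerance is the value proposed in comment:3.

```python
from fractions import Fraction

# Accumulate 0.1 ten times in double precision.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

print(float_total == 1.0)              # False: rounding errors accumulate
print(abs(float_total - 1.0) <= 1e-6)  # True: a tolerance absorbs them

# The same sum in exact rational arithmetic, as an exact solver computes it.
exact_total = sum(Fraction(1, 10) for _ in range(10))
print(exact_total == 1)                # True
```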

comment:24 Changed 17 months ago by Matthias Köppe

See also my explanations in https://trac.sagemath.org/ticket/30635#comment:20 and following.

comment:25 Changed 17 months ago by Dima Pasechnik

there are two LPs involved, one of them for a maximum weight matching, something that can be instead done by a combinatorial algorithm, see e.g. Blossom V in http://pub.ist.ac.at/~vnk/software.html
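
As a rough illustration that the matching oracle can be evaluated exactly without any LP solver, the sketch below enumerates all matchings of a tiny graph by brute force with rational weights. This is not the Blossom algorithm and not what Sage does: it takes exponential time and is only meant for very small inputs. The function name and the example graph are made up for the illustration.

```python
from fractions import Fraction
from itertools import combinations


def max_weight_matching_bruteforce(weights):
    """Return (best_weight, best_matching) over all matchings.

    `weights` maps each edge (u, v) to an exact weight. Brute force:
    every subset of edges is tried, so this is exponential in the
    number of edges.
    """
    edges = list(weights)
    best_weight, best_matching = Fraction(0), ()
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            endpoints = {v for edge in subset for v in edge}
            # A set of edges is a matching iff no endpoint is shared.
            if len(endpoints) != 2 * len(subset):
                continue
            weight = sum(weights[edge] for edge in subset)
            if weight > best_weight:
                best_weight, best_matching = weight, subset
    return best_weight, best_matching


# 5-cycle with weight 1/2 on every edge: the best matching has two edges.
cycle = {(i, (i + 1) % 5): Fraction(1, 2) for i in range(5)}
weight, matching = max_weight_matching_bruteforce(cycle)
print(weight)  # 1
```

Because the weights are rationals, the comparison of the result against 1 is exact here.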

comment:26 Changed 17 months ago by Dima Pasechnik

If I force PPL on the inner (matching) LP:

  • src/sage/graphs/graph_coloring.pyx

    @@ -825,7 +825,7 @@ def fractional_chromatic_index(G, solver="PPL", verbose_constraints=False, verbo
         frozen_edges = [frozenset(e) for e in G.edges(labels=False, sort=False)]
     
         # Initialize LP for maximum weight matching
    -    M = MixedIntegerLinearProgram(solver=solver, constraint_generation=True)
    +    M = MixedIntegerLinearProgram(solver="PPL", constraint_generation=True)
     
         # One variable per edge
         b = M.new_variable(binary=True, nonnegative=True)

then on a 32-bit system everything works (GLPK is the unpatched system version, hence the extra messages):

sage: G=graphs.PetersenGraph()
sage: G.fractional_chromatic_index(solver="GLPK")
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
3.0

comment:27 in reply to:  25 ; Changed 17 months ago by David Coudert

Authors: David Coudert
Branch: public/graphs/23798_fractional_chromatic_index
Commit: ebcde7c37ea3a8377fbcebe2246aae890e9df305
Status: needs_work → needs_review

Following above discussion, I added a tolerance gap for numerical LP solvers.

Note that we can use the networkx implementation of the blossom algorithm via the matching method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...


New commits:

ebcde7c trac #23798: add tolerance gap for numerical LP solvers

comment:28 Changed 17 months ago by Dima Pasechnik

I don’t like this approach. Without explicit guarantees that these tolerances are correct, it is replacing correct algorithms with heuristics.

comment:29 Changed 17 months ago by Matthias Köppe

         matching = [fe for fe in frozen_edges if M.get_values(b[fe]) == 1]

This line also needs changing because the test "== 1" is not robust.
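
One common way to make such a test robust is to threshold the reported value instead of comparing it with 1. The sketch below is plain Python with made-up solver output; the helper name `selected_edges` and the threshold 0.5 are assumptions for illustration, not the change made on the branch.

```python
def selected_edges(values, threshold=0.5):
    """Return the edges whose binary variable is reported as "on".

    `values` maps each edge to the value a numerical solver reports for
    its binary variable, which may be 0.9999... or 1e-12 instead of 1 or 0.
    """
    return [edge for edge, value in values.items() if value > threshold]


# Hypothetical values as a floating-point solver might report them.
reported = {(0, 1): 0.9999999998, (1, 2): 2.1e-12, (2, 3): 1.0}

print([edge for edge, value in reported.items() if value == 1])  # [(2, 3)]
print(selected_edges(reported))                                  # [(0, 1), (2, 3)]
```

The exact test silently drops the edge reported as 0.9999999998, so the extracted set is no longer the matching the solver found.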

comment:30 Changed 17 months ago by Dima Pasechnik

I don’t see how one can make the oracle (the inner LP) inexact, without potentially returning a very wrong answer.

The oracle checks that there is no maximum weight matching of weight > 1. Say we let it err by epsilon, i.e. we terminate when the oracle returns 1+epsilon. Potentially, there could be K maximum matchings with this weight; if they are disjoint, the final error is K times epsilon, oops…

Last edited 17 months ago by Dima Pasechnik
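
The arithmetic behind this argument can be made concrete with a small exact computation. The sketch assumes, as in the comment, K pairwise disjoint matchings that each end up with weight 1 + epsilon instead of 1, and that the final value adds their weights; the function name is made up for the illustration.

```python
from fractions import Fraction


def accumulated_error(num_matchings, epsilon):
    """Total error when each of `num_matchings` disjoint matchings is
    accepted with weight 1 + epsilon instead of weight 1."""
    exact_total = num_matchings * Fraction(1)
    inexact_total = num_matchings * (1 + epsilon)
    return inexact_total - exact_total


eps = Fraction(1, 10**6)
print(accumulated_error(5, eps))     # 1/200000
print(accumulated_error(1000, eps))  # 1/1000
```

With K = 1000 the error has grown from 1e-6 to 1e-3, so a tolerance that looks harmless per oracle call is not harmless for the final answer.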

comment:31 Changed 17 months ago by Dima Pasechnik

Reviewers: Dima Pasechnik
Status: needs_review → needs_work

comment:32 Changed 17 months ago by David Coudert

I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.

comment:33 in reply to:  32 Changed 17 months ago by Dima Pasechnik

Replying to dcoudert:

I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.

No, my point is that without a special analysis it's not possible to argue that solving the oracle problem (with non-integer objective function) inexactly provides a correct result, even if you "correctly" round 0.9999... to 1. It's because a small oracle error may get amplified a lot in the main LP. Welcome to floating point hell :-)

comment:34 in reply to:  27 Changed 17 months ago by Dima Pasechnik

Replying to dcoudert:

Following above discussion, I added a tolerance gap for numerical LP solvers.

Note that we can use the networkx implementation of the blossom algorithm via the matching method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...

The oracle implementation here is naive and bound to get very slow: it's an integer LP without Edmonds' constraints, instead of a "normal" LP over the matching polytope with Edmonds' constraints (aka blossom inequalities). So this would need yet another oracle (as there are exponentially many inequalities there), but then it's polynomial time. The generated constraints can stay, so this should be fast.
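
For readers unfamiliar with blossom inequalities, here is a small plain-Python sketch on a triangle. It checks the degree constraints (the edge values at each vertex sum to at most 1) and the blossom inequality for an odd vertex set S (the edge values inside S sum to at most (|S| - 1)/2). The function names are made up, and the enumeration of odd sets is brute force for illustration only, whereas a real separation oracle would find a violated set in polynomial time.

```python
from fractions import Fraction
from itertools import combinations


def degree_constraints_hold(x, vertices):
    """Check that the edge values at every vertex sum to at most 1."""
    return all(
        sum(value for edge, value in x.items() if v in edge) <= 1
        for v in vertices
    )


def violated_blossom_sets(x, vertices):
    """List the odd vertex sets S (|S| >= 3) whose inner edges have total
    value greater than (|S| - 1) / 2. Brute-force enumeration."""
    violated = []
    for size in range(3, len(vertices) + 1, 2):
        for subset in combinations(vertices, size):
            inside = sum(
                value for (u, v), value in x.items()
                if u in subset and v in subset
            )
            if inside > Fraction(size - 1, 2):
                violated.append(subset)
    return violated


# Triangle with value 1/2 on every edge.
triangle = [0, 1, 2]
x = {(0, 1): Fraction(1, 2), (1, 2): Fraction(1, 2), (0, 2): Fraction(1, 2)}

print(degree_constraints_hold(x, triangle))  # True
print(violated_blossom_sets(x, triangle))    # [(0, 1, 2)]
```

The point x satisfies every degree constraint but has total value 3/2, while any matching in a triangle has at most one edge; the blossom inequality on S = {0, 1, 2} is what cuts it off.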


New commits:

ebcde7c trac #23798: add tolerance gap for numerical LP solvers

comment:35 Changed 17 months ago by Matthias Köppe

I took a quick look at the function now. I would suggest the following changes:

  1. Before adding a new constraint to the master problem, verify that matching is indeed a matching. In this way, the master problem will always be a correct relaxation, even if an inexact oracle is used.
  2. When the numerical solver that is used for solving the separation problem does not find a matching of value greater than 1 + epsilon, you can switch to PPL - then, with a bit of luck, it can prove the bound <= 1.
  3. It will make sense to have separate parameters for the solver used for the master problem and the one(s) used for the separation problem.
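
The first suggestion can be sketched in plain Python as follows. The helper `is_matching` and the example graph are hypothetical and not taken from the branch; edges are compared as frozensets, as the Sage function does with `frozen_edges`.

```python
def is_matching(candidate, graph_edges):
    """Return True if `candidate` is a matching of the graph: every edge
    belongs to the graph and no two edges share a vertex."""
    allowed = {frozenset(edge) for edge in graph_edges}
    covered = set()
    for edge in candidate:
        edge = frozenset(edge)
        if len(edge) != 2 or edge not in allowed:
            return False  # loop, or not an edge of the graph
        if covered & edge:
            return False  # shares a vertex with an earlier edge
        covered |= edge
    return True


# Path 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]

print(is_matching([(0, 1), (2, 3)], path_edges))  # True
print(is_matching([(0, 1), (1, 2)], path_edges))  # False: vertex 1 is shared
print(is_matching([(0, 2)], path_edges))          # False: not an edge
```

Running such a check before adding a constraint means that a badly rounded oracle answer can at worst cost extra iterations and cannot add an invalid constraint to the master problem.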

comment:36 Changed 17 months ago by Dima Pasechnik

Actually, it seems that even with PPL, the code is just wrong, as PPL does not do MILP, it only does LP, right?

comment:37 Changed 17 months ago by Matthias Köppe

The PPL does have a (very limited) MIP solver.

comment:38 Changed 16 months ago by Matthias Köppe

Milestone: sage-9.4 → sage-9.5

comment:39 Changed 13 months ago by git

Commit: ebcde7c37ea3a8377fbcebe2246aae890e9df305 → 43e8873f6ac619c93fe8f8b5b79b7c8060cfeb9b

Branch pushed to git repo; I updated commit sha1. New commits:

1926be5 trac #23798: merged with 9.5.beta5
43e8873 trac #23798: ideas from comment 35

comment:40 Changed 13 months ago by David Coudert

I tried the ideas from #comment:35. I have left some debugging code in, as the code may loop forever when using GLPK for both the master and separation problems. The patchbot will complain...

We should search for another method not relying on LP solvers, if any...

comment:41 Changed 11 months ago by Matthias Köppe

Milestone: sage-9.5 → sage-9.6

comment:42 Changed 8 months ago by Matthias Köppe

Milestone: sage-9.6 → sage-9.7

comment:43 Changed 3 months ago by Matthias Köppe

Milestone: sage-9.7 → sage-9.8