Opened 4 years ago

Last modified 6 days ago

#23798 needs_work defect

Fractional Chromatic Index test fails with GLPK

Reported by: jdemeyer Owned by:
Priority: major Milestone: sage-9.4
Component: graph theory Keywords:
Cc: dcoudert Merged in:
Authors: David Coudert Reviewers: Dima Pasechnik
Report Upstream: N/A Work issues:
Branch: public/graphs/23798_fractional_chromatic_index (Commits, GitHub, GitLab) Commit: ebcde7c37ea3a8377fbcebe2246aae890e9df305
Dependencies: Stopgaps:

Status badges

Description (last modified by jdemeyer)

The test

            sage: g = graphs.PetersenGraph()
            sage: g.fractional_chromatic_index(solver='GLPK')
            3.0

added in src/sage/graphs/graph.py by #23658 fails with GLPK-4.63 on 32-bit.

As a workaround, we use PPL by default in #24099.

Change History (37)

comment:1 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:2 Changed 4 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from Fractional Chromatic Index Infinite Loop fails with GLPK to Fractional Chromatic Index test fails with GLPK

comment:3 follow-up: Changed 4 years ago by dcoudert

I suspect that we need to change if M.solve(log = verbose) <= 1: to if M.solve(log = verbose) <= 1 + tol:, where tol = 0 if solver=='PPL' else 1e-6. I don't like this solution, but I don't know what else we can do.

I don't have access to a 32-bit machine and so cannot test.

comment:4 Changed 4 years ago by jdemeyer

You could also forbid using a non-exact solver for this problem.

comment:5 Changed 4 years ago by dcoudert

Sure, we can force PPL, but it is way slower (can sometimes be faster on small graphs).

sage: G = graphs.Grid2dGraph(6,6)
sage: %time G.fractional_chromatic_index(solver='GLPK')
CPU times: user 43.4 ms, sys: 4.9 ms, total: 48.3 ms
Wall time: 52.1 ms
4.0
sage: %time G.fractional_chromatic_index(solver='PPL')
CPU times: user 1min 11s, sys: 256 ms, total: 1min 11s
Wall time: 1min 12s
4

I agree that using a tolerance gap is not a nice solution either.

comment:6 Changed 4 years ago by dcoudert

  • Authors set to David Coudert
  • Branch set to u/dcoudert/23798
  • Commit set to 74850077e305024907037e4094f5956ef5a59e11
  • Status changed from new to needs_review

I don't see better solution than making PPL the default solver here.


New commits:

7485007trac #23798: set PPL has default solver

comment:7 Changed 4 years ago by jdemeyer

  • Status changed from needs_review to needs_work

"Be aware that this method may loop endlessly when using some non exact solvers on 32-bits". I doubt that this is problem specific to 32 bits. The wording seems to imply that it's safe to use non-exact solvers on 64-bit machines.

comment:8 Changed 4 years ago by jdemeyer

Also, this isn't quite correct:

Tickets :trac:`23658` and :trac:`23798` are fixed::

followed by a test with GLPK.

comment:9 Changed 4 years ago by git

  • Commit changed from 74850077e305024907037e4094f5956ef5a59e11 to 910fb839eb4612f42bf61858d0f0725fb1f2559c

Branch pushed to git repo; I updated commit sha1. New commits:

910fb83trac #23798: reviewers comments

comment:10 Changed 4 years ago by dcoudert

  • Status changed from needs_work to needs_review

Is this more appropriate ?

comment:11 Changed 4 years ago by jdemeyer

Well, it depends. Do you consider the code here to be a fix or a workaround? I am asking because you need to decide what to do with

sage: g.fractional_chromatic_index(solver='GLPK') # known bug (#23798)

You cannot say that this ticket is a known bug while at the same time fixing this ticket.

comment:12 Changed 4 years ago by jdemeyer

  • Status changed from needs_review to needs_work

comment:13 follow-up: Changed 4 years ago by dcoudert

The problem is not fixed. That's why I changed the text to Issue reported in :trac:`23658` and :trac:`23798` with non exact solvers::. What else can I write to be more correct/specific?

comment:14 Changed 4 years ago by jdemeyer

  • Authors David Coudert deleted
  • Branch u/dcoudert/23798 deleted
  • Commit 910fb839eb4612f42bf61858d0f0725fb1f2559c deleted
  • Description modified (diff)

comment:15 in reply to: ↑ 13 Changed 4 years ago by jdemeyer

Replying to dcoudert:

The problem is not fixed.

Then I'm moving your branch to a new ticket: #24099.

comment:16 Changed 4 years ago by dcoudert

OK, thanks.

comment:17 follow-up: Changed 11 months ago by dcoudert

  • Milestone changed from sage-8.1 to sage-9.3

Since #24824, we use GLPK 4.65. Does anyone with access to a 32-bit machine still see the bug ?

comment:18 Changed 11 months ago by dcoudert

  • Milestone changed from sage-9.3 to sage-9.2

comment:19 Changed 9 months ago by mkoeppe

  • Milestone changed from sage-9.2 to sage-9.3

comment:20 Changed 5 months ago by mkoeppe

  • Milestone changed from sage-9.3 to sage-9.4

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

comment:21 in reply to: ↑ 17 Changed 2 weeks ago by gh-DaveWitteMorris

Replying to dcoudert:

Since #24824, we use GLPK 4.65. Does anyone with access to a 32-bit machine still see the bug ?

I still see the bug (on a 32-bit debian virtual machine). The default solver seems instantaneous, but I let solver='GLPK' run for about 15 minutes and did not get an answer.

comment:22 Changed 13 days ago by dcoudert

This is unfortunate.

The only solutions I see are:

  • Force to use PPL, but this is not nice for users with a 64 bits machine (most of the users I guess)
  • Raise an error when the solver is glpk on a 32 bits machine

and none of them are satisfactory.

comment:23 in reply to: ↑ 3 Changed 13 days ago by mkoeppe

Replying to dcoudert:

I suspect that we need to change if M.solve(log = verbose) <= 1: to if M.solve(log = verbose) <= 1 + tol:, where tol = 0 if solver=='PPL' else 1e-6. I don't like this solution, but I don't know what else we can do.

Using a tolerance is exactly the right solution. The test for exact <= 1 and == 1 is meaningless with a numerical LP solver. LP solvers use perturbations systematically. It is not a bug if the result is not an exact integer.

comment:24 Changed 13 days ago by mkoeppe

See also my explanations in https://trac.sagemath.org/ticket/30635#comment:20 and following.

comment:25 follow-up: Changed 12 days ago by dimpase

there are two LPs involved, one of them for a maximum weight matching, something that can be instead done by a combinatorial algorithm, see e.g. Blossom V in http://pub.ist.ac.at/~vnk/software.html

comment:26 Changed 12 days ago by dimpase

If I force PPL on the inner (matching) LP:

  • src/sage/graphs/graph_coloring.pyx

    a b def fractional_chromatic_index(G, solver="PPL", verbose_constraints=False, verbo 
    825825    frozen_edges = [frozenset(e) for e in G.edges(labels=False, sort=False)]
    826826
    827827    # Initialize LP for maximum weight matching
    828     M = MixedIntegerLinearProgram(solver=solver, constraint_generation=True)
     828    M = MixedIntegerLinearProgram(solver="PPL", constraint_generation=True)
    829829
    830830    # One variable per edge
    831831    b = M.new_variable(binary=True, nonnegative=True)

then on a 32-bit system it's all fine (GLPK from the system, unpatched, so these extra messages)

sage: G=graphs.PetersenGraph()
sage: G.fractional_chromatic_index(solver="GLPK")
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
Long-step dual simplex will be used
3.0

comment:27 in reply to: ↑ 25 ; follow-up: Changed 12 days ago by dcoudert

  • Authors set to David Coudert
  • Branch set to public/graphs/23798_fractional_chromatic_index
  • Commit set to ebcde7c37ea3a8377fbcebe2246aae890e9df305
  • Status changed from needs_work to needs_review

Following above discussion, I added a tolerance gap for numerical LP solvers.

Note that we can use the networkx implementation of the blossom algorithm via the matching method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...


New commits:

ebcde7ctrac #23798: add tolerance gap for numerical LP solvers

comment:28 Changed 12 days ago by dimpase

I don’t like this approach. Without explicit guarantees that these tolerances are correct, it is replacing correct algorithms with heuristics.

comment:29 Changed 12 days ago by mkoeppe

         matching = [fe for fe in frozen_edges if M.get_values(b[fe]) == 1]

This line also needs changing because the test "== 1" is not robust.

comment:30 Changed 12 days ago by dimpase

I don’t see how one can make the oracle (the inner LP) inexact, without potentially returning a very wrong answer.

The oracle checks that there is no maximum weight matching of weight >1. Say, we let it error by epsilon, i.e we terminate with oracle returning 1+epsilon. Potentially, there could be K maximum matchings with this weight, if they are disjoint this means that the final error is K times epsilon, oops…

Last edited 12 days ago by dimpase (previous) (diff)

comment:31 Changed 12 days ago by dimpase

  • Reviewers set to Dima Pasechnik
  • Status changed from needs_review to needs_work

comment:32 follow-up: Changed 11 days ago by dcoudert

I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.

comment:33 in reply to: ↑ 32 Changed 11 days ago by dimpase

Replying to dcoudert:

I don't like this solution either but I don't know what to do when a solver returns 0.99999... instead of 1 although we have set the variable type to binary. The solvers are aware of the type of the variable and so should return a value with the correct type and not a double. The solution might be in the backends.

No, my point is that without a special analysis it's not possible to argue that solving the oracle problem (with non-integer objective function) inexactly provides a correct result, even if you "correctly" round 0.9999... to 1. It's because a small oracle error may get amplified a lot in the main LP. Welcome to floating point hell :-)|

comment:34 in reply to: ↑ 27 Changed 10 days ago by dimpase

Replying to dcoudert:

Following above discussion, I added a tolerance gap for numerical LP solvers.

Note that we can use the networkx implementation of the blossom algorithm via the matching method, but it does not solve the issue. Actually, it's slower and worse for the rounding as I observe the issue on a 64 bits machine...

The oracle implementation here is naive, and bound to get very slow; it's integer LP without Edmonds' constraints, instead of a "normal" LP over the matching polytope with Edmonds' constraints (aka blossom inequalities). So this would need yet another oracle (as there are too exponentially many inequalities there), but well, it's polynomial time then. The generated constraints can stay, so this should be fast.


New commits:

ebcde7ctrac #23798: add tolerance gap for numerical LP solvers

comment:35 Changed 6 days ago by mkoeppe

I took a quick look at the function now. I would suggest the following changes:

  1. Before adding a new constraint to the master problem, verify that matching is indeed a matching. In this way, the master problem will always be a correct relaxation, even if an inexact oracle is used.
  1. When the numerical solver that is used for solving the separation problem does not find a matching of value greater than 1 + epsilon, you can switch to PPL - then, with a bit of luck, it can prove the bound <= 1.
  1. It will make sense to have separate parameters for the solver used for the master problem and the one(s) used for the separation problem.

comment:36 Changed 6 days ago by dimpase

Actually, it seems that even with PPL, the code is just wrong, as PPL does not do MILP, it only does LP, right?

comment:37 Changed 6 days ago by mkoeppe

The PPL does have a (very limited) MIP solver.

Note: See TracTickets for help on using tickets.