Opened 8 years ago

Closed 7 years ago

#12736 closed enhancement (fixed)

More solver options for GLPK

Reported by: john_perry Owned by: ncohen
Priority: major Milestone: sage-5.1
Component: linear programming Keywords: solver parameters
Cc: ncohen, r.gaia.cs Merged in: sage-5.1.beta0
Authors: John Perry, Nathann Cohen Reviewers: Nathann Cohen, John Perry
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: #12309 Stopgaps:

Description (last modified by john_perry)

Several requests for solver options to use as parameters for GLPK prompt us to allow the user to set quite a few parameters.

In addition, the name for the solver parameter in ticket #12309 (preprocessing) isn't very accurate (we didn't think it through carefully enough at the time) so that option will change, too.

I propose to change/enable the following solver options. See the GLPK documentation for details on each option, and note that most have two names: one to correspond to GLPK, one for legibility.

  • glpk's solver
    • default is glp_intopt, as at present
    • simplex_first: setting this to True makes the solver invoke glp_simplex before glp_intopt
      • useful to avoid problems in glpk
      • replaces current preprocessing parameter
    • simplex_only: setting this to True makes the solver invoke glp_simplex '''instead''' of glp_intopt`
  • glpk's intopt solver
    • msg_level or verbosity
    • br_tech or branching
    • bt_tech or backtracking
    • pp_tech or preprocessing
    • fp_heur or feasibility_pump
    • gmi_cuts or gomory_cuts
    • mir_cuts or mixed_int_rounding_cuts
    • cov_cuts or mixed_cover_cuts
    • clq_cuts or clique_cuts
    • tol_int or absolute_tolerance
    • tol_obj or relative_tolerance
    • mip_gap or mip_gap_tolerance
    • tm_lim or time_limit
    • out_freq or output_frequency
    • presolve or mip_presolver
    • binarize
  • glpk's simplex solver
    • msg_level or verbosity
    • meth or primal_v_dual
    • pricing
    • r_test or ratio_test
    • tol_bnd or tolerance_primal
    • tol_dj or tolerance_dual
    • tol_piv or tolerance_pivot
    • obj_ll or obj_lower_limit
    • obj_ul or obj_upper_limit
    • it_lim or iteration_limit
    • tm_lim or time_limit
    • out_frq or output_frequency
    • out_dly or output_delay
    • presolve

Feel free to suggest better names, or that we stick with only the GLPK names, etc.

Apply:

Attachments (3)

ticket_12736_add_glpk_solver_options.patch (20.5 KB) - added by r.gaia.cs 8 years ago.
trac_12736.patch (16.0 KB) - added by ncohen 8 years ago.
trac_12736_more_solver_options_for_glpk.patch (32.4 KB) - added by john_perry 8 years ago.

Download all attachments as: .zip

Change History (40)

Changed 8 years ago by r.gaia.cs

comment:1 Changed 8 years ago by r.gaia.cs

  • Cc r.gaia.cs added

comment:2 follow-up: Changed 8 years ago by john_perry

Raniere,

I don't like the idea of using strings for the solver parameter name. This is primarily for efficiency reasons; you have to remember that in my projects, I create a lot of small LPs, & don't want to deal with the string comparisons when an integer comparison will do much better.

I will upload another patch in a little bit (I finally figured out how to do it) and you can tell me what you think of this approach.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 8 years ago by john_perry

Replying to john_perry:

I don't like the idea of using strings for the solver parameter name.

Argh. Expressed myself badly: I meant,

...for the solver parameter value.

Although, given that the name is passed as a string, that may be a dumb objection.

comment:4 follow-up: Changed 8 years ago by ncohen

Helloooooo John !!!

Did you try whether it really makes a difference in practice ? I mean, if you replace the arguments by integers and feed it with the right integer, is it really faster ? I am afraid of Python's own efficiency, and I wonder if this may not be the real bottleneck :-)

Nathann

comment:5 in reply to: ↑ 4 Changed 8 years ago by john_perry

Replying to ncohen:

Did you try whether it really makes a difference in practice ? I mean, if you replace the arguments by integers and feed it with the right integer, is it really faster ? I am afraid of Python's own efficiency, and I wonder if this may not be the real bottleneck :-)

I create & modify these lp's all in Cython.

comment:6 Changed 8 years ago by ncohen

Oh. And does it make a difference, even in Cython ? I thought you were using the MixedIntegerLinearProgram?.solver_parameter function, but it looks like you directly use the backend ;-)

Perhaps the best for you would be to create an empty LP with the right parameters and clone it afterwards !

Nathann

comment:7 in reply to: ↑ 3 ; follow-up: Changed 8 years ago by r.gaia.cs

Replying to john_perry: John, using strings for name and value was my first and only ideia of how to do GLPKBackend.solver_parameter work with MixedIntegerLinearProgram?.solver_parameter with not import GLPKBackend module. For some parameter that only two values are avaliable it's possible to use True and False as values.

Raniere

comment:8 Changed 8 years ago by john_perry

The declaration of the solve() method of the MixedIntegerLinearProgram class is

    def solve(self, solver=None, log=0, objective_only=False):

Within the function we have

        self._backend.set_verbosity(log)

Makes sense, you say, right? Well, no, for three reasons:

  1. It renders moot the msg_lev solver paramter.
  2. It's a bit weird to initialize log to 0, since the solvers are already initialized to shut up in __cinit__ .
  3. Someone might want to see whether the solver is making any progress, by setting the verbosity through the solver parameter, rather than through solve() .

IMHO, the user ought to have that option; otherwise, there's no point to allowing a msg_level option at all. This is easily fixed, too: change the default value of log to None, and then the code to,

        if log != None:
          self._backend.set_verbosity(log)

This should retain compatibility with the previous setup.

What do you guys think?

comment:9 in reply to: ↑ 7 Changed 8 years ago by john_perry

Replying to r.gaia.cs:

using strings for name and value was my first and only ideia of how to do GLPKBackend.solver_parameter work with MixedIntegerLinearProgram?.solver_parameter with not import GLPKBackend module.

That's fine, and it even makes sense. The patch I'm preparing now will allow both approaches.

comment:10 Changed 8 years ago by john_perry

  • Description modified (diff)
  • Status changed from new to needs_review

I've added my version of a patch that should take care of the same thing. Although the coverage is more or less what we discussed, you may not like the approach I used to expose the GLPK constants: basically, I define them as module-level constants. I use this for both the names and the values. That way, the hard parts get done by integers, which should mitigate questions of efficiency. Well, sort of. :-/

The patch probably needs more doctests. Suggestions?

comment:11 follow-up: Changed 8 years ago by john_perry

Sorry -- the last patch was prepared in a hurry, right before teaching a class. I came back & discovered doctests, which are now fixed. This version passes all doctests in sage/numerical, so it should now be a question of adding new ones, & deciding if this captures the interface we really want.

comment:12 in reply to: ↑ 11 ; follow-ups: Changed 8 years ago by r.gaia.cs

Replying to john_perry: I look very quickly your patch and I like it. IMHO, is better pass the number values in miliseconds as in GLPK.

comment:13 in reply to: ↑ 12 Changed 8 years ago by john_perry

Replying to r.gaia.cs:

IMHO, is better pass the number values in miliseconds as in GLPK.

Thanks. I remember that I wanted to check that, but forgot. I'll fix that once I have more input.

comment:14 in reply to: ↑ 12 ; follow-up: Changed 8 years ago by john_perry

Replying to r.gaia.cs:

IMHO, is better pass the number values in miliseconds as in GLPK.

While working on this tonight, I realized that mip.pyx expects the timelimit parameter for both CPLEX and GLPK to be in seconds. To get around this, I added a new parameter, timelimit_intopt and timelimit_simplex; these accept milliseconds.

I also added a couple of new doctests, one to test this difference.

I would like to doctest output when we set msg_level to something other than ...off, but apparently the doctest facility doesn't recognize output produced by linked C code, only output produced by Python or Cython. If anyone has an idea on how to make that work, let me know.

Otherwise, I guess this is ready. (I've tested it with some of my own code, too.)

NEEDS REVIEW.

comment:15 in reply to: ↑ 14 ; follow-ups: Changed 8 years ago by ncohen

Helloooooooo John !!

While working on this tonight, I realized that mip.pyx expects the timelimit parameter for both CPLEX and GLPK to be in seconds. To get around this, I added a new parameter, timelimit_intopt and timelimit_simplex; these accept milliseconds.

Hmmm... Why do you name this parameter timelimit_intopt or timelimit_simplex and not timelimit_seconds, or something like that ? What does it specifically have to do with simplex or intopt ? I think it would be better to have only one parameter for timelimit, and if you think that milliseconds are better what about having only one parameter which accepts milliseconds ? By the way, it is already possible to define the timelimit up to milisecond with this argument. The code actually multiplies the input by 1000 before updating GLPK's value :-)

Nathann

comment:16 in reply to: ↑ 15 Changed 8 years ago by ncohen

Hmmmmmm O_o

I quickly looked at the code again and you apparently also thought of a timelimit_seconds parameter, though it appears in the code and not in the documentation ^^;

Nathann

comment:17 in reply to: ↑ 15 Changed 8 years ago by john_perry

Replying to ncohen:

Hmmm... Why do you name this parameter timelimit_intopt or timelimit_simplex and not timelimit_seconds, or something like that ?

I guess, just to be as specific as possible... in case someone needed it. I can't foresee all uses. But, for example, if someone wants to run simplex_then_intopt, she might want to set timelimit_intopt much higher than timelimit_simplex, because she knows that, in general, integer optimization takes much longer. Whereas if simplex doesn't end relatively quickly, there could be a bigger problem altogether.

comment:18 follow-up: Changed 8 years ago by ncohen

Oh, you mean that timelimit_intopt limits the time it takes for the whole ILP to be solved while timelimit_simplex just bounds the time it takes to run the simplex algorithm for a *relaxed* instance.. Okok, I just didn't get that part :-)

The list of parameters is so long that I begin to look for a way to do it with dichotomy instead of trying all the different parameters, but that would really be messy :-D

By the way, about the string and integer values for the parameters.. Does it really make a meaningful difference in runtimes ? Just being curious ^^;

Nathann

comment:19 in reply to: ↑ 18 ; follow-up: Changed 8 years ago by john_perry

Replying to ncohen:

The list of parameters is so long that I begin to look for a way to do it with dichotomy instead of trying all the different parameters, but that would really be messy :-D

Yeah. The long list of parameters can be a drag, but that's the price of enabling everything in GLPK. Alternately, we could decide to take a bunch of things off (would people actually use them?) but I'll bet that if we remove Gomory cuts, say, then we'll hear something in six weeks. ;-)

By the way, about the string and integer values for the parameters.. Does it really make a meaningful difference in runtimes ? Just being curious ^^;

If your focus is a handful of large linear programs, the penalty of Python strings v. machine integers might not matter. If your focus is a few thousand small linear programs, it could be an issue -- especially if you start mucking around w/options quite a bit.

Admittedly, in my own work I don't seem to need the options after all, and I've managed to revise, revise, revise my approach until I'm now creating only a few tens of programs, instead of a few thousand. So it's not that big an issue for me as it used to be. But it could become an issue again in the future.

comment:20 in reply to: ↑ 19 ; follow-up: Changed 8 years ago by ncohen

Hellooooooooo !!!

Yeah. The long list of parameters can be a drag, but that's the price of enabling everything in GLPK. Alternately, we could decide to take a bunch of things off (would people actually use them?) but I'll bet that if we remove Gomory cuts, say, then we'll hear something in six weeks. ;-)

Oh, I really do not mind, I was just looking at the code and thinking that if the user wanted to set the LAST parameter the code would test all others first. That's why I would have wanted some dichotomy ! :-D But that is just a natural reflex. Of course it is muuuuch better implemented like that.

If your focus is a handful of large linear programs, the penalty of Python strings v. machine integers might not matter. If your focus is a few thousand small linear programs, it could be an issue -- especially if you start mucking around w/options quite a bit.

Admittedly, in my own work I don't seem to need the options after all, and I've managed to revise, revise, revise my approach until I'm now creating only a few tens of programs, instead of a few thousand. So it's not that big an issue for me as it used to be. But it could become an issue again in the future.

Well, I was just surprised that comparing integers was *that much shorter* than comparing strings. And that it could become the bottleneck in LP codes :-)

Ok, I should give this code a proper review now !

Nathann

comment:21 in reply to: ↑ 20 Changed 8 years ago by john_perry

Replying to ncohen:

I was just looking at the code and thinking that if the user wanted to set the LAST parameter the code would test all others first. That's why I would have wanted some dichotomy ! :-D

That bothers me, too. I guess it should be organized so that the options most likely to come first would be tested first -- OR -- organize an if tree so that we have O(log(n)) comparisons instead of O(n). I'm completely open to suggestions on that.

Well, I was just surprised that comparing integers was *that much shorter* than comparing strings.

I don't have hard tests, but it seems pretty obvious; string comparison is quite a bit more expensive.

And that it could become the bottleneck in LP codes :-)

Unlikely, of course, but every bit helps :-)

comment:22 follow-up: Changed 8 years ago by ncohen

Hellooooooooo John !!

Since #12816 I am really into Sphinx tables, so perhaps it was a bad idea in this case... But what do you think of the following patch ? It applies on top of yours :-)

There is nothing else that seems wrong at this level... Only some details in the documentation. Give me your advice on that ! :-)

Nathann

Changed 8 years ago by ncohen

comment:23 Changed 8 years ago by ncohen

  • Description modified (diff)

comment:24 in reply to: ↑ 22 ; follow-up: Changed 8 years ago by john_perry

Replying to ncohen:

Since #12816 I am really into Sphinx tables...

I hadn't heard of them. Okay, the update is merely to the documentation. What's a good way of viewing it? Should it work even in text mode?

comment:25 in reply to: ↑ 24 ; follow-up: Changed 8 years ago by ncohen

I hadn't heard of them. Okay, the update is merely to the documentation. What's a good way of viewing it? Should it work even in text mode?

No, it should change nothing to the text mode, but the html doc should be much better !!

To compile it :

sage -b <your_branch> && sage -docbuild reference html

Nathann

comment:26 in reply to: ↑ 25 Changed 8 years ago by john_perry

Replying to ncohen:

To compile it :

sage -b <your_branch> && sage -docbuild reference html

Building the reference will take a while, won't it? If so, I'll wait until later to do this.

comment:27 Changed 8 years ago by ncohen

It depends on your luck. Usually it just compiles "the difference", but with some Sage install it may mean building them from the beginning :-)

You can add a "nice" before, if that is too heavy : sage -b <your_branch> && nice sage -docbuild reference html

Nathann

comment:28 Changed 8 years ago by john_perry

  • Dependencies set to #12309
  • Status changed from needs_review to needs_work

I am now getting, on two different machines, doctest failures in mip.pyx when #12736 is applied. They report different lines (different betas, different patches applied, I guess) but the basic idea is this:

Expected:
    Traceback (most recent call last):
    ...
    MIPSolverException: 'GLPK : Solution is undefined'
...
Got:
    Traceback (most recent call last):
    ...
    MIPSolverException: 'GLPK: An error occurred in the solver'

I At first, I thought it was just a matter of the space, but then I realized the errors were different, too. I'm going to look into this real quick.

Also, I added a dependency to the ticket description.

comment:29 Changed 8 years ago by john_perry

Okay, it looks as if I had forgotten to test mip.pyx this time! Basically, I had added to glpk_backend's solve() method a test for correctness when the timer runs out. I did this because otherwise mip.pyx thought there was a solution. (I don't remember exactly why.)

For now, I've changed the expected string in the doctest to reflect what the backend currently has, but another (better?) mofidication would be to change the backend to test for the precise signal (GLP_ETMLIM) and raise an exception telling the user that time ran out, so the solution will be invalid.

Which do youprefer?

comment:30 follow-up: Changed 8 years ago by ncohen

Helloooooooo John !!

I see no new method in your patch, but it is a bit hard to see what you *changed* inside of this large patch. By the way, if you modify your patch while mine already applies on top of it, there's a chance that it will break afterwards ^^;

I do not use this timelimit option myself, but it would surely be nice to be able to know whether the value of the variables when GLPK stops has any meaning.... Would it be possible to return a MIPSolverException or something similar when the computations stopped because of time ? It could probably detected along with the other ones (unbounded/impossible/no integer solution/...).

Nathann

comment:31 in reply to: ↑ 30 Changed 8 years ago by john_perry

Replying to ncohen:

I see no new method in your patch...

Correct. There wasn't one. I only changed a doctest.

...but it is a bit hard to see what you *changed* inside of this large patch. By the way, if you modify your patch while mine already applies on top of it, there's a chance that it will break afterwards ^^;

I've tried it with your patch on top, too. :-)

Would it be possible to return a MIPSolverException or something similar when the computations stopped because of time ?

Of course. I'll do that.

Changed 8 years ago by john_perry

comment:32 Changed 8 years ago by john_perry

  • Status changed from needs_work to needs_review

I've uploaded a new patch that restores things as they were.

Also, perhaps you wrote the wrong thing earlier: it occurred to me later that I might have uploaded the wrong patch... This one is definitely different, as you can find the GLP_ETMLIM and GLP_EITLIM signals used.

I looked at the generated html documentation, and it looked great! So, if you're okay with the current patch, I think we can award it a positive review?

comment:33 Changed 8 years ago by ncohen

  • Status changed from needs_review to positive_review

Helloooooooo John !!

Well, if you also agree with my changes then I guess this patch can go ! I regret that it is hard to test these timelimit exceptions, though I did a test myself by modifying a hard LP and solving it with a time limit. The exception is indeed thrown, and besides the last message printed by GLPK is

TIME LIMIT EXCEEDED; SEARCH TERMINATED

So well... It will show anyway :-)

On the practical side, all doctests pass with GLPK as the default solver, the documentation is nice, well. Good addition :-)

Nathann

comment:34 Changed 8 years ago by jdemeyer

Please fill in the Author/Reviewer? fields with your real names.

comment:35 Changed 8 years ago by john_perry

  • Authors set to John Perry, Nathann Cohen
  • Description modified (diff)
  • Reviewers set to Nathann Cohen, John Perry

comment:36 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-5.0 to sage-5.1

comment:37 Changed 7 years ago by jdemeyer

  • Merged in set to sage-5.1.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.