Opened 11 years ago

Last modified 11 years ago

#10341 closed enhancement

make MIP backend interface more Python-ic — at Version 33

Reported by: malb Owned by: ncohen
Priority: major Milestone: sage-4.6.2
Component: linear programming Keywords: LP, MIP
Cc: ncohen Merged in:
Authors: Martin Albrecht, Nathann Cohen Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by ncohen)

Sage 4.6.1 will contain a new set of backend classes for mixed integer programming, which will make it easier to write interfaces for other solvers. There has been some off-list discussion about this interface and the follow changes were agreed upon:

  • :func: add_linear_constraint should allow lb and ub instead of direction and one bound, it's more expressive.
  • :func:add_variable should return the index of the newly created variable instead of the next index.
  • change :func:add_linear_constraint to accept any iterable of the form [(c,v) ...]
  • min and max should be lower bound (or lb) and upper bound (or ub) to conform to MIP conventions
  • allow parameters in :func:add_variable

The patches are to be applied in this order:

  • mip_interface_changes.patch
  • trac_10431-part2.patch
  • trac_10431-part3.patch
  • trac_10431-part4.patch

Change History (36)

Changed 11 years ago by malb

comment:1 Changed 11 years ago by malb

  • Status changed from new to needs_work

The attached patch makes the necessary changes to the GenericBackend? and the GLPK class. Coin and CPLEX are not done yet.

comment:2 Changed 11 years ago by ncohen

  • Status changed from needs_work to needs_review

And here are the modifications to CPLEX and Coin. The doctests pass, plus all the methods of Graph/Digraph/GenericGraph? pass with both GLPK, Coin and CPLEX.

I agreed with you patch almost everywhere ! The only modification I made is the "constraints" argument you had placed in the add_constraint method. I really didn't get why you picked this name. I made it "coefficients", as the pairs are actually the coefficient of the sparse matrix, but of course we can discuss it during the review :-)

Nathann

Changed 11 years ago by ncohen

comment:3 Changed 11 years ago by ncohen

  • Authors changed from Martin Albrecht to Martin Albrecht, Nathann Cohen
  • Description modified (diff)

comment:4 follow-up: Changed 11 years ago by malb

Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review? I can review your's then?

comment:5 follow-up: Changed 11 years ago by malb

I noticed two more interface "issues" we may decide to "fix":

  • p.row() returns two lists (indices,coeffs) right now, we could change it to return [(v1,c1),...,(vn,cn)].
  • We could allow an optional obj parameter which contains the coefficient of the new variable in the objective function?

Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.

comment:6 in reply to: ↑ 4 Changed 11 years ago by ncohen

Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review?

Well, I checked yours while working on it and fixed this "constraints->coefficients" inside of my patch ... If you review mine, both files will have be read by two sets of eyes, so I guess it would count as a valid review :-)

Nathann

comment:7 in reply to: ↑ 5 Changed 11 years ago by ncohen

Replying to malb:

I noticed two more interface "issues" we may decide to "fix":

  • p.row() returns two lists (indices,coeffs) right now, we could change it to return [(v1,c1),...,(vn,cn)].
  • We could allow an optional obj parameter which contains the coefficient of the new variable in the objective function?

Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.

Though they sound right... :-)

Nathann

comment:8 Changed 11 years ago by malb

One more thing: you assume in your interface that one can change column and row names after those have been created. However, this isn't easy in SCIP (I could possibly hack around it by hacking around the API). Can we change that, e.g. by adding name as an optional parameter to add_variable() and add_linear_constraint()?

comment:9 Changed 11 years ago by ncohen

Well, if that's how SCIP's interface is written, I see no way around that .... :-/

I just hope all these optional parameters won't hurt the interface's efficiency. I am eager to have a working stable version of it in Sage, that would let me rebase other patches that have been made invalid because of the work we are doing here. I may spend some time trying to do some performance analysis on the interface afterwards :-D

I'm already sick of these names... They are totally useless for all practical purposes, their only aim is to produce different results in p.show() (which doesn't even use them for the moment) or when writing the LP to files, which is totally orthogonal to actually *solving* them. Besides, each solver has a different way of managing them, especially Coin. I spent nights just fighting with it, for nothing in the end (oh yes.. it compiles, and there was no regression since the first version of LP.. Who the hell cares ?)

Just to know : are you yourself using names for variables and constraints ?

Nathann

P.S. : Oh, yes.. Let's add this optional argument if there is no way around. If at some point we find a trick, it won't be as hard to remove an optional argument from an interface as any other, as most of the method calls don't include it.

So in the end

comment:10 Changed 11 years ago by malb

I usually set them (e.g. when converting from polynomial systems), but I guess I'm not really using them usually.

comment:11 Changed 11 years ago by ncohen

So what are they useful for in SCIP ? Can it produce outputs at .lp or .mpw files ?

Nathann

comment:12 follow-up: Changed 11 years ago by malb

Yep, and other formats. You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.

comment:13 in reply to: ↑ 12 Changed 11 years ago by ncohen

Replying to malb:

Yep, and other formats.

Nice !!

You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.

Same in GLPK. For the moment, I do not do any work on the matrix once it has been defined. Or rather, I only add more information, but never remove/read anything from it O_o

Nathann

comment:14 Changed 11 years ago by malb

  • Status changed from needs_review to needs_work

Okay, what remains to be done then:

  • drop setting a variable name by col_name()
  • drop setting a constraint name by row_name()
  • add an optional parameter name to add_variable and add_linear_constraint
  • (optional) make row() return a list [(c1,v1),...,(cn,vn)]
  • (optional) add an optional parameter obj to add_variable
  • (optional) unify get/set_objective_value

Anything I missed?

comment:15 Changed 11 years ago by ncohen

Oh... Hem.. Actually, if we are to actually *drop* these methods... -_-

I can not stand these names... Here is the problem : right now, there are no names when you add variables or constraints in MixedIntegerLinearProgram?. When p.show() is called, the names are filled with something which makes them at least distinct and recognizable... But if we want to drop these methods, then p.show() has to be rewritten too.. -_-

Nathann

comment:16 follow-up: Changed 11 years ago by malb

I would suggest an optional parameter "name=None" for add_variable()?

comment:17 in reply to: ↑ 16 Changed 11 years ago by ncohen

Replying to malb:

I would suggest an optional parameter "name=None" for add_variable()?

If you need it for SCIP... :-/

Changed 11 years ago by malb

comment:18 Changed 11 years ago by malb

  • Description modified (diff)

comment:19 Changed 11 years ago by malb

I did the changes below for the GenericBackend? and GLPK. Nathann, can you do Coin and CPLEX? I'll update my SCIP interface to see whether now I can pass all doctests with it?

  • drop setting a variable name by col_name()
  • drop setting a constraint name by row_name()
  • add an optional parameter name to add_variable and add_linear_constraint
  • add an optional parameter obj to add_variable
  • unify get/set_objective_coefficient

comment:20 Changed 11 years ago by ncohen

Hello !!!

Martin, the point is that I do not know how to get rid of setting names through col_name and row_name for the moment ! It would mean show() would have to be rewritten, and I do not know how to do it nicely ! :-/

If you have any idea...

Nathann

comment:21 Changed 11 years ago by ncohen

My best bet would be to keep the name argument in row/col_name, and have SCIP ignore it when it is given or raise an exception. It is already the case for Coin which does not accept names for everything. We could then define a name when the variable/constraint is added (which would work for SCIP), or later on (which wouldn't for SCIP)..

Then again, dropping everything name-related may be the best idea in the end. It's totally useless, it adds arguments/methods everywhere, and no two solvers handle it the same way. What do you think of it ?

Nathann

comment:22 follow-up: Changed 11 years ago by malb

Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?

comment:23 in reply to: ↑ 22 Changed 11 years ago by ncohen

Replying to malb:

Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?

Well, if your doctests pass then I guess the docstring is not enough. Anyway, for the moment the variable's names are just "cached" inside of MixedIntegerLinearProgram?, because I thought stupid to call routines to updates the names unless someone wants to show() it or to export it to a file (which one never does when actually trying to solve them). There is a method named update_variable_name, which you commented in your patch, which does just that : before print show() or exporting the result, it updates the variable's names. This is the only way for the variables to have a name, and as it is usually done after the LP has been generated, it has to use the set_col_name commands. I remember having written somewhere that it was broken at some point, but I am never too eager to deal with this name mess.

Oh, of course everything works fine if you comment #update_variable_names. The .show() method will just print x_1, x_2, etc and the .lp and .mps files will have the same kind of names. I do not think it is that awful, and removing all this name-related stuff from the class will just make it easier to work with.

I mean, it would have been nice to be able to deal with names, but :

  • The solvers all have a different way to deal with them. If we want to be able to deal with them all, we'll just end up with an awkward interface
  • You don't use them, I don't use them, they play no part in the actual solving of LP which stil;l has to be considered the main point here
  • The LP file formats deal with names, the solver's libraries too, but then again people write LP in languages like AMPL or Mathprog (for GLPK), which have a very basic set of commands available as they are only meant to deal with LP. In Sage, we have Python which is infinitely more expressive...

It looks like a lot of work for something we're not even sure we can use...

Nathann

comment:24 follow-up: Changed 11 years ago by malb

So what about:

  • If the user wants to use names then (s)he has to provide them at creation time of variables or constraints.
  • Solvers which don't support them, simply ignore them.
  • In show, if there is a custom name set (queried by backend.col_name() and backend.row_name()) then it is used for printing
  • When writing lp we do the same?

comment:25 in reply to: ↑ 24 Changed 11 years ago by ncohen

Replying to malb:

So what about:

  • If the user wants to use names then (s)he has to provide them at creation time of variables or constraints.
  • Solvers which don't support them, simply ignore them.
  • In show, if there is a custom name set (queried by backend.col_name() and backend.row_name()) then it is used for printing
  • When writing lp we do the same?

.....

Sounds logic enough :-p

It just exposes I had a stupid idea when I wrote this set_row_name in the first place. And I guess it's only fitting that I clean my own mess. Please comment the code from mip.py you need to make your patch run and pass doctests, and I'll change what is needed there when dealing with Coin and CPLEX !

Nathann

comment:26 Changed 11 years ago by malb

part3 passes doctests for me. However, so far it simply ignores names.

comment:27 Changed 11 years ago by malb

PS: Can you give me some examples which take about 1 minute to compute with GLPK from the graph module? I want to test the performance of the SCIP solver for those.

comment:28 Changed 11 years ago by ncohen

Oh ! To compare performances, you could just try a sage -t on graph.py digraph.py and generic_graph.py.

I guess the performances would then depend on the kind of problem : all the solvers I tried until now are just awful for the minor method :

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK")
CPU times: user 10.59 s, sys: 0.01 s, total: 10.60 s
Wall time: 10.62 s
{0: [(1, 3), (1, 2), (3, 3), (4, 4), (1, 4), (2, 3), (3, 4)], 1: [(2, 1), (2, 2), (3, 2), (4, 2), (2, 0)], 2: [(0, 0), (1, 0), (0, 1), (0, 2)], 3: [(1, 1)]}

The methods raises an exception when 4 becomes 5, which is natural as the problem has no solution, but it takes one lifetime to get this answer from GLPK. If SCIP can get it quickly, that's just a miracle :-D

Another problem of the same kind :

n = 7
graphs.Grid2dGraph(n,n).disjoint_routed_paths([((0,0), (n-1,n-1)), ((0, n-1), (n-1, 0))])

This already takes 10 seconds with CPLEX. Try n=8 if you are patient !

The answer is also an infeasibility, and the solvers are having a hard time proving it.. But I guess it just comes from my bad LP formulation ^^;

A different type of LP formulation, on random graphs :

from sage.graphs.graph_coloring import edge_coloring
%timeit edge_coloring(graphs.RandomGNP(30,.3), solver = "GLPK", verbose = 2)
5 loops, best of 3: 4.47 s per loop

Nathann

comment:29 follow-up: Changed 11 years ago by malb

On my machine:

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK", verbose=0)
CPU times: user 2.81 s, sys: 0.00 s, total: 2.81 s
Wall time: 2.82 s
{0: [(1, 3), (1, 2), (1, 1), (2, 3), (2, 4)], 1: [(3, 0), (1, 0), (3, 1), (2, 0)], 2: [(2, 1)], 3: [(4, 4), (2, 2), (3, 2), (4, 2), (4, 3), (3, 4)]}

It seems there's still a bug in my SCIP interface:

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "SCIP", verbose=0)
CPU times: user 1.88 s, sys: 0.04 s, total: 1.92 s
Wall time: 1.94 s
{0: [(0, 0), (0, 1)], 1: [(1, 1)], 2: [(2, 1), (2, 2), (1, 0), (2, 0)], 3: [(1, 2), (0, 2)]}

comment:30 in reply to: ↑ 29 Changed 11 years ago by ncohen

Replying to malb:

It seems there's still a bug in my SCIP interface:

Why so ? The answer is not unique ! :-)

Nathann

comment:31 Changed 11 years ago by malb

That's a critical piece of information I missed :)

comment:32 Changed 11 years ago by ncohen

  • Status changed from needs_work to needs_review

Plus CPLEX, plus Coin ! :-)

Nathann

comment:33 Changed 11 years ago by ncohen

  • Description modified (diff)
Note: See TracTickets for help on using tickets.