Opened 8 years ago

Closed 8 years ago

#10341 closed enhancement (fixed)

make MIP backend interface more Python-ic

Reported by: malb Owned by: ncohen
Priority: major Milestone: sage-4.6.2
Component: linear programming Keywords: LP, MIP
Cc: ncohen Merged in: sage-4.6.2.alpha3
Authors: Martin Albrecht, Nathann Cohen Reviewers: Martin Albrecht, Nathann Cohen
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by ncohen)

Sage 4.6.1 will contain a new set of backend classes for mixed integer programming, which will make it easier to write interfaces for other solvers. There has been some off-list discussion about this interface and the follow changes were agreed upon:

  • :func: add_linear_constraint should allow lb and ub instead of direction and one bound, it's more expressive.
  • :func:add_variable should return the index of the newly created variable instead of the next index.
  • change :func:add_linear_constraint to accept any iterable of the form [(c,v) ...]
  • min and max should be lower bound (or lb) and upper bound (or ub) to conform to MIP conventions
  • allow parameters in :func:add_variable

The patches are to be applied in this order:

  • mip_interface_changes.patch
  • trac_10431-part2.patch
  • trac_10431-part3.patch
  • trac_10431-part4.patch
  • trac_10431-part5.patch
  • trac_10431-part6.patch

Attachments (7)

mip_interface_changes.patch (58.9 KB) - added by malb 8 years ago.
trac_10431-part2.patch (57.7 KB) - added by ncohen 8 years ago.
trac_10431-part3.patch (30.3 KB) - added by malb 8 years ago.
trac_10431-part4.patch (29.2 KB) - added by ncohen 8 years ago.
trac_10431-part5.patch (14.5 KB) - added by ncohen 8 years ago.
mip_interface_changes-fixed_commit_message.patch (58.9 KB) - added by ncohen 8 years ago.
trac_10431-part6.patch (584 bytes) - added by ncohen 8 years ago.

Download all attachments as: .zip

Change History (65)

Changed 8 years ago by malb

comment:1 Changed 8 years ago by malb

  • Status changed from new to needs_work

The attached patch makes the necessary changes to the GenericBackend? and the GLPK class. Coin and CPLEX are not done yet.

comment:2 Changed 8 years ago by ncohen

  • Status changed from needs_work to needs_review

And here are the modifications to CPLEX and Coin. The doctests pass, plus all the methods of Graph/Digraph/GenericGraph? pass with both GLPK, Coin and CPLEX.

I agreed with you patch almost everywhere ! The only modification I made is the "constraints" argument you had placed in the add_constraint method. I really didn't get why you picked this name. I made it "coefficients", as the pairs are actually the coefficient of the sparse matrix, but of course we can discuss it during the review :-)

Nathann

Changed 8 years ago by ncohen

comment:3 Changed 8 years ago by ncohen

  • Authors changed from Martin Albrecht to Martin Albrecht, Nathann Cohen
  • Description modified (diff)

comment:4 follow-up: Changed 8 years ago by malb

Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review? I can review your's then?

comment:5 follow-up: Changed 8 years ago by malb

I noticed two more interface "issues" we may decide to "fix":

  • p.row() returns two lists (indices,coeffs) right now, we could change it to return [(v1,c1),...,(vn,cn)].
  • We could allow an optional obj parameter which contains the coefficient of the new variable in the objective function?

Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.

comment:6 in reply to: ↑ 4 Changed 8 years ago by ncohen

Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review?

Well, I checked yours while working on it and fixed this "constraints->coefficients" inside of my patch ... If you review mine, both files will have be read by two sets of eyes, so I guess it would count as a valid review :-)

Nathann

comment:7 in reply to: ↑ 5 Changed 8 years ago by ncohen

Replying to malb:

I noticed two more interface "issues" we may decide to "fix":

  • p.row() returns two lists (indices,coeffs) right now, we could change it to return [(v1,c1),...,(vn,cn)].
  • We could allow an optional obj parameter which contains the coefficient of the new variable in the objective function?

Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.

Though they sound right... :-)

Nathann

comment:8 Changed 8 years ago by malb

One more thing: you assume in your interface that one can change column and row names after those have been created. However, this isn't easy in SCIP (I could possibly hack around it by hacking around the API). Can we change that, e.g. by adding name as an optional parameter to add_variable() and add_linear_constraint()?

comment:9 Changed 8 years ago by ncohen

Well, if that's how SCIP's interface is written, I see no way around that .... :-/

I just hope all these optional parameters won't hurt the interface's efficiency. I am eager to have a working stable version of it in Sage, that would let me rebase other patches that have been made invalid because of the work we are doing here. I may spend some time trying to do some performance analysis on the interface afterwards :-D

I'm already sick of these names... They are totally useless for all practical purposes, their only aim is to produce different results in p.show() (which doesn't even use them for the moment) or when writing the LP to files, which is totally orthogonal to actually *solving* them. Besides, each solver has a different way of managing them, especially Coin. I spent nights just fighting with it, for nothing in the end (oh yes.. it compiles, and there was no regression since the first version of LP.. Who the hell cares ?)

Just to know : are you yourself using names for variables and constraints ?

Nathann

P.S. : Oh, yes.. Let's add this optional argument if there is no way around. If at some point we find a trick, it won't be as hard to remove an optional argument from an interface as any other, as most of the method calls don't include it.

So in the end

comment:10 Changed 8 years ago by malb

I usually set them (e.g. when converting from polynomial systems), but I guess I'm not really using them usually.

comment:11 Changed 8 years ago by ncohen

So what are they useful for in SCIP ? Can it produce outputs at .lp or .mpw files ?

Nathann

comment:12 follow-up: Changed 8 years ago by malb

Yep, and other formats. You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.

comment:13 in reply to: ↑ 12 Changed 8 years ago by ncohen

Replying to malb:

Yep, and other formats.

Nice !!

You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.

Same in GLPK. For the moment, I do not do any work on the matrix once it has been defined. Or rather, I only add more information, but never remove/read anything from it O_o

Nathann

comment:14 Changed 8 years ago by malb

  • Status changed from needs_review to needs_work

Okay, what remains to be done then:

  • drop setting a variable name by col_name()
  • drop setting a constraint name by row_name()
  • add an optional parameter name to add_variable and add_linear_constraint
  • (optional) make row() return a list [(c1,v1),...,(cn,vn)]
  • (optional) add an optional parameter obj to add_variable
  • (optional) unify get/set_objective_value

Anything I missed?

comment:15 Changed 8 years ago by ncohen

Oh... Hem.. Actually, if we are to actually *drop* these methods... -_-

I can not stand these names... Here is the problem : right now, there are no names when you add variables or constraints in MixedIntegerLinearProgram?. When p.show() is called, the names are filled with something which makes them at least distinct and recognizable... But if we want to drop these methods, then p.show() has to be rewritten too.. -_-

Nathann

comment:16 follow-up: Changed 8 years ago by malb

I would suggest an optional parameter "name=None" for add_variable()?

comment:17 in reply to: ↑ 16 Changed 8 years ago by ncohen

Replying to malb:

I would suggest an optional parameter "name=None" for add_variable()?

If you need it for SCIP... :-/

Changed 8 years ago by malb

comment:18 Changed 8 years ago by malb

  • Description modified (diff)

comment:19 Changed 8 years ago by malb

I did the changes below for the GenericBackend? and GLPK. Nathann, can you do Coin and CPLEX? I'll update my SCIP interface to see whether now I can pass all doctests with it?

  • drop setting a variable name by col_name()
  • drop setting a constraint name by row_name()
  • add an optional parameter name to add_variable and add_linear_constraint
  • add an optional parameter obj to add_variable
  • unify get/set_objective_coefficient

comment:20 Changed 8 years ago by ncohen

Hello !!!

Martin, the point is that I do not know how to get rid of setting names through col_name and row_name for the moment ! It would mean show() would have to be rewritten, and I do not know how to do it nicely ! :-/

If you have any idea...

Nathann

comment:21 Changed 8 years ago by ncohen

My best bet would be to keep the name argument in row/col_name, and have SCIP ignore it when it is given or raise an exception. It is already the case for Coin which does not accept names for everything. We could then define a name when the variable/constraint is added (which would work for SCIP), or later on (which wouldn't for SCIP)..

Then again, dropping everything name-related may be the best idea in the end. It's totally useless, it adds arguments/methods everywhere, and no two solvers handle it the same way. What do you think of it ?

Nathann

comment:22 follow-up: Changed 8 years ago by malb

Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?

comment:23 in reply to: ↑ 22 Changed 8 years ago by ncohen

Replying to malb:

Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?

Well, if your doctests pass then I guess the docstring is not enough. Anyway, for the moment the variable's names are just "cached" inside of MixedIntegerLinearProgram?, because I thought stupid to call routines to updates the names unless someone wants to show() it or to export it to a file (which one never does when actually trying to solve them). There is a method named update_variable_name, which you commented in your patch, which does just that : before print show() or exporting the result, it updates the variable's names. This is the only way for the variables to have a name, and as it is usually done after the LP has been generated, it has to use the set_col_name commands. I remember having written somewhere that it was broken at some point, but I am never too eager to deal with this name mess.

Oh, of course everything works fine if you comment #update_variable_names. The .show() method will just print x_1, x_2, etc and the .lp and .mps files will have the same kind of names. I do not think it is that awful, and removing all this name-related stuff from the class will just make it easier to work with.

I mean, it would have been nice to be able to deal with names, but :

  • The solvers all have a different way to deal with them. If we want to be able to deal with them all, we'll just end up with an awkward interface
  • You don't use them, I don't use them, they play no part in the actual solving of LP which stil;l has to be considered the main point here
  • The LP file formats deal with names, the solver's libraries too, but then again people write LP in languages like AMPL or Mathprog (for GLPK), which have a very basic set of commands available as they are only meant to deal with LP. In Sage, we have Python which is infinitely more expressive...

It looks like a lot of work for something we're not even sure we can use...

Nathann

comment:24 follow-up: Changed 8 years ago by malb

So what about:

  • If the user wants to use names then (s)he has to provide them at creation time of variables or constraints.
  • Solvers which don't support them, simply ignore them.
  • In show, if there is a custom name set (queried by backend.col_name() and backend.row_name()) then it is used for printing
  • When writing lp we do the same?

comment:25 in reply to: ↑ 24 Changed 8 years ago by ncohen

Replying to malb:

So what about:

  • If the user wants to use names then (s)he has to provide them at creation time of variables or constraints.
  • Solvers which don't support them, simply ignore them.
  • In show, if there is a custom name set (queried by backend.col_name() and backend.row_name()) then it is used for printing
  • When writing lp we do the same?

.....

Sounds logic enough :-p

It just exposes I had a stupid idea when I wrote this set_row_name in the first place. And I guess it's only fitting that I clean my own mess. Please comment the code from mip.py you need to make your patch run and pass doctests, and I'll change what is needed there when dealing with Coin and CPLEX !

Nathann

comment:26 Changed 8 years ago by malb

part3 passes doctests for me. However, so far it simply ignores names.

comment:27 Changed 8 years ago by malb

PS: Can you give me some examples which take about 1 minute to compute with GLPK from the graph module? I want to test the performance of the SCIP solver for those.

comment:28 Changed 8 years ago by ncohen

Oh ! To compare performances, you could just try a sage -t on graph.py digraph.py and generic_graph.py.

I guess the performances would then depend on the kind of problem : all the solvers I tried until now are just awful for the minor method :

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK")
CPU times: user 10.59 s, sys: 0.01 s, total: 10.60 s
Wall time: 10.62 s
{0: [(1, 3), (1, 2), (3, 3), (4, 4), (1, 4), (2, 3), (3, 4)], 1: [(2, 1), (2, 2), (3, 2), (4, 2), (2, 0)], 2: [(0, 0), (1, 0), (0, 1), (0, 2)], 3: [(1, 1)]}

The methods raises an exception when 4 becomes 5, which is natural as the problem has no solution, but it takes one lifetime to get this answer from GLPK. If SCIP can get it quickly, that's just a miracle :-D

Another problem of the same kind :

n = 7
graphs.Grid2dGraph(n,n).disjoint_routed_paths([((0,0), (n-1,n-1)), ((0, n-1), (n-1, 0))])

This already takes 10 seconds with CPLEX. Try n=8 if you are patient !

The answer is also an infeasibility, and the solvers are having a hard time proving it.. But I guess it just comes from my bad LP formulation ^^;

A different type of LP formulation, on random graphs :

from sage.graphs.graph_coloring import edge_coloring
%timeit edge_coloring(graphs.RandomGNP(30,.3), solver = "GLPK", verbose = 2)
5 loops, best of 3: 4.47 s per loop

Nathann

comment:29 follow-up: Changed 8 years ago by malb

On my machine:

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK", verbose=0)
CPU times: user 2.81 s, sys: 0.00 s, total: 2.81 s
Wall time: 2.82 s
{0: [(1, 3), (1, 2), (1, 1), (2, 3), (2, 4)], 1: [(3, 0), (1, 0), (3, 1), (2, 0)], 2: [(2, 1)], 3: [(4, 4), (2, 2), (3, 2), (4, 2), (4, 3), (3, 4)]}

It seems there's still a bug in my SCIP interface:

sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "SCIP", verbose=0)
CPU times: user 1.88 s, sys: 0.04 s, total: 1.92 s
Wall time: 1.94 s
{0: [(0, 0), (0, 1)], 1: [(1, 1)], 2: [(2, 1), (2, 2), (1, 0), (2, 0)], 3: [(1, 2), (0, 2)]}

comment:30 in reply to: ↑ 29 Changed 8 years ago by ncohen

Replying to malb:

It seems there's still a bug in my SCIP interface:

Why so ? The answer is not unique ! :-)

Nathann

comment:31 Changed 8 years ago by malb

That's a critical piece of information I missed :)

comment:32 Changed 8 years ago by ncohen

  • Status changed from needs_work to needs_review

Plus CPLEX, plus Coin ! :-)

Nathann

comment:33 Changed 8 years ago by ncohen

  • Description modified (diff)

comment:34 follow-up: Changed 8 years ago by malb

The patch looks good, I'm running long doctests on sage.math now. Can I assume you ran doctests for CPLEX and Coin (i.e. with the optional packages installed)? I think the only remaining issue is:

sage: p = MixedIntegerLinearProgram()
sage: p.new_variable(name="foo")                                                                                                                                                                               
MIPVariable of dimension 1.                                                                                                                                                                                    
sage: x = _
sage: p.add_constraint(0 <= x[1] <= 2)
sage: p.show()
Maximization:
  
Constraints:
  x_0 <= 2.0
Variables:
  x_0 is a real variable (min=0.0, max=+oo)

For some reason MIPVariable ignores the optional name parameter and hardcodes "x". Is there a reason for this?

comment:35 in reply to: ↑ 34 Changed 8 years ago by ncohen

Is there a reason for this?

Not really. Technically, you will find this 'x' inside of the repr method in LinearFunction?, which does not care at all whether the variable has a name or not (it does not even know the MIPVariable object itself, just its id in the LP, and the MIPVariable itself does not know its name (only the LP does). All of is has to be rewritten, it is ugly and inefficient code, but would like to find a way to deal with this that does not make every LP waste time on totally useless name-related operations.

I still think we should just remove anything in these classes which is name-related for the same eternal reasons. The code is ugly, hard to understand, it slows the class down, and it is totally useless for solving LP anyway :-/

Nathann

comment:36 Changed 8 years ago by ncohen

Oh, and I of course checked all the patches on CPLEX and Coin each time. All tests do not pass, because of old bugs :

  • Coin does not support names, so some doctests in mip.py fail because of that
  • for some reason the variable_min/max command on CPLEX variables that are not bounded return a BIG NUMBER and not None when the variable is not bounded (the doctest of a p.show() in mip.py fails because of that)
  • There is a doctest in generic_graph that fails when one uses LP and Coin to solve a max_flow problem on graphs whose edges are randomly weighted. Coin sometimes answers 8.16540000001 instead of 8.1654 and the tests is an equality.

In the end, the remaining bugs (and they have been there for quite a while, but I did not really worry....) are all about displaying stuff. One is about numerical noise on a very specific case (Coin + Flows. One also has to know that by default Sage uses a Cython implementation of the Ford-Fulkerson algorithm to compute a flow, and not the LP unless asked otherwise -- I made the docstring of Graph.flow mention this noise).

Short of this, all tests pass in graph.py, digraph.py and generic_graph.py. So all the methods there do not see the difference between GLPK, Coin or CPLEX (except for the time spent testing the docstrings) ;-)

Would it be possible to address them in a future ticket ? And could I please have your thoughts about removing all these names from the class ? :-)

Nathann

comment:37 follow-up: Changed 8 years ago by malb

Would it be possible to address them in a future ticket ?

Agreed.

And could I please have your thoughts about removing all these names from the class ? :-)

I'm unsure about this. But it is certainly undesirable to allow "name" as a parameter in new_variable() and then to ignore it. My main motivation for keeping names around would be to make debugging easier: I can check whether my mixed integer program looks like I want it to look without getting confused about variable name matching.

comment:38 in reply to: ↑ 37 Changed 8 years ago by ncohen

I'm unsure about this. But it is certainly undesirable to allow "name" as a parameter in new_variable() and then to ignore it. My main motivation for keeping names around would be to make debugging easier: I can check whether my mixed integer program looks like I want it to look without getting confused about variable name matching.

It can be very useful like that for sure.... But I really can not stand that the performances of a solver could somehow suffer because of something which is not used 99% of the time : when the LP are known to work. I really hope this could be fixed soon :-/

Nathann

comment:39 Changed 8 years ago by malb

How does performance suffer exactly? Names aren't used except for showing and writing lp files?

comment:40 Changed 8 years ago by ncohen

names aren't used except in these cases, but when the mip.py file will have been rewritten any creation of variable will mean storing+concatenating names, any call to add_variables/constraints already creates variables that are to receive potential names. I still do not know the loss between calls of cdef methods and cpdef, and one of the reasons why we have so many optional arguments is to deal with names. And so much in the backends is about names (I can not stand to look at code when I know it is never used. I immediately want to remove it)

I mean, all the LP interface does is create constraints and variables. When you add name considerations to all of these things, you just add time to earn nothing. I will have fun profiling all this when it will be in to check what we could earn by removing them or writing it more smartly. Perhaps I'm just complaining for nothing as usual... Even though cutting the code by half seems tempting just for its sake :-D

Nathann

comment:41 Changed 8 years ago by malb

I don't care much about writing LP files but printing of MIPVariables (p.show()) should allow custom names. I'm not saying it should be supported in the backends but the frontend should. It's just the level of convenience users will expect.

comment:42 Changed 8 years ago by ncohen

Here is another patch to fix the .show method and the names in general. Even the .lp files exported through Sage bear the good names !

I also fixed the CPLEX problem with infinite bounds, so that all test pass with CPLEX too !

Nathann

comment:43 Changed 8 years ago by malb

You accidentally hardcoded GLPK it seems in generic_backend.pyx?

  solver = "GLPK" 

comment:44 Changed 8 years ago by ncohen

Gloops.... Thank you for spotting it, this would have been nasty ^^;

I use that line to test LP with GLPK even though I have CPLEX installed....

Patch updated !

Nathann

comment:45 Changed 8 years ago by malb

Do doctests sill pass with CPLEX?

comment:46 Changed 8 years ago by ncohen

Of course of course !! I always do a last check with GLPK when I am done with the patch, as it is the "most important" solver for Sage :-)

Nathann

comment:47 Changed 8 years ago by malb

  • Reviewers set to Martin Albrecht, Nathann Cohen
  • Status changed from needs_review to positive_review

Cool, positive review then!

comment:48 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-4.6.1 to sage-4.6.2

comment:49 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_info

Why is there a trac_10431-part5.patch? Should it be applied?

comment:50 Changed 8 years ago by ncohen

  • Description modified (diff)
  • Status changed from needs_info to needs_work

It should ! I forgot to update the ticket's description :-)

Nathann

comment:51 Changed 8 years ago by ncohen

  • Status changed from needs_work to positive_review

comment:52 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_work

Please fix the commit messages to contain the correct ticket number 10341 on the first line.

Changed 8 years ago by ncohen

Changed 8 years ago by ncohen

comment:53 Changed 8 years ago by ncohen

  • Status changed from needs_work to positive_review

Done ! Sorry about that !

Nathann

comment:54 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_work

Commit message of first patch still needs to be fixed.

comment:55 Changed 8 years ago by ncohen

  • Status changed from needs_work to positive_review

OOps....

I just uploaded a copy of this one (I can not overwrite it) ^^;

Nathann

comment:56 follow-up: Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_work

Please check Sphinx syntax:

docstring of sage.numerical.mip.MixedIntegerLinearProgram.add_constraint:72: (WARNING/2) Literal block expected; none found.

comment:57 in reply to: ↑ 56 Changed 8 years ago by ncohen

  • Description modified (diff)
  • Status changed from needs_work to positive_review

Replying to jdemeyer:

Please check Sphinx syntax:

Right.... Here it is !

Nathann

Changed 8 years ago by ncohen

comment:58 Changed 8 years ago by jdemeyer

  • Merged in set to sage-4.6.2.alpha3
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.