#11606 closed enhancement (fixed)
simplify constraints in linear programs
Reported by: | John Perry | Owned by: | Nathann Cohen |
---|---|---|---|
Priority: | major | Milestone: | sage-5.0 |
Component: | linear programming | Keywords: | sd32 |
Cc: | Nathann Cohen | Merged in: | sage-5.0.beta1 |
Authors: | John Perry | Reviewers: | Nathann Cohen |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
MixedIntegerLinearProgram? doesn't notice when it is given constraints that already exist in the program, or that are constant multiples. A simple example:
sage: lp = MixedIntegerLinearProgram() sage: for each in xrange(10): ....: lp.add_constraint(lp[0]-lp[1],min=1) ....: sage: lp.show() Maximization: Constraints: 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 1.0 <= x_0 -x_1 Variables: x_0 is a continuous variable (min=0.0, max=+oo) x_1 is a continuous variable (min=0.0, max=+oo)
Notice that the same constraint appears 10 different times.
Apply:
Attachments (3)
Change History (43)
comment:1 Changed 11 years ago by
Changed 11 years ago by
Attachment: | trac_11606_add_only_new_constraints_to_lp.patch added |
---|
comment:2 Changed 11 years ago by
Authors: | → john_perry |
---|---|
Status: | new → needs_review |
The attached file is a first attempt at solving the problem. An optional argument, check_redundant
, which defaults to False
, is used to control whether to invoke the change. If so:
- it proceeds row by row of the constraint matrix
- it checks whether the same variables are nonzero
- if so, it computes the ratio
r
between the coefficients of the "leading" variables - if it is negative, add the constraint
(if
r
is negative, the constraint is not so obviously redundant)
- if the ratio of the lower & upper bounds disagrees with
r
, add the constraint - now check the ratio of the remaining variables in the current row; if it differs from
r
, add the constraint
This is meant to be a preliminary implementation rather than an efficient implementation; the best option is probably to visit the matrix & compare, but this would require going into each back end, which is probably a bad idea right now.
comment:3 Changed 11 years ago by
Cc: | Nathann Cohen added |
---|
comment:4 Changed 11 years ago by
Keywords: | sd32 added |
---|
comment:5 follow-ups: 6 9 Changed 11 years ago by
Status: | needs_review → needs_info |
---|
Hello John !
I was thinking again about what I told you by email : in order to simplify constraints more efficiently, one could add the constraints to a Set object when adding it to the solvers, and add a new constraint only when it is not already included in the set. The search is then much more efficient, and handled by Python. If you also want to test for constraints which may be formaly different (i.e. the original constraint multiplied by two), you but have to insert the "normalized" constraint in the set instead of the one which is given.
This would mean doubling in memory the size of the stored constraints, as it would mean that the LP constraint's are remembered both by the backend and by a Set object in the MILP class, but this is also what is happening right now when returning the list of all constraints. This could also be rewritten as an interator to avoid the cost of memory, then again have no idea how large your LP can be, but perhaps if you have many constraints this Set trick would help you much more.
In this case, the simplify_constraints would be a parameter of the MixedntegerLinearProgram? init method, so that the Set object is created and maintained from beginning to end when the user asks the constraints to be simplified.
Well, my two cents :-)
athann
comment:6 follow-up: 7 Changed 11 years ago by
Replying to ncohen:
If you also want to test for constraints which may be formaly different (i.e. the original constraint multiplied by two), you but have to insert the "normalized" constraint in the set instead of the one which is given.
I had thought of this, too. I think the only reason I didn't do it at the time is a law I've seen attributed to David Knuth: premature optimization is the root of all evil. I can certainly try it.
This could also be rewritten as an interator to avoid the cost of memory...
I'm not quite following how to do this.
then again have no idea how large your LP can be...
Hard data: a problem that exhausted memory on a 4GB machine a month ago computed yesterday in about 40MB.
In this case, the simplify_constraints would be a parameter of the MixedntegerLinearProgram? init method, so that the Set object is created and maintained from beginning to end when the user asks the constraints to be simplified.
I'll try that, thanks.
comment:7 follow-up: 8 Changed 11 years ago by
This could also be rewritten as an interator to avoid the cost of memory...
I'm not quite following how to do this.
Oh. For the moment your patch returns a list of tuples, representing the constraints. Instead of that (which requires to build the whole list first, then to return it, thus doubling the memory cost), python can return the list element by element using the "yield" keyword. This would avoid doubling the memory, especially if you just want to check something for each constraint returned.
comment:8 Changed 11 years ago by
Replying to ncohen:
This could also be rewritten as an interator to avoid the cost of memory...
I'm not quite following how to do this.
Oh. For the moment your patch returns a list of tuples, representing the constraints. Instead of that (which requires to build the whole list first, then to return it, thus doubling the memory cost), python can return the list element by element using the "yield" keyword. This would avoid doubling the memory, especially if you just want to check something for each constraint returned.
Are you thinking of #11607 here? This patch only affects the return value in one place; it returns None
, and then only if the constraint is redundant.
comment:9 Changed 11 years ago by
Replying to ncohen:
I was thinking again about what I told you by email : in order to simplify constraints more efficiently, one could add the constraints to a Set object when adding it to the solvers, and add a new constraint only when it is not already included in the set. The search is then much more efficient, and handled by Python. If you also want to test for constraints which may be formaly different (i.e. the original constraint multiplied by two), you but have to insert the "normalized" constraint in the set instead of the one which is given.
I have a working implementation along these lines, but there's one difficulty. In order to normalize the constraint, the code divides by the coefficient of the variable with the smallest index. For example, given the constraint 2*x_1-2*x_0<=2
, the code converts this to x_1-x_0<=1
.
In order to do this, I want to use the min()
command to determine the smallest non-zero key, determine its coefficient, and divide:
i = min(f.keys()) c = f[i] C = [(v,coeff/c) for (v,coeff) in f.iteritems() if v != -1]
The problem with this is that min
is a parameter to the function, so this provokes an error.
I can rename the parameters as min_value
and max_value
, but this would break people's existing code. Alternately, I can define self._min = min
in __init__
, and call self._min()
where I would ordinarily call min()
.
I am attaching a patch that uses the latter approach, but if you have suggestions for a better solution, I'm all ears. It wouldn't surprise me if there is one.
BTW, does it matter if I use Set
or set
? You wrote the former, but currently I'm using the latter.
comment:10 Changed 11 years ago by
Description: | modified (diff) |
---|
comment:11 Changed 11 years ago by
Status: | needs_info → needs_review |
---|
More rigorous testing showed that I had overlooked an issue with the keys (variables whose coefficients are zero). Fixed in the revision of the patch.
(Just noticed: it helps if one executes hg qrefresh
before hg export tip
...)
Changed 11 years ago by
Attachment: | trac_11606_simplify_with_sets.patch added |
---|
comment:12 Changed 11 years ago by
Status: | needs_review → needs_work |
---|
Hello John ! :-)
I really didn't think at first this method would need to use either min or max later on :-)
Hopefully, this trick should work :
from __builtin__ import min as min_function
Could you also add in the explanation of the check_redundant parameter what is a "redundant" constraint ? Something like "(two constraints are redundant when one can be obtained from the other scalar by multiplication)" or something of the kind ? :-)
Oh, and it looks like there are some missing "::" before the beginning of Python code in the doctests, so those will not be properly formatted when building the documentation.
sage -docbuild reference html
Two other remarks :
- Set is a Sage version of set, the latter being a purely Python thing. The interest of Set is that it is hashable and immutable, while the "set" can be modified. When you do not need the elements to be immutable, I'm told the best is to use set. And this I was probably told after writing this part
:-)
- I thought that
min( x for x in range(3) if x != 0 )
would be faster thanmin([ x for x in range(3) if x != 0 ])
as you do not create the list first, but it looks like the iterators have a cost too :
sage: %timeit min( x for x in range(10000) if x != 0 ) 125 loops, best of 3: 5.51 ms per loop sage: %timeit min([ x for x in range(10000) if x != 0 ]) 125 loops, best of 3: 4.56 ms per loop sage: %timeit min(x for x in range(10) if x != 0) 625 loops, best of 3: 7.41 µs per loop sage: %timeit min([ x for x in range(10) if x != 0 ]) 625 loops, best of 3: 6.81 µs per loop
Nathann
comment:13 Changed 11 years ago by
Description: | modified (diff) |
---|
Alright! The new patch imports min
via __builtin__
, and fixes those issues with the documentation. The doctests take about twice as long to run on my machine now. :-(
Regarding the iterators: now I understand what you mean. Actually, if memory serves (and it may deceive -- not sure) I think I tried that when defining C
, but Cython defied me, claiming that generators weren't acceptable. That's when I noticed that you had placed you definition of C
in a list. So, I didn't think to use it in the min_function
(maybe I even tried).
comment:14 Changed 11 years ago by
Status: | needs_work → needs_review |
---|
comment:15 Changed 11 years ago by
By the way, when reading the section on docstring markup in Conventions for Coding in Sage, I noticed this:
Warning: Functions whose names start with an underscore do not currently appear in the reference manual, so avoid putting crucial documentation in their docstrings. In particular, if you are defining a class, you might put a long informative docstring after the class definition, not for the
__init__
method.
Should we move that documentation out of __init__
?
(You have to scroll fairly far down to find the warning, or else do a search.)
comment:16 Changed 11 years ago by
Forgot about copying -- we would want to copy the flag for checking redundancy, as well as the constraints.
comment:18 Changed 11 years ago by
Status: | needs_review → needs_work |
---|
Something isn't working correctly with the normalization, sometimes. I'm trying to figure out what changed that is causing the issue I'm seeing...
comment:19 Changed 11 years ago by
Status: | needs_work → needs_info |
---|
As I suspected, it's a typing problem. Normalization performs a division, and 1/2
simplifies to 0
if 1
is an int
. If 1
is an Integer
or a Rational
, then 1/2
simplifies to a Rational
.
This strikes me as a matter best left to the client: the normalization shouldn't choose the type of the coefficients of the constraints; it's not even clear we can guess this a priori. So this isn't a bug after all.
But, I'd like someone else's opinion.
comment:20 Changed 11 years ago by
Description: | modified (diff) |
---|
I just noticed that the patch included depends on the patch for #11588 (copying a linear program), and changed the description to reflect this.
comment:21 Changed 11 years ago by
Description: | modified (diff) |
---|
Yep, depends on #11607, too. That can be removed, and I'll do so in the near future.
comment:22 follow-up: 23 Changed 11 years ago by
but Cython defied me, claiming that generators weren't acceptable.
Oops, that's right. The current version of Cython doesn't like generators, this feature should be included in the next Cython release, and I assure you that we have been waiting for generators in Cython for a loooooooong time here. There are patches I haven't written yet \ waiting for that, others that contain as a comment "Change this as soon as we have iterators".. Sigh :-)
Should we move that documentation out of init?
Well, the current documentation of the init method of the MixedIntegerLinearProgram? class also appears just before, in the section's documentation (and that appears in the doc). But you are right : it's probably best to remove it just to avoid later mistakes.
Actually, it should just contain the line "Constructor" and the doctests. The command
sage -coverage file.pyx
is a statistic telling you whether all the functions have documentation and tests. It's nice to have around, but that's also why this init function should have a minimal set of doctests -- so that the output of this software doesn't report that some doc is missing :-)
This strikes me as a matter best left to the client: the normalization shouldn't choose the type of the coefficients of the constraints
You are right in general, though in this special situation the answer has been taken from us. When a constraint is added, it is forwarded to the solver backend which is the only place where it is saved. Now, all of CPLEX/GLPK/Coin only accept floats as data. As mip.pyx is\
a Cython file, the best is probably to deal with float variables. By the way, why do you create so many lists ? Instead of
i = min_function([v for (v,coeff) in f.iteritems() if coeff != 0]) c = f[i]
What about
cdef int i = 0 cdef float c for i, c in f.iteritems(): if c!= 0: break
Sorry about the delay again. I'm getting back to the point where I can deal with tasks on-the-fly, and not with the ones that should have been settled the week before :-)
Nathann
comment:23 follow-up: 24 Changed 11 years ago by
Replying to ncohen:
Well, the current documentation of the init method of the MixedIntegerLinearProgram? class also appears just before, in the section's documentation (and that appears in the doc). But you are right : it's probably best to remove it just to avoid later mistakes.
Actually, it should just contain the line "Constructor" and the doctests.
Will do.
This strikes me as a matter best left to the client: the normalization shouldn't choose the type of the coefficients of the constraints
You are right in general, though in this special situation the answer has been taken from us. When a constraint is added, it is forwarded to the solver backend which is the only place where it is saved. Now, all of CPLEX/GLPK/Coin only accept floats as data. As mip.pyx is a Cython file, the best is probably to deal with float variables.
Are you saying I should coerce everything to floats, or leave it be? I thought to leave it be in case some future back end accepted rationals; the backends can do their own coercion.
By the way, why do you create so many lists ?
You seem to think I know something about proper P/Cython programming. :-)
Thanks for the suggestion; that will help speed things up a bit, though I think my real problem lies elsewhere.
Sorry about the delay again. I'm getting back to the point where I can deal with tasks on-the-fly, and not with the ones that should have been settled the week before :-)
No problem. I'm in a bit of a fix myself, so it may take a few days before another patch is ready, anyway.
By the way, do you know what those colored balls next to "opened: ... weeks ago" mean?
comment:24 follow-up: 25 Changed 11 years ago by
Are you saying I should coerce everything to floats, or leave it be? I thought to leave it be in case some future back end accepted rationals; the backends can do their own coercion.
Well, I think the doubles wouldn't hurt. And I guess it would be the same for rationals if it gets implemented. The constraints in the LP would not change, all that matters is that equivalent constraints get coerced to the same values :-)
that will help speed things up a bit, though I think my real problem lies elsewhere.
Yep, unfortunately :-)
By the way, do you know what those colored balls next to "opened: ... weeks ago" mean?
Oh. Did you try clicking on them ? The patches uploaded on a trac ticket are automatically applied and tested on a Sage install, and this ball prints the status of this operation. Browse the other tickets waiting for review and see how it changes ;-)
Nathann
comment:25 Changed 11 years ago by
Replying to ncohen:
Oh. Did you try clicking on them ?
I did, and I don't seem any status of an automatic patch and test. It seems only to summarize the data. Then there's a long horizontal line & white space.
The only thing I've noticed is that different patches sometimes have different colors.
comment:27 Changed 11 years ago by
comment:28 Changed 11 years ago by
(On this ticket the circle is yellow and you will have more complete information if yo click on it)
comment:29 Changed 11 years ago by
Status: | needs_info → needs_review |
---|
The new patch reflects changes described above.
comment:30 Changed 11 years ago by
The only change was to rebase the patch against the revised #11607.
comment:32 Changed 11 years ago by
Replying to john_perry:
*bump* any chance of a review?
Rightrightrightright. You did well :-D
Nathann
comment:33 Changed 11 years ago by
Status: | needs_review → needs_work |
---|
Hellooooooo again !
I have less than 30 minutes of battery left but it should be ok :-D
So. First, and before anything else, this patch needs to be rebased on the current version of Sage. This patch defines two methods (constraints
and
number of contraints
) which have been added by another patch already, so Sage refuses to run when this patch is applied.
Short of this, the patch looks good, so once this is settled it should be good to go :-)
Nathann
Changed 11 years ago by
Attachment: | trac_11606_add_only_new_constraints_to_lp_using_sets.patch added |
---|
comment:34 Changed 11 years ago by
Status: | needs_work → needs_review |
---|
I have rebased the patch against sage-4.8.alpha4
. This patch no longer introduces the function constraints()
that existed in the previous patch, and that caused the conflict. It does not provoke any doctest failures on my version of sage-4.8.alpha4
.
However, I did notice that three doctest failures currently occur with sage-4.8.alpha4
:
File "/Applications/sage-4.8.alpha4/devel/sage-main/sage/numerical/mip.pyx", line 523: sage: p.show() Expected: ... File "/Applications/sage-4.8.alpha4/devel/sage-main/sage/numerical/mip.pyx", line 685: sage: p.write_lp(SAGE_TMP+"/lp_problem.lp") Exception raised: ... File "/Applications/sage-4.8.alpha4/devel/sage-main/sage/numerical/mip.pyx", line 322: sage: p Expected: ...
Since these exist in an unpatched sage-4.8.alpha4
, they have no bearing on this ticket, but is this a known issue? I'll open a new ticket if not. My machine is a MacBook? Pro running OSX 10.6.8.
comment:35 Changed 11 years ago by
Authors: | john_perry → John Perry |
---|---|
Reviewers: | → Nathann Cohen |
Status: | needs_review → positive_review |
Positive review to this patch, which passes all tests ! :-)
John, I do not see the errors you mentionned on my alpha4. The fact that the write_lp and p.show return errors lead me to think that you have CBC installed. Is this true ? :-)
Through OsiCbcSolverInterface?, I did not find how to write those files and some "names" in the structure of a LP were not available (like the "name of the lp", or the "name of the objective function"), and because of that I did not enable the names in the CBc interface. Perhaps this is one of the things that would be fixed by a real interface with Cbc. But anyway, I guess those bugs only appear on your install because of Cbc. They can also be "fixed" by adding "solver= GLPK"in the doctests above.
Thank you for this patch ! :-)
Nathann
comment:36 Changed 11 years ago by
Milestone: | sage-4.8 → sage-5.0 |
---|
comment:37 follow-up: 38 Changed 11 years ago by
Weird. I saw jdemeyer's change yesterday, but not ncohen's change two days ago.
John, I do not see the errors you mentionned on my alpha4. The fact that the write_lp and p.show return errors lead me to think that you have CBC installed. Is this true ? :-)
Bingo!
As I see, Cbc is a real problem for us, even though it seems good for me personally... then again, maybe not if I look at it long enough :-(
comment:38 Changed 11 years ago by
As I see, Cbc is a real problem for us, even though it seems good for me personally... then again, maybe not if I look at it long enough :-(
Well, if you find out how to solve a MIP using Coin's library without using OsiCbcSolverInterface?, just let me know. This solver is driving me mad :-p
Nathann
comment:39 Changed 11 years ago by
Merged in: | → sage-5.0.beta1 |
---|---|
Resolution: | → fixed |
Status: | positive_review → closed |
comment:40 Changed 11 years ago by
Apply trac_11606_add_only_new_constraints_to_lp_using_sets.patch
(for legacy patchbots running 4.8)
To emphasize the need for this, in a sample application I'm working on, the number of constraints is decreased by an order of magnitude (93 vs. 787); in larger systems the different would likely be much larger.