Opened 10 years ago

Last modified 10 years ago

#11606 closed enhancement

simplify constraints in linear programs — at Version 20

Reported by: john_perry Owned by: ncohen
Priority: major Milestone: sage-5.0
Component: linear programming Keywords: sd32
Cc: ncohen Merged in:
Authors: john_perry Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:


Description (last modified by john_perry)

MixedIntegerLinearProgram doesn't notice when it is given constraints that already exist in the program, or that are constant multiples of existing ones. A simple example:

sage: lp = MixedIntegerLinearProgram()
sage: for each in xrange(10):
....:     lp.add_constraint(lp[0]-lp[1],min=1)
....:     
sage: lp.show()
Maximization:
 
Constraints:
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
  1.0 <= x_0 -x_1 
Variables:
  x_0 is a continuous variable (min=0.0, max=+oo)
  x_1 is a continuous variable (min=0.0, max=+oo)

Notice that the same constraint appears 10 different times.

Apply:

This patch depends on #11588.

Change History (22)

comment:1 Changed 10 years ago by john_perry

To emphasize the need for this: in a sample application I'm working on, the number of constraints is decreased by an order of magnitude (93 vs. 787); in larger systems the difference would likely be much larger.

Changed 10 years ago by john_perry

comment:2 Changed 10 years ago by john_perry

  • Authors set to john_perry
  • Status changed from new to needs_review

The attached file is a first attempt at solving the problem. An optional argument, check_redundant, which defaults to False, controls whether the check is invoked. If it is:

  • it proceeds row by row of the constraint matrix
  • it checks whether the same variables are nonzero
  • if so, it computes the ratio r between the coefficients of the "leading" variables
  • if it is negative, add the constraint

(if r is negative, the constraint is not so obviously redundant)

  • if the ratio of the lower & upper bounds disagrees with r, add the constraint
  • now check the ratio of the remaining variables in the current row; if it differs from r, add the constraint

This is meant to be a preliminary rather than an efficient implementation; the best option is probably to visit the matrix directly & compare, but this would require going into each backend, which is probably a bad idea right now.
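
For readers skimming the patch, here is a rough, self-contained sketch of that check; the function name and the (coeffs, lower, upper) layout are illustrative, not the patch's actual data structures:

def is_redundant(new_row, constraint_rows, tol=1e-12):
    # new_row and each element of constraint_rows are (coeffs, lower, upper),
    # where coeffs maps variable index -> nonzero coefficient and a bound of
    # None means "unbounded on that side".
    def close(a, b):
        if a is None or b is None:
            return a is None and b is None
        return abs(a - b) <= tol

    coeffs_new, lower_new, upper_new = new_row
    for coeffs, lower, upper in constraint_rows:
        if set(coeffs) != set(coeffs_new):
            continue                      # different variables are involved
        lead = min(coeffs_new)            # "leading" variable: smallest index
        r = coeffs_new[lead] / coeffs[lead]
        if r <= 0:
            continue                      # negative ratio: not obviously redundant
        if not (close(lower_new, None if lower is None else r * lower) and
                close(upper_new, None if upper is None else r * upper)):
            continue                      # bounds do not scale by the same ratio
        if all(close(coeffs_new[v], r * coeffs[v]) for v in coeffs_new):
            return True                   # the existing row times r: redundant
    return False

The idea is that add_constraint skips the new row whenever such a check returns True.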

comment:3 Changed 10 years ago by kcrisman

  • Cc ncohen added

comment:4 Changed 10 years ago by john_perry

  • Keywords sd32 added

comment:5 follow-ups: Changed 10 years ago by ncohen

  • Status changed from needs_review to needs_info

Hello John !

I was thinking again about what I told you by email: in order to simplify constraints more efficiently, one could add the constraints to a Set object when adding them to the solver, and add a new constraint only when it is not already included in the set. The search is then much more efficient, and handled by Python. If you also want to test for constraints which may be formally different (e.g. the original constraint multiplied by two), you only have to insert the "normalized" constraint in the set instead of the one which is given.

This would mean doubling the memory used for the stored constraints, as the LP constraints would be remembered both by the backend and by a Set object in the MILP class, but this is also what happens right now when returning the list of all constraints. This could also be rewritten as an iterator to avoid the memory cost; then again, I have no idea how large your LP can be, but perhaps if you have many constraints this Set trick would help you much more.

In this case, simplify_constraints would be a parameter of the MixedIntegerLinearProgram __init__ method, so that the Set object is created and maintained from beginning to end when the user asks for the constraints to be simplified.
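
For concreteness, a minimal sketch of this set-based approach, with hypothetical names (_constraints_seen, _normalize) that are not part of the existing class; coefficients are assumed to be stored as a dict mapping variable index to coefficient, and the real add_constraint signature is simplified:

class ConstraintDeduplication(object):
    def __init__(self, simplify_constraints=False):
        self._simplify_constraints = simplify_constraints
        self._constraints_seen = set()    # normalized constraints already added

    def _normalize(self, coeffs, lower, upper):
        # divide through by the coefficient of the lowest-indexed variable,
        # so that scalar multiples of the same constraint get the same key
        lead = min(coeffs)
        c = coeffs[lead]
        key = tuple(sorted((v, coeff / c) for v, coeff in coeffs.items()))
        lo = None if lower is None else lower / c
        up = None if upper is None else upper / c
        if c < 0:
            lo, up = up, lo               # dividing by a negative swaps the bounds
        return (key, lo, up)

    def add_constraint(self, coeffs, lower=None, upper=None):
        if self._simplify_constraints:
            key = self._normalize(coeffs, lower, upper)
            if key in self._constraints_seen:
                return                    # already present, up to scaling
            self._constraints_seen.add(key)
        # ... hand the constraint to the backend as before ...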

Well, my two cents :-)

Nathann

comment:6 in reply to: ↑ 5 ; follow-up: Changed 10 years ago by john_perry

Replying to ncohen:

If you also want to test for constraints which may be formally different (e.g. the original constraint multiplied by two), you only have to insert the "normalized" constraint in the set instead of the one which is given.

I had thought of this, too. I think the only reason I didn't do it at the time is a law I've seen attributed to Donald Knuth: premature optimization is the root of all evil. I can certainly try it.

This could also be rewritten as an iterator to avoid the memory cost...

I'm not quite following how to do this.

then again, I have no idea how large your LP can be...

Hard data: a problem that exhausted memory on a 4GB machine a month ago computed yesterday in about 40MB.

In this case, simplify_constraints would be a parameter of the MixedIntegerLinearProgram __init__ method, so that the Set object is created and maintained from beginning to end when the user asks for the constraints to be simplified.

I'll try that, thanks.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 10 years ago by ncohen

This could also be rewritten as an iterator to avoid the memory cost...

I'm not quite following how to do this.

Oh. For the moment your patch returns a list of tuples representing the constraints. Instead of that (which requires building the whole list first, then returning it, thus doubling the memory cost), Python can return the elements one by one using the "yield" keyword. This would avoid doubling the memory, especially if you just want to check something for each constraint returned.
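
A toy illustration of the difference (not the actual constraints() method):

def constraints_as_list(rows):
    # builds the whole list in memory before returning it
    return [(coeffs, lower, upper) for (coeffs, lower, upper) in rows]

def constraints_as_generator(rows):
    # yields one constraint at a time; nothing is accumulated
    for (coeffs, lower, upper) in rows:
        yield (coeffs, lower, upper)

# the caller loops over either one in exactly the same way:
# for coeffs, lower, upper in constraints_as_generator(rows):
#     ... check something about this constraint ...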

comment:8 in reply to: ↑ 7 Changed 10 years ago by john_perry

Replying to ncohen:

This could also be rewritten as an iterator to avoid the memory cost...

I'm not quite following how to do this.

Oh. For the moment your patch returns a list of tuples representing the constraints. Instead of that (which requires building the whole list first, then returning it, thus doubling the memory cost), Python can return the elements one by one using the "yield" keyword. This would avoid doubling the memory, especially if you just want to check something for each constraint returned.

Are you thinking of #11607 here? This patch only affects the return value in one place; it returns None, and then only if the constraint is redundant.

comment:9 in reply to: ↑ 5 Changed 10 years ago by john_perry

Replying to ncohen:

I was thinking again about what I told you by email: in order to simplify constraints more efficiently, one could add the constraints to a Set object when adding them to the solver, and add a new constraint only when it is not already included in the set. The search is then much more efficient, and handled by Python. If you also want to test for constraints which may be formally different (e.g. the original constraint multiplied by two), you only have to insert the "normalized" constraint in the set instead of the one which is given.

I have a working implementation along these lines, but there's one difficulty. In order to normalize the constraint, the code divides by the coefficient of the variable with the smallest index. For example, given the constraint 2*x_1-2*x_0<=2, the code converts this to x_1-x_0<=1.

In order to do this, I want to use min() to determine the smallest non-zero key, take its coefficient, and divide:

    i = min(f.keys())    # the smallest key in the coefficient dict
    c = f[i]             # its coefficient
    C = [(v,coeff/c) for (v,coeff) in f.iteritems() if v != -1]   # divide each coefficient by c, skipping key -1

The problem with this is that min is the name of a parameter to add_constraint, so inside that method min() refers to the argument rather than the built-in, and the call provokes an error.

I can rename the parameters to min_value and max_value, but this would break people's existing code. Alternatively, I can define self._min = min in __init__, and call self._min() where I would ordinarily call min().

I am attaching a patch that uses the latter approach, but if you have suggestions for a better solution, I'm all ears. It wouldn't surprise me if there is one.
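
For concreteness, the second workaround would look roughly like this (class and method bodies are heavily simplified, and the dict layout of f, with key -1 filtered out, follows the snippet above):

class MIP(object):
    def __init__(self):
        # grab a handle on the built-in before it can be shadowed
        self._min = min

    def add_constraint(self, f, min=None, max=None):
        # inside this method, `min` names the keyword argument, not the
        # built-in, so a bare min(f.keys()) would raise a TypeError;
        # use the saved handle instead
        i = self._min(v for v in f if v != -1)
        c = f[i]
        return [(v, coeff / c) for (v, coeff) in f.items() if v != -1]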

BTW, does it matter if I use Set or set? You wrote the former, but currently I'm using the latter.

comment:10 Changed 10 years ago by john_perry

  • Description modified (diff)

comment:11 Changed 10 years ago by john_perry

  • Status changed from needs_info to needs_review

More rigorous testing showed that I had overlooked an issue with the keys (variables whose coefficients are zero). Fixed in the revision of the patch.

(Just noticed: it helps if one executes hg qrefresh before hg export tip...)

Changed 10 years ago by john_perry

comment:12 Changed 10 years ago by ncohen

  • Status changed from needs_review to needs_work

Hello John ! :-)

I really didn't think at first that this method would need to use the built-in min or max later on :-)

Hopefully, this trick should work:

from __builtin__ import min as min_function

Could you also add, in the explanation of the check_redundant parameter, what a "redundant" constraint is? Something like "(two constraints are redundant when one can be obtained from the other by scalar multiplication)", or something of the kind? :-)

Oh, and it looks like there are some missing "::" before the beginning of Python code in the doctests, so those will not be properly formatted when building the documentation.

sage -docbuild reference html

Two other remarks :

  • Set is a Sage version of set, the latter being a purely Python thing. The interest of Set is that it is hashable and immutable, while a set can be modified. When you do not need the set to be immutable, I'm told it is best to use set. And this I was probably told after writing this part :-)
  • I thought that
    min( x for x in range(3) if x != 0 )
    would be faster than
    min([ x for x in range(3) if x != 0 ])
    as you do not create the list first, but it looks like the iterators have a cost too:

sage: %timeit min( x for x in range(10000) if x != 0 )
125 loops, best of 3: 5.51 ms per loop
sage: %timeit min([ x for x in range(10000) if x != 0 ])
125 loops, best of 3: 4.56 ms per loop
sage: %timeit min(x for x in range(10) if x != 0)
625 loops, best of 3: 7.41 µs per loop
sage: %timeit min([ x for x in range(10) if x != 0 ])
625 loops, best of 3: 6.81 µs per loop

Nathann

comment:13 Changed 10 years ago by john_perry

  • Description modified (diff)

Alright! The new patch imports min via __builtin__, and fixes those issues with the documentation. The doctests take about twice as long to run on my machine now. :-(

Regarding the iterators: now I understand what you mean. Actually, if memory serves (and it may deceive me -- I'm not sure), I think I tried that when defining C, but Cython defied me, claiming that generators weren't acceptable. That's when I noticed that you had placed your definition of C in a list. So I didn't think to use a generator in the min_function call (maybe I even tried).

comment:14 Changed 10 years ago by john_perry

  • Status changed from needs_work to needs_review

comment:15 Changed 10 years ago by john_perry

By the way, when reading the section on docstring markup in Conventions for Coding in Sage, I noticed this:

Warning: Functions whose names start with an underscore do not currently appear in the reference manual, so avoid putting crucial documentation in their docstrings. In particular, if you are defining a class, you might put a long informative docstring after the class definition, not in the __init__ method.

Should we move that documentation out of __init__?

(You have to scroll fairly far down to find the warning, or else do a search.)

comment:16 Changed 10 years ago by john_perry

Forgot about copying -- we would want to copy the flag for checking redundancy, as well as the constraints.

comment:17 Changed 10 years ago by john_perry

Forgot about copying sets, not just assigning pointers...
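
A tiny illustration of the aliasing issue (the attribute names are hypothetical; copy.copy makes a new, independent set):

import copy

class MIP(object):
    def __init__(self, check_redundant=False):
        self._check_redundant = check_redundant
        self._constraints_seen = set()

    def __copy__(self):
        other = MIP(self._check_redundant)
        # plain assignment (other._constraints_seen = self._constraints_seen)
        # would alias the same set object, so constraints added to the copy
        # would also be marked as "seen" in the original; copy it instead
        other._constraints_seen = copy.copy(self._constraints_seen)
        return other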

comment:18 Changed 10 years ago by john_perry

  • Status changed from needs_review to needs_work

Something isn't working correctly with the normalization, at least sometimes. I'm trying to figure out what changed to cause the issue I'm seeing...

comment:19 Changed 10 years ago by john_perry

  • Status changed from needs_work to needs_info

As I suspected, it's a typing problem. Normalization performs a division, and 1/2 simplifies to 0 if 1 is an int. If 1 is an Integer or a Rational, then 1/2 simplifies to a Rational.
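
A quick illustration, with output as it would appear under the Python-2-based Sage of that era:

sage: int(1)/int(2)            # plain Python 2 ints: integer division
0
sage: 1/2                      # the Sage preparser makes these Integers
1/2
sage: (Integer(1)/2).parent()
Rational Field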

This strikes me as a matter best left to the client: the normalization shouldn't choose the type of the coefficients of the constraints; it's not even clear we can guess this a priori. So this isn't a bug after all.

But, I'd like someone else's opinion.

comment:20 Changed 10 years ago by john_perry

  • Description modified (diff)

I just noticed that the attached patch depends on the patch for #11588 (copying a linear program), and I have changed the description to reflect this.
