id,summary,reporter,owner,description,type,status,priority,milestone,component,resolution,keywords,cc,merged,author,reviewer,upstream,work_issues,branch,commit,dependencies,stopgaps
18055,Improve lookahead in IntegerListsLex,nthiery,,"`IntegerListsLex` uses a depth first search in a prefix tree, with lookahead to prune branches.
The purpose of this ticket is to improve the lookahead algorithm in two directions:
- Detect more (all?) branches that can be cut.
- Better time complexity; this probably will involve identifying important invariants and caching them.
Example:: even just for n=1000 this is taking a long time:
{{{
sage: n = 100
sage: IntegerListsLex(binomial(n,2)-1, length=n, min_slope=1).list()
}}}
The following also take a long time, which should be optimized:
{{{
sage: IntegerListsLex(10^100, max_length=1).list()
sage: IntegerListsLex(499499, length=1000, min_slope=1).list()
}}}
One could hope that one could reach an average complexity roughly
linear in `O(log(max_n), max_length)` per list generated.
Concrete steps:
- If we want the floor function to satisfy the `max_slope` constraint,
we need to do ""reverse smoothing"", and the floor function then
depends on the interval `[0,...,k)` over which we want to consider
it. The same holds for the ceiling function if we want it to satisfy
the `min_slope` constraint. Let thus `f_k` and `c_k` be respectively
the floor and ceiling functions for the interval `[0,...k)`
Let's ignore the constraints on the length and on `n` for now.
Then, `f_k` and `c_k` are valid lists. Prove the following remark:
For any `n` between `sum(f_k)` and `sum(c_k)` there exists a valid
list of length `k` and sum `n`.
Putting back the length and sum constraints, there exists a valid
list if and only if the interval `I_k = [sum(f_k), sum(c_k)]`
intersects the interval `[min_n, max_n]` for some `k` between
`min_length` and `max_length`.
- Extend `Envelope` with a method `.sum(k)` which computes `sum(f_k)`.
It should have appropriate algorithmic and caching so that the
overall time complexity of computing `sum(f_k)` for all `k<=l` is
linear in `l`.
Trick: store, for each `j`, the smallest `i<=j` such that between
`i` and `j` the slope is exactly `max_slope`. The point is that
computing the sum of the parts of `f_k` between `i` and `j` is
constant time. Use that incrementally to derive quickly `sum(f_k)`
from `sum(f_j)` for some `j<=k`.
Suggestion for a faster development cycle: experiment with this by
extracting the current code for `Envelope` in a separate file.
- Use the above characterization and the `.sum(k)` methods to
implement a method `IntegerListsLex.is_empty` which terminates with
guaranteed result on non degenerate cases. This is essentially what
the current `lookahead` method does, but starting from `0` and with
guaranteed results thanks to the reverse smoothing.
Determine precisely the degenerate cases where `is_empty` may take
an arbitrarily long time (something like: the ceiling can be `0` for
an arbitrary long time, without a known `limit=0` and `limit_start`).
Decide on appropriate metrics to bound tightly the complexity in the
other cases (something around `max_length`, `max_n`, `limit_start`).
As a side product, we probably want to store for later usage the
information of which `n` admit a valid list, as a collection of non
overlapping intervals. Then, deciding whether there exists a valid
list of sum `n` for a given `n` can be achieved efficiently by
dichotomy.
- Assuming that `l` is a prefix of a valid list, find out how to
determine for which `i` `l+[i]` is still a prefix of a valid list.
Potential building block: the choice of a prefix imposes more
constraints on the floor and ceiling later on (our former `adapt`
method), and we want to keep track of that. Given a prefix `l`,
write `f_k(l)` be the list obtained by extending `l` to a list of
length `k` using the floor `f_k`, with appropriate forward
smoothing. Same for `c_k(l)`. Write `I_k(l) = [sum(f_k(l)),
sum(c_k(l))]`.
Now, `l` is a prefix of a valid list if one of the `I_k(l)`
intersect `[min_sum, max_sum]`. So, while increasing the length of
`l` we want to do something like incrementally adapting the sequence
of intervals `I_k(l)` to always make sure we stay within one of
them.
- Use this to reimplement `lookahead(self)` implementing the above,
with a guaranteed result and good complexity. And if at all possible
a constant time complexity when the constraints are simple
(e.g. `floor=0`, ...).
This method should support `_current_list=[]` as well.
- Update `next` accordingly (the logic should be slightly simpler
thanks to the handling of the empty prefix).
- If not implemented elsewhere: let the constructor accept a prefix as
input, to start the iteration from there. Derive a `next` method as
a side effect.
- Remove `integer_list_old` at the final stage.
",enhancement,new,major,sage-6.6,combinatorics,,,bgillespie,,,,N/A,,,,#18109,