Opened 4 years ago

# Improve lookahead in IntegerListsLex — at Version 10

Reported by: Owned by: nthiery major sage-6.6 combinatorics bgillespie N/A #18109

`IntegerListsLex` uses a depth first search in a prefix tree, with lookahead to prune branches. The purpose of this ticket is to improve the lookahead algorithm in two directions:

• Detect more (all?) branches that can be cut.
• Better time complexity; this probably will involve identifying important invariants and caching them.

Example:: even just for n=1000 this is taking a long time:

```    sage: n = 100
sage: IntegerListsLex(binomial(n,2)-1, length=n, min_slope=1).list()
```

The following also take a long time, which should be optimized:

```    sage: IntegerListsLex(10^100, max_length=1).list()
sage: IntegerListsLex(499499, length=1000, min_slope=1).list()
```

One could hope that one could reach an average complexity roughly linear in `O(log(max_n), max_length)` per list generated.

Concrete steps:

• If we want the floor function to satisfy the `max_slope` constraint, we need to do "reverse smoothing", and the floor function then depends on the interval `[0,...,k)` over which we want to consider it. The same holds for the ceiling function if we want it to satisfy the `min_slope` constraint. Let thus `f_k` and `c_k` be respectively the floor and ceiling functions for the interval `[0,...k)`

Let's ignore the constraints on the length and on `n` for now. Then, `f_k` and `c_k` are valid lists. Prove the following remark:

For any `n` between `sum(f_k)` and `sum(c_k)` there exists a valid list of length `k` and sum `n`.

Putting back the length and sum constraints, there exists a valid list if and only if the interval `I_k = [sum(f_k), sum(c_k)]` intersects the interval `[min_n, max_n]` for some `k` between `min_length` and `max_length`.

• Extend `Envelope` with a method `.sum(k)` which computes `sum(f_k)`.

It should have appropriate algorithmic and caching so that the overall time complexity of computing `sum(f_k)` for all `k<=l` is linear in `l`.

Trick: store, for each `j`, the smallest `i<=j` such that between `i` and `j` the slope is exactly `max_slope`. The point is that computing the sum of the parts of `f_k` between `i` and `j` is constant time. Use that incrementally to derive quickly `sum(f_k)` from `sum(f_j)` for some `j<=k`.

Suggestion for a faster development cycle: experiment with this by extracting the current code for `Envelope` in a separate file.

• Use the above characterization and the `.sum(k)` methods to implement a method `IntegerListsLex.is_empty` which terminates with guaranteed result on non degenerate cases. This is essentially what the current `lookahead` method does, but starting from `0` and with guaranteed results thanks to the reverse smoothing.

Determine precisely the degenerate cases where `is_empty` may take an arbitrarily long time (something like: the ceiling can be `0` for an arbitrary long time, without a known `limit=0` and `limit_start`).

Decide on appropriate metrics to bound tightly the complexity in the other cases (something around `max_length`, `max_n`, `limit_start`).

As a side product, we probably want to store for later usage the information of which `n` admit a valid list, as a collection of non overlapping intervals. Then, deciding whether there exists a valid list of sum `n` for a given `n` can be achieved efficiently by dichotomy.

• Assuming that `l` is a prefix of a valid list, find out how to determine for which `i` `l+[i]` is still a prefix of a valid list.

Potential building block: the choice of a prefix imposes more constraints on the floor and ceiling later on (our former `adapt` method), and we want to keep track of that. Given a prefix `l`, write `f_k(l)` be the list obtained by extending `l` to a list of length `k` using the floor `f_k`, with appropriate forward smoothing. Same for `c_k(l)`. Write `I_k(l) = [sum(f_k(l)), sum(c_k(l))]`.

Now, `l` is a prefix of a valid list if one of the `I_k(l)` intersect `[min_sum, max_sum]`. So, while increasing the length of `l` we want to do something like incrementally adapting the sequence of intervals `I_k(l)` to always make sure we stay within one of them.

• Use this to reimplement `lookahead(self)` implementing the above, with a guaranteed result and good complexity. And if at all possible a constant time complexity when the constraints are simple (e.g. `floor=0`, ...).

This method should support `_current_list=[]` as well.

• Update `next` accordingly (the logic should be slightly simpler thanks to the handling of the empty prefix).
• If not implemented elsewhere: let the constructor accept a prefix as input, to start the iteration from there. Derive a `next` method as a side effect.
• Remove `integer_list_old` at the final stage.

### comment:1 Changed 4 years ago by aschilling

• Description modified (diff)

### comment:2 Changed 4 years ago by aschilling

• Description modified (diff)

### comment:3 Changed 4 years ago by nthiery

• Description modified (diff)

### comment:4 Changed 4 years ago by nthiery

• Description modified (diff)

### comment:5 follow-up: ↓ 6 Changed 4 years ago by jdemeyer

• Description modified (diff)

Can we please stop talking about "degenerate cases" without specifying what that means?

### comment:6 in reply to: ↑ 5 Changed 4 years ago by nthiery

Can we please stop talking about "degenerate cases" without specifying what that means?

This ticket involves some research, which includes in particular defining precisely what "degenerate" is. Until this research is done, we can't do anything but remain vague.

### comment:7 Changed 4 years ago by nthiery

• Description modified (diff)

### comment:8 Changed 4 years ago by jdemeyer

• Dependencies set to #18181

### comment:9 Changed 4 years ago by jdemeyer

• Dependencies changed from #18181 to #18109

### comment:10 Changed 4 years ago by nthiery

• Description modified (diff)
Note: See TracTickets for help on using tickets.