Opened 4 years ago
Last modified 4 years ago
#18055 new enhancement
Improve lookahead in IntegerListsLex — at Version 10
Reported by: | nthiery | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.6 |
Component: | combinatorics | Keywords: | |
Cc: | bgillespie | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | #18109 | Stopgaps: |
Description (last modified by )
IntegerListsLex
uses a depth first search in a prefix tree, with lookahead to prune branches.
The purpose of this ticket is to improve the lookahead algorithm in two directions:
- Detect more (all?) branches that can be cut.
- Better time complexity; this probably will involve identifying important invariants and caching them.
Example:: even just for n=1000 this is taking a long time:
sage: n = 100 sage: IntegerListsLex(binomial(n,2)-1, length=n, min_slope=1).list()
The following also take a long time, which should be optimized:
sage: IntegerListsLex(10^100, max_length=1).list() sage: IntegerListsLex(499499, length=1000, min_slope=1).list()
One could hope that one could reach an average complexity roughly
linear in O(log(max_n), max_length)
per list generated.
Concrete steps:
- If we want the floor function to satisfy the
max_slope
constraint, we need to do "reverse smoothing", and the floor function then depends on the interval[0,...,k)
over which we want to consider it. The same holds for the ceiling function if we want it to satisfy themin_slope
constraint. Let thusf_k
andc_k
be respectively the floor and ceiling functions for the interval[0,...k)
Let's ignore the constraints on the length and on
n
for now. Then,f_k
andc_k
are valid lists. Prove the following remark:
For any
n
betweensum(f_k)
andsum(c_k)
there exists a valid list of lengthk
and sumn
.
Putting back the length and sum constraints, there exists a valid list if and only if the interval
I_k = [sum(f_k), sum(c_k)]
intersects the interval[min_n, max_n]
for somek
betweenmin_length
andmax_length
.
- Extend
Envelope
with a method.sum(k)
which computessum(f_k)
.
It should have appropriate algorithmic and caching so that the overall time complexity of computing
sum(f_k)
for allk<=l
is linear inl
.
Trick: store, for each
j
, the smallesti<=j
such that betweeni
andj
the slope is exactlymax_slope
. The point is that computing the sum of the parts off_k
betweeni
andj
is constant time. Use that incrementally to derive quicklysum(f_k)
fromsum(f_j)
for somej<=k
.
Suggestion for a faster development cycle: experiment with this by extracting the current code for
Envelope
in a separate file.
- Use the above characterization and the
.sum(k)
methods to implement a methodIntegerListsLex.is_empty
which terminates with guaranteed result on non degenerate cases. This is essentially what the currentlookahead
method does, but starting from0
and with guaranteed results thanks to the reverse smoothing.
Determine precisely the degenerate cases where
is_empty
may take an arbitrarily long time (something like: the ceiling can be0
for an arbitrary long time, without a knownlimit=0
andlimit_start
).
Decide on appropriate metrics to bound tightly the complexity in the other cases (something around
max_length
,max_n
,limit_start
).
As a side product, we probably want to store for later usage the information of which
n
admit a valid list, as a collection of non overlapping intervals. Then, deciding whether there exists a valid list of sumn
for a givenn
can be achieved efficiently by dichotomy.
- Assuming that
l
is a prefix of a valid list, find out how to determine for whichi
l+[i]
is still a prefix of a valid list.
Potential building block: the choice of a prefix imposes more constraints on the floor and ceiling later on (our former
adapt
method), and we want to keep track of that. Given a prefixl
, writef_k(l)
be the list obtained by extendingl
to a list of lengthk
using the floorf_k
, with appropriate forward smoothing. Same forc_k(l)
. Write `I_k(l) = [sum(f_k(l)), sum(c_k(l))]`.
Now,
l
is a prefix of a valid list if one of theI_k(l)
intersect[min_sum, max_sum]
. So, while increasing the length ofl
we want to do something like incrementally adapting the sequence of intervalsI_k(l)
to always make sure we stay within one of them.
- Use this to reimplement
lookahead(self)
implementing the above, with a guaranteed result and good complexity. And if at all possible a constant time complexity when the constraints are simple (e.g.floor=0
, ...).
This method should support
_current_list=[]
as well.
- Update
next
accordingly (the logic should be slightly simpler thanks to the handling of the empty prefix).
- If not implemented elsewhere: let the constructor accept a prefix as
input, to start the iteration from there. Derive a
next
method as a side effect.
- Remove
integer_list_old
at the final stage.
Change History (10)
comment:1 Changed 4 years ago by
- Description modified (diff)
comment:2 Changed 4 years ago by
- Description modified (diff)
comment:3 Changed 4 years ago by
- Description modified (diff)
comment:4 Changed 4 years ago by
- Description modified (diff)
comment:5 follow-up: ↓ 6 Changed 4 years ago by
- Description modified (diff)
comment:6 in reply to: ↑ 5 Changed 4 years ago by
Replying to jdemeyer:
Can we please stop talking about "degenerate cases" without specifying what that means?
This ticket involves some research, which includes in particular defining precisely what "degenerate" is. Until this research is done, we can't do anything but remain vague.
comment:7 Changed 4 years ago by
- Description modified (diff)
comment:8 Changed 4 years ago by
- Dependencies set to #18181
comment:9 Changed 4 years ago by
- Dependencies changed from #18181 to #18109
comment:10 Changed 4 years ago by
- Description modified (diff)
Can we please stop talking about "degenerate cases" without specifying what that means?