Opened 6 years ago

Closed 6 years ago

#20008 closed enhancement (fixed)

Implement non-recursive iterator for compositions

Reported by: tscrim Owned by: sage-combinat
Priority: major Milestone: sage-7.1
Component: combinatorics Keywords: iterator
Cc: sage-combinat, darij, chapoton, nthiery, vdelecroix Merged in:
Authors: Travis Scrimshaw Reviewers: Darij Grinberg
Report Upstream: N/A Work issues:
Branch: c6ba3e7 (Commits, GitHub, GitLab) Commit: c6ba3e7f4d3414877fdce12128d36b40bd5e7834
Dependencies: Stopgaps:

Status badges

Description

Compositions currently uses a recursive iterator, which also has extra overhead of converting between the element classes.

Change History (12)

comment:1 Changed 6 years ago by tscrim

  • Branch set to public/combinat/composition_iterator-20008
  • Commit set to 2505dbcaf6b6711b3ef43c3d7e577bcd162d8447
  • Status changed from new to needs_review

comment:2 Changed 6 years ago by git

  • Commit changed from 2505dbcaf6b6711b3ef43c3d7e577bcd162d8447 to a796f0eb320c7955d2007486fefd2407a9bb7345

Branch pushed to git repo; I updated commit sha1. New commits:

a796f0eImplement non-recursive iterator for compositions.

comment:3 Changed 6 years ago by tscrim

Some timings with the current branch:

sage: C = Compositions(5)
sage: %timeit for x in C: pass
The slowest run took 8.30 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 34.9 µs per loop
sage: C = Compositions(10)
sage: %timeit for x in C: pass
1000 loops, best of 3: 1.1 ms per loop
sage: C = Compositions(20)
sage: %timeit for x in C: pass
1 loops, best of 3: 1.18 s per loop

versus the old version:

sage: C = Compositions(5)
sage: %timeit for x in C: pass
The slowest run took 10.67 times longer than the fastest. This could mean that an intermediate result is being cached 
1000 loops, best of 3: 187 µs per loop
sage: C = Compositions(10)
sage: %timeit for x in C: pass
100 loops, best of 3: 9.11 ms per loop
sage: C = Compositions(20)
sage: %timeit for x in C: pass
1 loops, best of 3: 15.7 s per loop

Also, the old version instantly fails for large n because of the recursion:

sage: C = Compositions(1000)
sage: for x in C: pass
...
RuntimeError: maximum recursion depth exceeded in cmp

comment:4 Changed 6 years ago by tscrim

  • Cc chapoton nthiery added

comment:5 follow-up: Changed 6 years ago by darij

Grammar error in doc.

More importantly, though, I'm not sure I understand the algorithm. Why does it not decrement s in the s > n case? What is the logic behind the algorithm anyway?

comment:6 in reply to: ↑ 5 ; follow-up: Changed 6 years ago by tscrim

Replying to darij:

Grammar error in doc.

More importantly, though, I'm not sure I understand the algorithm. Why does it not decrement s in the s > n case?

Actually, that case never happens unless n < 0. So that could be removed.

What is the logic behind the algorithm anyway?

The idea is to find the lex smallest composition by adding a box each step. If we have a composition of n, then yield and remove the last column and add a box to the new final column. Otherwise add a box to the end. Hmmm....implementing it directly from that would probably be a little cleaner since cur would always be a valid partition. Let me change that now.

comment:7 Changed 6 years ago by git

  • Commit changed from a796f0eb320c7955d2007486fefd2407a9bb7345 to c6ba3e7f4d3414877fdce12128d36b40bd5e7834

Branch pushed to git repo; I updated commit sha1. New commits:

c6ba3e7Some cleanup to the composition iterator.

comment:8 in reply to: ↑ 6 Changed 6 years ago by tscrim

Replying to tscrim:

Replying to darij:

Grammar error in doc.

How about this?

What is the logic behind the algorithm anyway?

The idea is to find the lex smallest composition by adding a box each step. If we have a composition of n, then yield and remove the last column and add a box to the new final column. Otherwise add a box to the end. Hmmm....implementing it directly from that would probably be a little cleaner since cur would always be a valid partition. Let me change that now.

Actually, now I remember why I did it how it is, so I wouldn't have to check that cur is non-empty when I did my backtracking.

comment:9 Changed 6 years ago by darij

  • Reviewers set to Darij Grinberg

LGTM! This is actually a nice lexicographic iterator over all compositions of size \leq n.

If you doctest sage/combinat (this iterator is used a lot, and who knows what could happen), feel free to set this to positive review.

comment:10 Changed 6 years ago by nthiery

  • Cc vdelecroix added

Hi!

Thanks Travis; the timing is indeed way better.

Vincent, what do you think? This looks like a step in our general direction of splitting:

  • a lowlevel library of super-fast itertools style iterators producing plain data structures
  • higher level layer views, with all the Parent, Element goodies.

I haven't followed much the progress in this direction. But I am wondering whether we want to go further (here or in a later ticket) by moving the iterator in a separate Cython file possibly using C-level data structure like vector<int> or ClonableIntArray?.

The other thing is that the algorithm here is exactly that of IntegerListLex?, with in principle, the same complexity. Of course the later has overhead since it needs to handle all the additional constraints. Yet it would reduce code bloat to focus on optimizing/cythonizing IntegerListLex? rather than having a bunch of non cythonized (by lack of manpower) python methods for special cases.

Cheers,

comment:11 Changed 6 years ago by tscrim

  • Status changed from needs_review to positive_review

I'm wondering how much we could get by going one step further and moving IntegerListsLex to a small standalone C/C++ library, where we can really get to low-level optimizations. Although I see some technical challenges with this; in particular, taking functions as input and the lambda functions involved with the current implementation. Perhaps Cython will be sufficient.

I agree that this would further benefit from cythonization/C-ifying. Right now we don't have a set place upon where we can put these functions. I also do not see them as code bloat because to really optimize IntegerListsLex, we will almost certainly have specialized implementations for special cases to avoid overhead of the general case. (This implementation can be easily tweaked to cover a number of other cases, but would likely have bad performance in general.)

However, I think we should do all of these things on a separate ticket(s), which is why I'm setting this to a positive review given Darij's comment and the green patchbot. Feel free to set it back if you disagree. Yet +1 to adding a common place/library for all of these iterator functions (and their cython/C versions).

comment:12 Changed 6 years ago by vbraun

  • Branch changed from public/combinat/composition_iterator-20008 to c6ba3e7f4d3414877fdce12128d36b40bd5e7834
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.