Ticket #10534 (needs_review enhancement)
Optimizations for the generation of subwords, subsets and set partitions
| Reported by: | vdelecroix | Owned by: | vdelecroix |
|---|---|---|---|
| Priority: | major | Milestone: | sage-5.10 |
| Component: | combinatorics | Keywords: | generation, combinatorics, set, subset, partition |
| Cc: | sage-combinat | Work issues: | |
| Report Upstream: | N/A | Reviewers: | Florent Hivert |
| Authors: | Vincent Delecroix | Merged in: | |
| Dependencies: | Stopgaps: |
Description (last modified by chapoton) (diff)
This ticket stands for a faster method of exhaustive generation of combinatorial object that may be found in sage.combinat.* and more precisely
- subwords (sage.combinat.subword)
- subsets (sage.combinat.subset)
- set partitions (sage.combinat.set_partitions and sage.combinat.set_partition_ordered)
Most of the time, the improvment just consists in calling the python library itertools. For other, it consists in creating a lower level Python iterator which does not use the heavy mechanism of Parent/Element? (which is not intended for the frontend user).
Moreover, the patch provides
- bugs and typos corrections (doctest coverage is now 100%)
- random generation capabilities
- modified cardinality method which always return a Sage Integer (and no more Python int or Sage Rational)
Some timings (on Intel(R) Core(TM)2 Duo CPU T8100 @ 2.10GHz)
Old version
sage: S = SetPartitions(8,[3,3,2])
sage: timeit('for p in S: pass')
5 loops, best of 3: 401 ms per loop
sage: S = SetPartitions(9,[3,3,2,1])
sage: timeit('for p in S: pass')
5 loops, best of 3: 4 s per loop
sage: S = Subsets([1]*1+[2]*2+[3]*3+[4]*4+[5]*5, submultiset=True)
sage: timeit('S.cardinality()')
5 loops, best of 3: 712 ms per loop
New version
sage: S = SetPartitions(8,[3,3,2])
sage: timeit('for p in S: pass')
5 loops, best of 3: 130 ms per loop
sage: timeit('for p in S._fast_iterator(): pass')
25 loops, best of 3: 10.4 ms per loop
sage: S = SetPartitions(9,[3,3,2,1])
sage: timeit('for p in S: pass')
5 loops, best of 3: 1.43 s per loop
sage: timeit('for p in S._fast_iterator(): pass')
5 loops, best of 3: 91.3 ms per loop
sage: S = Subsets([1]*1+[2]*2+[3]*3+[4]*4+[5]*5,submultiset=True)
sage: timeit('S.cardinality()')
625 loops, best of 3: 33.9 µs per loop
Apply only:
- trac_10534-generation_of_subsets_and_set_partitions.3.patch
Attachments
Change History
comment:1 Changed 2 years ago by vdelecroix
- Description modified (diff)
- Summary changed from Generation of subsets and partitions enumerated sets to Generation of subwords, subsets and set partitions
comment:2 Changed 2 years ago by vdelecroix
- Status changed from new to needs_review
- Description modified (diff)
comment:4 Changed 2 years ago by vdelecroix
- Status changed from needs_review to needs_work
- Description modified (diff)
comment:9 follow-up: ↓ 10 Changed 2 years ago by hivert
- Reviewers set to Florent Hivert
- Authors changed from vdelecroix to Vincent Delecroix
Hi Vincent,
Your patch looks good except a few remarks:
- the field Author below is for credits. You should put your real name and not your login.
- the documentation now says:
[...] The elements of the output are tuples of Python int (and not Sage Integer). [...] It's element are returned as plain list of python int.
At least one of those two sentences is wrong. Also they seems a little redundent. This should be fixed.
I'm sorry for not having the time to put a review patch, nor finish the review now.
comment:10 in reply to: ↑ 9 Changed 2 years ago by vdelecroix
- the field Author below is for credits. You should put your real name and not your login.
Thank you for modifying this.
- the documentation now says:
[...] The elements of the output are tuples of Python int (and not Sage Integer). [...] It's element are returned as plain list of python int.
That's right. Output value are now tuples and it is fully specified.
Changed 2 years ago by vdelecroix
-
attachment
trac_10534-generation_of_subsets_and_set_partitions.patch
added
comment:11 follow-up: ↓ 12 Changed 2 years ago by hivert
Hi Vincent,
I need more time to review your long patch. I just notice the following:
file: sage/combinat/choose_nk.py 85 sage: [0,2] in c52 # trac XXXX
You probably want to replace XXXX by an actual number. I'll continue my review tomorrow (hopefully)
Florent
comment:12 in reply to: ↑ 11 Changed 2 years ago by vdelecroix
Hi Florent
I need more time to review your long patch.
Certainly. Thanks for the review.
I have three important design questions relative to that patch and the Combinatorial classes which it modify
- should the attributes of a CombinatorialClass? be visible like S.w or do we prefer S._w (some code in graphs uses a direct call to S.w) ?
- what should be the inheritance/relations between say Subsets (subset of a given set) and Subsets_k (subset of size k of a given set) ?
- what should be the output of Subwords ? tuple, list, strings, any user defined type ? Actually, if you initialize it with list (resp. tuple, string) it output lists (resp. tuple, string).
I just notice the following:
file: sage/combinat/choose_nk.py 85 sage: [0,2] in c52 # trac XXXXYou probably want to replace XXXX by an actual number.
Done
comment:14 Changed 13 months ago by vdelecroix
- Description modified (diff)
Nathan, you should be more explicit when asking for more info...
comment:15 Changed 13 months ago by vdelecroix
- Status changed from needs_info to needs_review
- Description modified (diff)
comment:16 Changed 13 months ago by nthiery
- Summary changed from Generation of subwords, subsets and set partitions to Optimizations for the generation of subwords, subsets and set partitions
comment:17 Changed 12 months ago by vdelecroix
I update the file to make it works on sage-5.0. I will update the corresponding patch in the combinat queue.
comment:18 Changed 10 months ago by chapoton
This code breaks a test in polytopes.parallelotope, because this last procedure calls Combinations and expect to receive a list and not a tuple. This has to be cured, maybe by inserting a list() at the appropriate place in this procedure. Please test that the correction works before uploading a new patch.
Please also remove all trailing whitespaces, see the bot report.
comment:19 follow-up: ↓ 20 Changed 10 months ago by chapoton
Here is a minimal example of what goes wrong in polytopes.parallelotope :
sage: g=[ vector([1,0]), vector([0,1])] sage: Combinations(g)
This is a mutability/hash problem, when trying to convert a multiset in a set, apparently..
comment:20 in reply to: ↑ 19 Changed 10 months ago by vdelecroix
Replying to chapoton:
Here is a minimal example of what goes wrong in polytopes.parallelotope :
sage: g=[ vector([1,0]), vector([0,1])] sage: Combinations(g)This is a mutability/hash problem, when trying to convert a multiset in a set, apparently..
In the above example, the command fails because the *vector* is not hashable.
Changed 10 months ago by vdelecroix
-
attachment
trac_10534-generation_of_subsets_and_set_partitions.2.patch
added
comment:21 Changed 10 months ago by vdelecroix
I correct the error and add the ticket number on the first line of the patch (as asked by the patchbot).
comment:22 Changed 9 months ago by chapoton
- Description modified (diff)
Apply only:
- trac_10534-generation_of_subsets_and_set_partitions.3.patch
Changed 9 months ago by chapoton
-
attachment
trac_10534-generation_of_subsets_and_set_partitions.3.patch
added
comment:23 Changed 9 months ago by chapoton
new patch, to ensure 100% doctest
comment:24 follow-up: ↓ 25 Changed 7 months ago by emchennyc
I got 100% doctest coverage for all the edited files but there was one warning. If someone doesn't mind explaining what is the significance of this, that would be helpful.
SCORE /Applications/sage/devel/sage-main/sage/combinat/subset.py: 100% (33 of 33)
Possibly wrong (function name doesn't occur in doctests):
- _element_constructor_(self, x):
- _element_constructor_(self, x):
Also, "devel/sage/sage/tests/interrupt.pyx" timed out when I ran the long doctests.
comment:25 in reply to: ↑ 24 Changed 7 months ago by vdelecroix
SCORE /Applications/sage/devel/sage-main/sage/combinat/subset.py: 100% (33 of 33)
Possibly wrong (function name doesn't occur in doctests):
- _element_constructor_(self, x):
- _element_constructor_(self, x):
This is because the methods _element_constructor_(self, x) are not directly tested. There should be a comment "# indirect doctest" after the corresponding test. Basically, this function is used to build an element of a parent, but it is called through the method call of the parent (which is implemented in Parent).
Also, "devel/sage/sage/tests/interrupt.pyx" timed out when I ran the long doctests.
This is much more ennoying and I do not know why.
comment:26 Changed 2 months ago by ncohen
Helloooooooooooooooo !!!
While I read your patch, two things :
- an "It's" should become an "its"
- why do you change the ouput type from lists to tuples ?
Btw, sorry I did not answer your comment many months ago, I may have missed the warning from trac : I had set the ticket to needs_info because you seemed to be asking rather fundamental questions which would have had an impact on the patch, so that it would have been a pity to review the patch if you intended to change it so much later.
Have you had an answer from Florent since ?
Nathann
