Opened 11 years ago
Closed 11 years ago
#5478 closed defect (fixed)
[with patch, positive review] RestrictedPartitions is very slow and a huge memory hog
Reported by: | ddrake | Owned by: | ddrake |
---|---|---|---|
Priority: | major | Milestone: | sage-3.4.1 |
Component: | combinatorics | Keywords: | combinat, partitions, gap |
Cc: | sage-combinat | Merged in: | |
Authors: | Reviewers: | ||
Report Upstream: | Work issues: | ||
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
I have code that does the same as
RestrictedPartitions(3000, [1000, 500, 100, 50, 10])
but about 5.5 times faster...and on my system,
RestrictedPartitions(5000, [1000, 500, 100, 50, 10])
uses over two gigabytes of memory, whereas my code takes about 9.2 seconds with minimal memory usage.
I need to fiddle with my code so that it's a drop-in replacement for RestrictedPartitions?, but I should have a patch up soon.
Attachments (3)
Change History (18)
comment:1 Changed 11 years ago by
comment:2 Changed 11 years ago by
For ease of reviewing, I plan to have two patches here: the first one includes deprecation for RestrictedPartitions? and fixes up some problems with the Partitions docstring. The second will actually implement the new Partitions(..., parts_in=...)
code.
comment:3 Changed 11 years ago by
- Summary changed from RestrictedPartitions is very slow and a huge memory hog to [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog
The second patch gets doctests working and adds the new functionality. I'm not entirely sure I did the deprecation stuff correctly; I added a deprecation warning, and then fiddled with all the doctests in the old RestrictedPartitions? code so that it passes doctests.
comment:4 follow-up: ↓ 9 Changed 11 years ago by
A little ego update to the patch -- I added myself to the authors list. Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.
comment:5 Changed 11 years ago by
Some benchmarks (all on sage.math):
BEFORE sage: get_memory_usage() 695.7421875 sage: ps = RestrictedPartitions(100, ([1,6..100] + [4,9..100])) sage: %time sum(1 for p in ps) CPU times: user 26.89 s, sys: 1.11 s, total: 28.00 s Wall time: 28.69 s 74040 sage: get_memory_usage() 1781.41796875
AFTER sage: get_memory_usage() 699.828125 sage: ps = Partitions(100, parts_in=([1,6..100] + [4,9..100])) sage: %time sum(1 for p in ps) CPU times: user 4.96 s, sys: 0.02 s, total: 4.98 s Wall time: 4.98 s 74040 sage: get_memory_usage() 699.828125
The above example which prompted this ticket:
BEFORE sage: get_memory_usage() 695.73828125 sage: ps = RestrictedPartitions(3000, [10,50,100,500,1000]) sage: %time sum(1 for p in ps) CPU times: user 5.69 s, sys: 0.21 s, total: 5.90 s Wall time: 6.05 s 3506 sage: get_memory_usage() 935.48046875
AFTER sage: get_memory_usage() 699.82421875 sage: ps = Partitions(3000, parts_in=[10,50,100,500,1000]) sage: %time sum(1 for p in ps) CPU times: user 1.08 s, sys: 0.00 s, total: 1.08 s Wall time: 1.09 s 3506 sage: get_memory_usage() 699.82421875
comment:6 Changed 11 years ago by
- Summary changed from [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog
Patch look good. However, before finishing my review I'm waiting it to be rebased on top of 3.4.1. Also some interface problem have been discussed by e-mail: It was decided more that one month ago that a combinatorial class should define:
__iter__
instead ofiterator
.cardinality
instead ofcount
orlen
.
Florent
comment:7 Changed 11 years ago by
- Summary changed from [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs work] RestrictedPartitions is very slow and a huge memory hog
Please keep the subject standard so that the proper reports are picking up the right tickets.
Cheers,
Michael
comment:8 follow-up: ↓ 10 Changed 11 years ago by
- Summary changed from [with patch, needs work] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog
I've updated the patches to apply on top of 3.4.1.rc1.
comment:9 in reply to: ↑ 4 Changed 11 years ago by
Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.
This script is great. I should be put in sage in one place or one other. If someone tries to optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not shipping it into the patch with a more explicit name and a one-line example on how to use it with the comment " #longtest " ?
Florent
comment:10 in reply to: ↑ 8 ; follow-up: ↓ 11 Changed 11 years ago by
Some comment on the first patch:
I'm not a native English speaker but it seems to me that
If
starting=p
is passed, then the combinatorial class of partitions greater than or equal to
p
in lexicographic order is returned.
is clearer if phrased
If
starting=p
is passed, then the combinatorial class of partitions with part greater than or equal to
p
in lexicographic order is returned.
In the same way
If
ending=p
is passed, then the combinatorial class of partitions at most
p
in lexicographic order is returned.
is better phrased
If
ending=p
is passed, then the combinatorial class of partitions with part at most
p
in lexicographic order is returned.
Otherwise the rest of the first patch looks good. I'm working on the second one.
Florent
comment:11 in reply to: ↑ 10 Changed 11 years ago by
Replying to hivert:
Some comment on the first patch:
I'm not a native English speaker but it seems to me that
If
starting=p
is passed, then the combinatorial class of partitions greater than or equal to
p
in lexicographic order is returned.is clearer if phrased
If
starting=p
is passed, then the combinatorial class of partitions with part greater than or equal to
p
in lexicographic order is returned.
"p" refers to a partition, not to a part, so the text is okay. Perhaps we should make that more clear, though. I plan on opening a ticket to improve the documentation in partition.py, so we can do that then.
Replying to hivert:
Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.
This script is great. I should be put in sage in one place or one other. If someone tries to optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not shipping it into the patch with a more explicit name and a one-line example on how to use it with the comment " #longtest " ?
Well, right now, the script only halts if it finds an error, so it would be a really, really long test. :)
The tests also can use lots of memory, since it puts a list of the partitions into memory. But if anyone thinks (a mildly modified version of) the script should get run on a "long" test, I'm happy to have it included.
comment:12 Changed 11 years ago by
- Cc sage-combinat added
comment:13 Changed 11 years ago by
- Summary changed from [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog to [with patch, positive review] RestrictedPartitions is very slow and a huge memory hog
The patch looks good. The doctests passed except that depending on the base the doctests numbers may differ. So I added a review patch which replace the explicit doctest numbers with ...
. This should solve this problem.
I'm giving a positive review... Though maybe someone should reread my trivial review patch.
Concerning the script it is quite redundant with the .check() method we'll have with categories.
Florent
comment:14 Changed 11 years ago by
Thanks for fixing up those line numbers in the doctests...I struggled a bit to get those right, and had forgotten about the "..." stuff.
comment:15 Changed 11 years ago by
- Milestone changed from sage-combinat to sage-3.4.1
- Resolution set to fixed
- Status changed from new to closed
Merged all three patches in Sage 3.4.1.rc3.
Cheers,
Michael
Maybe I should add that I'm interested, of course, in *iterating* through all those partitions. Just running "
RestrictedPartitions(3000, [1000, 500, 100, 50, 10])
" is very fast because it just returns an iterator. The actual code I'm running is stuff like