Opened 11 years ago

Closed 11 years ago

#5478 closed defect (fixed)

[with patch, positive review] RestrictedPartitions is very slow and a huge memory hog

Reported by: ddrake Owned by: ddrake
Priority: major Milestone: sage-3.4.1
Component: combinatorics Keywords: combinat, partitions, gap
Cc: sage-combinat Merged in:
Authors: Reviewers:
Report Upstream: Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

I have code that does the same as

RestrictedPartitions(3000, [1000, 500, 100, 50, 10])

but about 5.5 times faster...and on my system,

RestrictedPartitions(5000, [1000, 500, 100, 50, 10])

uses over two gigabytes of memory, whereas my code takes about 9.2 seconds with minimal memory usage.

I need to fiddle with my code so that it's a drop-in replacement for RestrictedPartitions?, but I should have a patch up soon.

Attachments (3)

trac_5478-1.patch (8.2 KB) - added by ddrake 11 years ago.
deprecate RestrictedPartitions? and fix up some documentation
trac_5478-2.patch (11.6 KB) - added by ddrake 11 years ago.
implement new Partitions class with parts_in keyword
trac_5478-review.patch (3.6 KB) - added by hivert 11 years ago.
Improved doctests for deprecation warnings.

Download all attachments as: .zip

Change History (18)

comment:1 Changed 11 years ago by ddrake

Maybe I should add that I'm interested, of course, in *iterating* through all those partitions. Just running "RestrictedPartitions(3000, [1000, 500, 100, 50, 10])" is very fast because it just returns an iterator. The actual code I'm running is stuff like

sum(1 for p in RestrictedPartitions(...))

comment:2 Changed 11 years ago by ddrake

For ease of reviewing, I plan to have two patches here: the first one includes deprecation for RestrictedPartitions? and fixes up some problems with the Partitions docstring. The second will actually implement the new Partitions(..., parts_in=...) code.

comment:3 Changed 11 years ago by ddrake

  • Summary changed from RestrictedPartitions is very slow and a huge memory hog to [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog

The second patch gets doctests working and adds the new functionality. I'm not entirely sure I did the deprecation stuff correctly; I added a deprecation warning, and then fiddled with all the doctests in the old RestrictedPartitions? code so that it passes doctests.

comment:4 follow-up: Changed 11 years ago by ddrake

A little ego update to the patch -- I added myself to the authors list. Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

comment:5 Changed 11 years ago by ddrake

Some benchmarks (all on sage.math):

BEFORE
sage: get_memory_usage()
695.7421875
sage: ps = RestrictedPartitions(100, ([1,6..100] + [4,9..100]))
sage: %time sum(1 for p in ps)
CPU times: user 26.89 s, sys: 1.11 s, total: 28.00 s
Wall time: 28.69 s
74040
sage: get_memory_usage()
1781.41796875
AFTER
sage: get_memory_usage()
699.828125
sage: ps = Partitions(100, parts_in=([1,6..100] + [4,9..100]))
sage: %time sum(1 for p in ps)
CPU times: user 4.96 s, sys: 0.02 s, total: 4.98 s
Wall time: 4.98 s
74040
sage: get_memory_usage()
699.828125

The above example which prompted this ticket:

BEFORE
sage: get_memory_usage()
695.73828125
sage: ps = RestrictedPartitions(3000, [10,50,100,500,1000])
sage: %time sum(1 for p in ps)
CPU times: user 5.69 s, sys: 0.21 s, total: 5.90 s
Wall time: 6.05 s
3506
sage: get_memory_usage()
935.48046875
AFTER
sage: get_memory_usage()
699.82421875
sage: ps = Partitions(3000, parts_in=[10,50,100,500,1000])
sage: %time sum(1 for p in ps)
CPU times: user 1.08 s, sys: 0.00 s, total: 1.08 s
Wall time: 1.09 s
3506
sage: get_memory_usage()
699.82421875

comment:6 Changed 11 years ago by hivert

  • Summary changed from [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog

Patch look good. However, before finishing my review I'm waiting it to be rebased on top of 3.4.1. Also some interface problem have been discussed by e-mail: It was decided more that one month ago that a combinatorial class should define:

  • __iter__ instead of iterator.
  • cardinality instead of count or len.

Florent

comment:7 Changed 11 years ago by mabshoff

  • Summary changed from [with patch, needs minor work+rebase] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs work] RestrictedPartitions is very slow and a huge memory hog

Please keep the subject standard so that the proper reports are picking up the right tickets.

Cheers,

Michael

Changed 11 years ago by ddrake

deprecate RestrictedPartitions? and fix up some documentation

Changed 11 years ago by ddrake

implement new Partitions class with parts_in keyword

comment:8 follow-up: Changed 11 years ago by ddrake

  • Summary changed from [with patch, needs work] RestrictedPartitions is very slow and a huge memory hog to [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog

I've updated the patches to apply on top of 3.4.1.rc1.

comment:9 in reply to: ↑ 4 Changed 11 years ago by hivert

Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

This script is great. I should be put in sage in one place or one other. If someone tries to optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not shipping it into the patch with a more explicit name and a one-line example on how to use it with the comment " #longtest " ?

Florent

comment:10 in reply to: ↑ 8 ; follow-up: Changed 11 years ago by hivert

Some comment on the first patch:

I'm not a native English speaker but it seems to me that

If starting=p is passed, then the combinatorial class of partitions greater than or equal to p in lexicographic order is returned.

is clearer if phrased

If starting=p is passed, then the combinatorial class of partitions with part greater than or equal to p in lexicographic order is returned.

In the same way

If ending=p is passed, then the combinatorial class of partitions at most p in lexicographic order is returned.

is better phrased

If ending=p is passed, then the combinatorial class of partitions with part at most p in lexicographic order is returned.

Otherwise the rest of the first patch looks good. I'm working on the second one.

Florent

comment:11 in reply to: ↑ 10 Changed 11 years ago by ddrake

Replying to hivert:

Some comment on the first patch:

I'm not a native English speaker but it seems to me that

If starting=p is passed, then the combinatorial class of partitions greater than or equal to p in lexicographic order is returned.

is clearer if phrased

If starting=p is passed, then the combinatorial class of partitions with part greater than or equal to p in lexicographic order is returned.

"p" refers to a partition, not to a part, so the text is okay. Perhaps we should make that more clear, though. I plan on opening a ticket to improve the documentation in partition.py, so we can do that then.

Replying to hivert:

Anyone reviewing this might want to look at my test script: restricted_partitions_test_suite.sage. I've run over 10,000 successful tests with it.

This script is great. I should be put in sage in one place or one other. If someone tries to optimize your code (eg: Cythonize), he surely will be happy to have your test code. Why not shipping it into the patch with a more explicit name and a one-line example on how to use it with the comment " #longtest " ?

Well, right now, the script only halts if it finds an error, so it would be a really, really long test. :)

The tests also can use lots of memory, since it puts a list of the partitions into memory. But if anyone thinks (a mildly modified version of) the script should get run on a "long" test, I'm happy to have it included.

comment:12 Changed 11 years ago by ddrake

  • Cc sage-combinat added

Changed 11 years ago by hivert

Improved doctests for deprecation warnings.

comment:13 Changed 11 years ago by hivert

  • Summary changed from [with patch, needs review] RestrictedPartitions is very slow and a huge memory hog to [with patch, positive review] RestrictedPartitions is very slow and a huge memory hog

The patch looks good. The doctests passed except that depending on the base the doctests numbers may differ. So I added a review patch which replace the explicit doctest numbers with .... This should solve this problem.

I'm giving a positive review... Though maybe someone should reread my trivial review patch.

Concerning the script it is quite redundant with the .check() method we'll have with categories.

Florent

comment:14 Changed 11 years ago by ddrake

Thanks for fixing up those line numbers in the doctests...I struggled a bit to get those right, and had forgotten about the "..." stuff.

comment:15 Changed 11 years ago by mabshoff

  • Milestone changed from sage-combinat to sage-3.4.1
  • Resolution set to fixed
  • Status changed from new to closed

Merged all three patches in Sage 3.4.1.rc3.

Cheers,

Michael

Note: See TracTickets for help on using tickets.