Opened 4 years ago
Closed 4 years ago
#25659 closed enhancement (fixed)
make BrauerAlgebra faster
Reported by: | mantepse | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.4 |
Component: | combinatorics | Keywords: | |
Cc: | alauve, tscrim, zabrocki | Merged in: | |
Authors: | Martin Rubey, Travis Scrimshaw | Reviewers: | Travis Scrimshaw, Martin Rubey |
Report Upstream: | N/A | Work issues: | |
Branch: | ef60737 (Commits, GitHub, GitLab) | Commit: | ef607379694c9394a7d4b90b95aadc6db3782830 |
Dependencies: | #25462 #26111 | Stopgaps: |
Description (last modified by )
Using the fast iterator from perfect matchings, we get a nice speedup for the Brauer algebra:
sage: %timeit len([SetPartition(p) for p in da.brauer_diagrams(6)]) 1 loop, best of 3: 6.69 s per loop
sage: %timeit len([SetPartition(p) for p in da.brauer_diagrams(6)]) 1 loop, best of 3: 180 ms per loop
Change History (46)
comment:1 Changed 4 years ago by
- Branch set to u/mantepse/make_braueralgebra_faster
comment:2 Changed 4 years ago by
- Cc alauve tscrim zabrocki added
- Commit set to 626ce17a078add5bbdcd639725d5485dd5366875
- Component changed from PLEASE CHANGE to combinatorics
- Description modified (diff)
- Status changed from new to needs_review
- Type changed from PLEASE CHANGE to enhancement
comment:3 Changed 4 years ago by
- Dependencies set to #25462
comment:4 follow-up: ↓ 5 Changed 4 years ago by
Quick comments (I will comment on #25462 either later this week or next week since I am at FPSAC this week):
- It seems like the
iter_aux
should be pulled out as a separate method (maybe even function?) ofPerfectMatchings
and called directly by the Brauer diagrams iterator (it makes no sense to create the temporary element object).
- When checking going to the orbit basis, if you are not going to use
an_element
(which I still recommend using as it should not be machine dependent), then at least use a sum of 2 or more basis elements with at least one coefficient as that is a better test.
- Bikeshedding again on the condensed output (as before, you can ignore, but I still will mention it):
sage: [SetPartition(p) for p in da.brauer_diagrams(5/2)] - [{{-3, 3}, {-2, 1}, {-1, 2}}, {{-3, 3}, {-2, 2}, {-1, 1}}, {{-3, 3}, {-2, -1}, {1, 2}}] + [{{-3, 3}, {-2, -1}, {1, 2}}, + {{-3, 3}, {-2, 2}, {-1, 1}}, + {{-3, 3}, {-2, 1}, {-1, 2}}]
Otherwise LGTM (modulo #25462) and is quite a speedup. I do wonder if _iterator_part
did not return Set
objects, what the speed difference would be, but we can address that on a later ticket.
comment:5 in reply to: ↑ 4 Changed 4 years ago by
- It seems like the
iter_aux
should be pulled out as a separate method (maybe even function?) ofPerfectMatchings
and called directly by the Brauer diagrams iterator (it makes no sense to create the temporary element object).
as for #25462: please provide a name, so there is at least some uniformity across sage. And again, since it is not unlikely that one does several computations in the same (Brauer) algebra, it might make more sense to use .list()
, which caches the result.
comment:6 Changed 4 years ago by
I would probably call it _iterator
, but its hidden. So as long as it locally makes sense, I think it is fine. Currently, we don't really have many such functions/methods, so there is not really a danger of non-uniformity.
comment:7 Changed 4 years ago by
If they are using the same Brauer algebra, you should cache the result of basis()
. I don't see what the iterator returning the raw Python objects (i.e., not an instance of the Element
class) has to do with caching via .list()
.
comment:8 Changed 4 years ago by
- Commit changed from 626ce17a078add5bbdcd639725d5485dd5366875 to 3d5f553675b18818631854380275c952cd20adba
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
91e8d5c | Merge commit '9c5298d' into public/25462
|
ce2962b | restore richer doc tests
|
ba08ff3 | reviewer's comments
|
95ce20f | provide iterators which return lists of lists
|
d2e0e6e | inline a computation by reviewer's request
|
947233c | Merge branch 'public/25462' of git://trac.sagemath.org/sage into public/combinat/speedup_set_partitions-25462
|
a79e302 | Reverted to an_element() and added some additional reviewer changes.
|
87bf535 | Cythonizing iterator.
|
ba6a115 | Fixing doctests due to ordering change.
|
3d5f553 | Merge branch 'public/combinat/speedup_set_partitions-25462' of git://trac.sagemath.org/sage into t/25659/make_braueralgebra_faster
|
comment:9 Changed 4 years ago by
I merged and rebased, ready for review! (after this ticket, I'll do #25662)
comment:10 Changed 4 years ago by
You should compare this against #25462. Another comparison should be done potentially replacing iteration over SetPartitions(X)
with the dedicated iterator that does not create full elements of SetPartitions(X)
(I probably should have done this swap on #25462...).
From looking at the PerfectMatchings.__iter__
, that looks like it could be make into a simple backtracking algorithm and be really ameable to cythonization as an iterator object (so a pure cdef
implementation).
comment:11 Changed 4 years ago by
The iterator for perfect matchings is almost certainly not best possible (given python's aversion against recursion). Do you need a better one? It might be good to compare a cythonized version with https://stackoverflow.com/a/13020502/4680581
comment:12 Changed 4 years ago by
I don't have a personal use for it, but I do like making things in Sage fast so it can (continue to) be the dominate software in combinatorics. So I tried the algorithm in that SO post, and it is basically the same time (both in Python). Those pop
calls take up somewhere between 1/3 to 1/2 of the time.
So we come to a small crossroads. Do we implement it as a SearchForest
and use the recursive-but-embarrassingly-parallelizable algorithm or do we make the current one in serial but fully cythonized?
comment:13 Changed 4 years ago by
Actually, probably the latter as for the former, that would only be useful if we wanted to do something on all such objects (and output order didn't matter).
comment:14 Changed 4 years ago by
- Branch changed from u/mantepse/make_braueralgebra_faster to public/combinat/speedup_brauer_algebra-25659
- Commit changed from 3d5f553675b18818631854380275c952cd20adba to c7fdf9be32fee30ef3fb3698fcafeaf5455dd5e1
- Reviewers set to Travis Scrimshaw, Martin Rubey
Here is where I am currently at:
sage: %time for x in PerfectMatchings(10): pass CPU times: user 4 ms, sys: 0 ns, total: 4 ms Wall time: 964 µs sage: %time for x in PerfectMatchings(16): pass CPU times: user 796 ms, sys: 28 ms, total: 824 ms Wall time: 794 ms sage: %time for x in PerfectMatchings(20): pass CPU times: user 3min 49s, sys: 8 ms, total: 3min 49s Wall time: 3min 49s
versus with your branch:
sage: %time for x in PerfectMatchings(10): pass CPU times: user 8 ms, sys: 4 ms, total: 12 ms Wall time: 7.14 ms sage: %time for x in PerfectMatchings(16): pass CPU times: user 14.4 s, sys: 0 ns, total: 14.4 s Wall time: 14.4 s
The timing for 20 would take far too long (extrapolating, somewhere between 30-90 minutes). I am pretty sure I can make this faster by better controlling what integers to look at.
BTW:
sage: PerfectMatchings(20).cardinality() 654729075
New commits:
c7fdf9b | Initial Cython version of the PerfectMatchings iterator.
|
comment:15 follow-up: ↓ 16 Changed 4 years ago by
Impressive! Is this (essentially) the same algorithm?
I was thinking about calling all these iterators iterator_raw
(or raw_iterator
), with the specification that calling the parent on them "works".
However, now they actually only work for the base set 0,1,...
, which makes even more sense, but doesn't fit this specification :-)
comment:16 in reply to: ↑ 15 Changed 4 years ago by
Replying to mantepse:
Impressive! Is this (essentially) the same algorithm?
Yes, although how I find the "next" available entry I can add is dumb (search through a list). What I want to do is make it more linked-list like so I can quickly add and remove entires. However, this requires me to be a little more smart about how I do things.
I was thinking about calling all these iterators
iterator_raw
(orraw_iterator
), with the specification that calling the parent on them "works".However, now they actually only work for the base set
0,1,...
, which makes even more sense, but doesn't fit this specification :-)
This is purely something for speed. If you want to call the parent on them, well, that's why we have the parents. ;)
comment:17 Changed 4 years ago by
It just occurred to me that the key word is "fixed-point-free involution" :-)
Algorithm 3 on page 23 of
http://www.info2.uqam.ca/~walsh_t/papers/Involutions%20paper.pdf
should be yet faster...
comment:18 Changed 4 years ago by
Here is a naive implementation of Walsh's algorithm. Warning: generate(n)
generates the matchings on 2n
points. Could you please compare? To avoid work, I did not switch the index set to 0,..,n-1
.
def generate(n): e = list(range(1,2*n+1)) f = [i+1 if i % 2 == 1 else i-1 for i in e] odd = up = done = False yield f[:] while True: i = e[0] if i == n: return if odd: x = 2*i-1 y = f[x-1] g = y-2*i up = (g % 2 == 1) else: x = i y = f[x-1] g = y - (i+1) up = (g % 2 == 0) if up: g += 1 j = y+1 else: g -= 1 j = y-1 J = f[j-1] f[y-1],f[j-1],f[x-1],f[J-1] = J, x, j, y odd = not odd e[0] = 1 if g == 0 or g == 2*(n-i): e[i-1], e[i+1-1] = e[i+1-1], i+1 yield f[:]
comment:19 Changed 4 years ago by
Here is the same thing with indices adapted and some trivial optimizations. I don't know enough cython to optimize that further. The main question is probably how to adapt it to directly produce the perfect matchings in the data structure we want (whatever that is).
def generate(int n): cdef int i, x, y, g, j, J e = list(range(2*n)) f = [i+1 if i % 2 == 0 else i-1 for i in e] odd = False yield f[:] while True: i = e[0] if i == n-1: return if odd: x = 2*i else: x = i y = f[x] g = y-x-1 if (g % 2 == odd): g += 1 j = y+1 else: g -= 1 j = y-1 J = f[j] f[y] = J; f[J] = y; f[x] = j; f[j] = x; odd = not odd e[0] = 0 if g == 0 or g == 2*(n-i-1): e[i] = e[i+1]; e[i+1] = i+1 yield f[:]
comment:20 Changed 4 years ago by
OK, I just checked: it should be fairly easy to produce restricted growth functions instead, which is nice, because then we use them consistently.
comment:21 Changed 4 years ago by
First, bad news is my iterator had a bug, so the timings of comment:14 are invalid. The good news it was easy to fix. Now I took your algorithm and added a simple conversion function. Now from testing, both algorithms are comparable in speed with the optimizations I have currently added. I am going to do some more work before pushing to see what I can do to optimize both.
comment:22 Changed 4 years ago by
- Commit changed from c7fdf9be32fee30ef3fb3698fcafeaf5455dd5e1 to 7942eab8a8188c8849fd1395b4ac1e33030edbfe
comment:23 Changed 4 years ago by
So I opted to use the fixed-point-free involution generator as I could guarantee that the result of converting to a set partition is ordered (by minimal element). Since I was getting comparable timings (most of which was actually coming from the creation of the actual elements), I figured this would give us the most speed. It also seemed easier to maintain in the long run (less low-level tricks). Here are my timings:
sage: %time for x in PerfectMatchings(10): pass CPU times: user 4 ms, sys: 0 ns, total: 4 ms Wall time: 5.84 ms sage: %time for x in PerfectMatchings(12): pass CPU times: user 60 ms, sys: 8 ms, total: 68 ms Wall time: 52.7 ms sage: %time for x in PerfectMatchings(14): pass CPU times: user 604 ms, sys: 32 ms, total: 636 ms Wall time: 574 ms sage: %time for x in PerfectMatchings(16): pass CPU times: user 8.86 s, sys: 76 ms, total: 8.94 s Wall time: 8.79 s sage: %time for x in PerfectMatchings(18): pass CPU times: user 2min 39s, sys: 32 ms, total: 2min 39s Wall time: 2min 39s sage: PerfectMatchings(18).cardinality() 34459425
comment:24 Changed 4 years ago by
- Commit changed from 7942eab8a8188c8849fd1395b4ac1e33030edbfe to 88595ed97d4195dee784012d7a4994669c8c52a8
Branch pushed to git repo; I updated commit sha1. New commits:
88595ed | Use the new backend iterators in diagram_algebras.py.
|
comment:25 Changed 4 years ago by
- Commit changed from 88595ed97d4195dee784012d7a4994669c8c52a8 to 80b13f475002464b629e1e000ddd0e0e8c234970
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
80b13f4 | Use the new backend iterators in diagram_algebras.py.
|
comment:26 Changed 4 years ago by
I am now using the backend iterators for all of the diagram algebras. So with this, if my changes are good, then positive review.
comment:27 Changed 4 years ago by
- Milestone changed from sage-8.3 to sage-8.4
comment:28 Changed 4 years ago by
Great!
I created #26065 - it would almost certainly be better to create restricted growth functions directly with Walsh's Gray code. Just in case someone is bored :-)
From your last two commits, it seems that the order of generation for the set partition iterator changed again - I don't care much, but it does seem strange.
Walsh's paper has actually appeared, here is a citation:
@article {MR1821628, AUTHOR = {Walsh, Timothy}, TITLE = {Gray codes for involutions}, JOURNAL = {J. Combin. Math. Combin. Comput.}, FJOURNAL = {Journal of Combinatorial Mathematics and Combinatorial Computing}, VOLUME = {36}, YEAR = {2001}, PAGES = {95--118}, ISSN = {0835-3026}, MRCLASS = {05A05 (68R05)}, MRNUMBER = {1821628}, MRREVIEWER = {Theodore C. Enns}, }
comment:29 Changed 4 years ago by
- Commit changed from 80b13f475002464b629e1e000ddd0e0e8c234970 to b48610309c8963d03ec5e7c671a84bf6bf868993
Branch pushed to git repo; I updated commit sha1. New commits:
b486103 | add reference
|
comment:30 Changed 4 years ago by
I found that the different order comes from an additional sorted
in SetPartitions_set.__iter__
, so there is nothing to worry about.
I added a reference, I'm happy with positive review.
New commits:
b486103 | add reference
|
comment:31 Changed 4 years ago by
- Commit changed from b48610309c8963d03ec5e7c671a84bf6bf868993 to 77d6902bdbc93e557354d4e9d07a77d4696fc446
Branch pushed to git repo; I updated commit sha1. New commits:
77d6902 | Minor tweaks.
|
comment:32 Changed 4 years ago by
Thank you. I made a few small tweaks to the reference. If my tweaks are good with you, then positive review.
comment:33 Changed 4 years ago by
- Status changed from needs_review to positive_review
comment:35 Changed 4 years ago by
@vbraun: dear Volker, I cannot see the merge conflict...
comment:36 Changed 4 years ago by
It will be with the (hopefully soon) forthcoming 8.4.beta2.
comment:37 Changed 4 years ago by
- Commit changed from 77d6902bdbc93e557354d4e9d07a77d4696fc446 to 25dabd7c587025b7daf31b93da1b4cd5999ca3aa
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
a17239b | Using new iterator in the free Lie algebra.
|
bbacc76 | Added long test to lyndon_word.py.
|
95d5d91 | Merge branch 'public/combinat/speedup_set_partitions-25462' of git://trac.sagemath.org/sage into public/combinat/fast_lyndon_iter-26111
|
8cc9d0e | Using a Cython version of FKM algorithm to generate Lyndon words.
|
7b25ddc | Using faster generator for LyndonWords and fixing doctest in free_lie_algebra.py.
|
b9d8ee4 | Corner case, pyflakes, and some extra doctests.
|
2b3f9d4 | fix doc and rename iterator
|
0b4c68b | A few more doc tweaks and making Ruskey a proper reference.
|
9521ad1 | Merge branch 'public/combinat/speedup_brauer_algebra-25659' of git://trac.sagemath.org/sage into public/combinat/speedup_brauer_algebra-25659
|
25dabd7 | A little doc tweak.
|
comment:38 Changed 4 years ago by
- Dependencies changed from #25462 to #25462 #26111
Hmm...I got a strange merge conflict when I rebased this over #26111. Well, still better to wait for 8.4.beta2.
comment:39 Changed 4 years ago by
- Commit changed from 25dabd7c587025b7daf31b93da1b4cd5999ca3aa to d4fccc7493bdd2f8969611282d08b4527dde5bd9
Branch pushed to git repo; I updated commit sha1. New commits:
d4fccc7 | Merge branch 'develop' of git://trac.sagemath.org/sage into t/25659/public/combinat/speedup_brauer_algebra-25659
|
comment:40 Changed 4 years ago by
- Status changed from needs_work to needs_review
almost trivial rebase
comment:41 Changed 4 years ago by
- Status changed from needs_review to positive_review
comment:43 Changed 4 years ago by
- Commit changed from d4fccc7493bdd2f8969611282d08b4527dde5bd9 to ef607379694c9394a7d4b90b95aadc6db3782830
Branch pushed to git repo; I updated commit sha1. New commits:
ef60737 | Fixing doctests due to output order.
|
comment:44 Changed 4 years ago by
- Status changed from needs_work to needs_review
Martin, can you check that you are getting the doctest order I am (which matches the patchbot)?
comment:45 Changed 4 years ago by
- Status changed from needs_review to positive_review
all tests pass here (on Ubuntu 16.04, Intel(R) Core(TM) i5-4570 CPU). Thanks for fixing the tests and taking initiative!
comment:46 Changed 4 years ago by
- Branch changed from public/combinat/speedup_brauer_algebra-25659 to ef607379694c9394a7d4b90b95aadc6db3782830
- Resolution set to fixed
- Status changed from positive_review to closed
Last 10 new commits:
revert a call to Set which works after #25496
use the fast iterators and correct check
fix (and partially improve) tests in diagram algebras
remove unused imports
adress reviewer's comments: new method number_of_blocks and deprecation of n, partition_diagrams now returns iterator
avoid itertools and modify docstring
pyflakes fixes
Revert "pyflakes fixes"
Merge branch 'u/mantepse/make_setpartition_much_faster' of git://trac.sagemath.org/sage into t/25659/make_braueralgebra_faster
use fast iterator from perfect matchings for Brauer diagrams