Opened 3 years ago
Closed 3 years ago
#19821 closed enhancement (fixed)
Increase speed for Coxeter groups, Weyl groups, and quantum Bruhat graph
Reported by:  tscrim  Owned by:  sagecombinat 

Priority:  major  Milestone:  sage7.2 
Component:  combinatorics  Keywords:  quantum bruhat graph 
Cc:  sagecombinat, aschilling, mshimo, nthiery, darij, chapoton, stumpc5, jipilab  Merged in:  
Authors:  Travis Scrimshaw  Reviewers:  Frédéric Chapoton 
Report Upstream:  N/A  Work issues:  
Branch:  0ce02a6 (Commits)  Commit:  0ce02a6f0a713edfb93db05e8b0f26b793c7932a 
Dependencies:  Stopgaps: 
Description (last modified by )
The primary goal of this ticket is to improve the creation speed for the quantum Bruhat graph. We do this in a number of ways:
 Better management of data associated to
lattice.nonparabolic_positive_roots
.  Implement a (temporary) cache of the lengths of elements.
In addition, we also provide some general speedups to all matrix groups and Coxeter groups that came from looking into the above improvements. The net result is over 12x speedup of the creation of the quantum Bruhat graph:
sage: W = WeylGroup(['D',5], prefix='s') sage: %time G = W.quantum_bruhat_graph() CPU times: user 14 s, sys: 60.6 ms, total: 14 s Wall time: 14 s
whereas previously this took over 3 minutes to compute. The downside is this has a larger memory footprint because of the temporary cache, but repeatedly computing the lengths of the elements was far too expensive.
This also includes a speedup of iterating over the entire Coxeter/Weyl? group.
Change History (35)
comment:1 Changed 3 years ago by
 Branch set to public/combinat/speedup_coxeter_weyl_matrix_groups19821
 Commit set to 0b5fcec3ea079b5240414f4981bd860d7768b3c7
 Status changed from new to needs_review
comment:2 Changed 3 years ago by
 Cc darij chapoton stumpc5 added
 Description modified (diff)
Also, _finite_recognition
was no longer needed since we now have Coxeter types. One question I have for CoxeterGroup
is should the default ring for simplylaced types be ZZ
? The code which compares elements in the universal cyclotomic field to 0 is horribly slow and is the main reason why iteration over an instance of CoxeterGroup
(over the UCF) takes so long. Compare:
sage: WC = CoxeterGroup(['D',4], base_ring=ZZ) sage: %timeit L = [x for x in WC] 100 loops, best of 3: 12.1 ms per loop sage: WC = CoxeterGroup(['D',4]) # base_ring=UniversalCyclotomicField() sage: %timeit L = [x for x in WC] 1 loops, best of 3: 224 ms per loop sage: WC = CoxeterGroup(['D',5], base_ring=ZZ) sage: %timeit L = [x for x in WC] 10 loops, best of 3: 152 ms per loop sage: WC = CoxeterGroup(['D',5]) # base_ring=UniversalCyclotomicField() sage: %timeit L = [x for x in WC] 1 loops, best of 3: 3.97 s per loop
comment:3 Changed 3 years ago by
 Commit changed from 0b5fcec3ea079b5240414f4981bd860d7768b3c7 to f673aec39c5dbb81c65ae72fd70154c5674585f5
Branch pushed to git repo; I updated commit sha1. New commits:
f673aec  A better method to find the minimal length coset representatives.

comment:4 followup: ↓ 5 Changed 3 years ago by
A few questions:
1) What is a quantum Bruhat graph, and where can I read about it?
2) Have the indirect doctests from _finite_recognition
been salvaged?
Also, when changing DiGraph?, can you please add the right format in analogy to http://git.sagemath.org/sage.git/commit/?id=6cb47c00ca2315caf9b873506a0ea167d6706c41 ? Thank you!
comment:5 in reply to: ↑ 4 Changed 3 years ago by
Replying to darij:
1) What is a quantum Bruhat graph, and where can I read about it?
It is the usual Bruhat poset but with added "quantum" edges that are used for many purposes. One particular use in Sage is for specializations of Macdonald polynomials at t=0
. See, e.g.,
2) Have the indirect doctests from
_finite_recognition
been salvaged?
Not exactly, but essentially the same tests in recognize_coxeter_type_from_matrix
in sage.combinat.root_system.coxeter_matrix
. So I would say everything is covered.
Also, when changing DiGraph?, can you please add the right format in analogy to http://git.sagemath.org/sage.git/commit/?id=6cb47c00ca2315caf9b873506a0ea167d6706c41 ? Thank you!
Technically I didn't change the input for the digraph, but I can add it in if you're not going to make other additions.
comment:6 followup: ↓ 11 Changed 3 years ago by
I'm not making any additions today, as I'd have to learn this stuff first (and I don't know if I have time for that) and even then I wouldn't be sure of the thinking behind the removal of __matrix
. (This will be so much easier once I'm in Minneapolis...)
Thanks for the references!
comment:7 Changed 3 years ago by
For UCF comparison, it will be faster after #19825 (it avoids QQbar usage).
comment:8 Changed 3 years ago by
 Commit changed from f673aec39c5dbb81c65ae72fd70154c5674585f5 to af33acf5e347a5fa0aec66025f3837ea728061ce
Branch pushed to git repo; I updated commit sha1. New commits:
af33acf  Added explicit format indicator for DiGraph constructor for Darij.

comment:9 Changed 3 years ago by
 Commit changed from af33acf5e347a5fa0aec66025f3837ea728061ce to e0140371806e13560a59869fd540ef8cbf6e8a8b
Branch pushed to git repo; I updated commit sha1. New commits:
e014037  Making some additional improvements.

comment:10 Changed 3 years ago by
 Commit changed from e0140371806e13560a59869fd540ef8cbf6e8a8b to 1129ab7725b584d019902863ae6949d6996fdbc6
Branch pushed to git repo; I updated commit sha1. New commits:
1129ab7  Removing comment about bottlenecks.

comment:11 in reply to: ↑ 6 ; followup: ↓ 14 Changed 3 years ago by
Replying to darij:
I'm not making any additions today, as I'd have to learn this stuff first (and I don't know if I have time for that) and even then I wouldn't be sure of the thinking behind the removal of
__matrix
. (This will be so much easier once I'm in Minneapolis...)
Added. The __matrix
was just duplicating what matrix()
already does and resulted in quite a big speedup to iteration (it cut it in half IIRC).
Replying to vdelecroix:
For UCF comparison, it will be faster after #19825 (it avoids QQbar usage).
With #19825 and my recent changes, it just takes 4 seconds on my laptop to iterate over D_{5}. Now the biggest speed gain (about 2.5s) will be in improving matrix multiplication over the UCF, as opposed to speeding up the comparisons (about 1.5s)
One takeaway message: if you are working in a simplylaced type for CoxeterGroup
, use ZZ
.
comment:12 Changed 3 years ago by
 Cc jipilab added
comment:13 Changed 3 years ago by
See #16116 for a discussion on speeding up matrices over (universal) cyclotomic fields.
comment:14 in reply to: ↑ 11 ; followup: ↓ 16 Changed 3 years ago by
Replying to tscrim:
One takeaway message: if you are working in a simplylaced type for
CoxeterGroup
, useZZ
.
Or more generally a crystalographic type, I assume?
Thanks for the hard work btw!
comment:15 Changed 3 years ago by
Also: you may want to add a datapoint on #9285 after this.
comment:16 in reply to: ↑ 14 Changed 3 years ago by
Replying to nthiery:
Replying to tscrim:
One takeaway message: if you are working in a simplylaced type for
CoxeterGroup
, useZZ
.Or more generally a crystalographic type, I assume?
No, it only works for simplylaced types as the cos(\pi / m)
needs to be rational:
sage: W = CoxeterGroup(['B',2]) sage: W.gens() ( [ 1 E(8)  E(8)^3] [ 1 0] [ 0 1], [E(8)  E(8)^3 1] )
I also added my datapoint for #9285, but it seems infeasible at this point using CoxeterGroup
. The majority of the time is spent looking through the descents, which has to be done for keeping the constant memory iteration.
16341032 function calls in 8.912 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 155520 2.187 0.000 3.494 0.000 group_element.py:308(_mul_) 574775 1.657 0.000 3.515 0.000 coxeter_group.py:483(has_right_descent) 2997405 0.621 0.000 0.621 0.000 coxeter_group.py:513(<genexpr>) 574775 0.434 0.000 4.002 0.000 coxeter_groups.py:782(has_descent) 1356910 0.418 0.000 0.568 0.000 coxeter_group.py:294(index_set) 155520 0.357 0.000 0.357 0.000 matrix_space.py:142(__classcall__) 155520 0.343 0.000 0.356 0.000 group_element.py:241(__init__) 574775 0.298 0.000 0.881 0.000 {all} 155520 0.254 0.000 0.294 0.000 {method '__copy__' of 'sage.matrix.matrix_integer_dense.Matrix_integer_dense' objects} 155520 0.241 0.000 3.781 0.000 coxeter_groups.py:1498(apply_simple_reflection_right) 155520 0.224 0.000 2.264 0.000 coxeter_groups.py:860(first_descent) 103679 0.210 0.000 8.548 0.000 coxeter_groups.py:282(succ) 51841 0.198 0.000 8.841 0.000 backtrack.py:139(search_forest_iterator) 1356910 0.150 0.000 0.150 0.000 coxeter_matrix.py:779(index_set) 155520 0.145 0.000 0.457 0.000 matrix_space.py:1247(matrix) 51840 0.143 0.000 2.223 0.000 coxeter_groups.py:888(descents) 574775 0.107 0.000 0.107 0.000 {method 'index' of 'tuple' objects} 574775 0.106 0.000 0.106 0.000 {range} 1667950 0.104 0.000 0.104 0.000 {method 'parent' of 'sage.structure.element.Element' objects} 574775 0.076 0.000 0.076 0.000 group_element.py:282(matrix) 155520 0.075 0.000 0.532 0.000 matrix_space.py:423(__call__) 730298 0.072 0.000 0.072 0.000 {isinstance} 1 0.071 0.071 8.912 8.912 <string>:1(<module>) 155520 0.070 0.000 3.851 0.000 coxeter_groups.py:1526(apply_simple_reflection)
Moreover, a nontrivial amount of time is spent gathering the information to run the check for descents:
Timer unit: 1e06 s Total time: 4.06995 s File: /home/travis/sage/local/lib/python2.7/sitepackages/sage/groups/matrix_gps/coxeter_group.py Function: has_right_descent at line 483 Line # Hits Time Per Hit % Time Line Contents ============================================================== ... 509 574775 805826 1.4 19.8 i = self.parent().index_set().index(i) 510 574775 668970 1.2 16.4 n = len(self.parent().index_set()) 511 574775 436368 0.8 10.7 M = self.matrix() 512 574775 314255 0.5 7.7 zero = M.base_ring().zero() 513 574775 1844527 3.2 45.3 return all(M[j,i] <= zero for j in range(n))
So there is definite room for optimization, but it would likely lead to some code duplication. I think Florent's parallelization of SearchForest
and its move to the cython file will help a fair amount. Similarly, we could cythonize MatrixGroupElement_generic
and its superclass MatrixGroupElement_base
. We also get a bit of a penalty for the indirection (I implemented has_right_descent
instead of has_descent
).
However, this data will be rendered obsolete in a moment because I will be pushing a specialized version of descents
and first_descent
.
comment:17 Changed 3 years ago by
 Commit changed from 1129ab7725b584d019902863ae6949d6996fdbc6 to 8509534a8900501aca69724f27920c064dc7e618
Branch pushed to git repo; I updated commit sha1. New commits:
8509534  Implementing custon first_descent() and descent() methods for CoxeterGroup.

comment:18 Changed 3 years ago by
So that dropped my %prun
iteration down to ~6.5 seconds (see #9285 for true timings for E_{6} and E_{7}). However, there is a somewhat disturbing amount of overhead with recomputing initializing data (nearly 1s):
Timer unit: 1e06 s Total time: 2.32224 s File: /home/travis/sage/local/lib/python2.7/sitepackages/sage/groups/matrix_gps/coxeter_group.py Function: first_descent at line 483 Line # Hits Time Per Hit % Time Line Contents ============================================================== 483 def first_descent(self, side = 'right', index_set=None, positive=False): 484 """ ... 491 """ 492 155520 177504 1.1 7.6 M = self.matrix() 493 155520 106485 0.7 4.6 if side != 'right': 494 M = ~M 495 155520 244268 1.6 10.5 I = self.parent().index_set() 496 155520 115970 0.7 5.0 n = len(I) 497 155520 133044 0.9 5.7 zero = M.base_ring().zero() 498 155520 98536 0.6 4.2 if index_set is None: 499 155520 145872 0.9 6.3 index_set = range(n) 500 else: 501 index_set = [I.index(i) for i in index_set] 502 155520 96996 0.6 4.2 if positive: 503 for i in index_set: 504 if any(M[j,i] > zero for j in range(n)): 505 return I[i] 506 else: 507 263735 183335 0.7 7.9 for i in index_set: 508 263735 902137 3.4 38.8 if all(M[j,i] <= zero for j in range(n)): 509 155520 118092 0.8 5.1 return I[i] 510 return None
So if we really wanted to go a bit crazy with things, we should implement some kind of custom iterator that subclasses SearchForest
, as well as avoid converting to/from the index set and use {0,1,...,n}
. Although I would say the next step would be the cythonization of the matrix group element classes if we wanted to try and squeeze more out of it right now.
comment:19 Changed 3 years ago by
comment:20 Changed 3 years ago by
comment:21 Changed 3 years ago by
(Indeed, I just wanted to start some comparisons, but ran into the problem fixed in #19795. I now have to first rebuild Sage to 7.0.beta2
which will take a another day or two before I can start doing the testing...)
comment:22 Changed 3 years ago by
@chapoton I haven't looked too closely at the changes to #11010, but I do suspect it will improve things there.
@stumpc5 In case you didn't catch it, Jeoren had mentioned on sagerelease that you will need to run make distclean && make
for the version bump to 7.0.beta2
(which is why I haven't bumped yet from 7.0.beta1
).
@all At some point we should all sit down and port everything that we need from GAP3 to GAP4/sage/separatelibrary. Although that could likely be quite a project for any one particular package...
comment:23 Changed 3 years ago by
Jeoren had mentioned on sagerelease that you will need to run make distclean && make for the version bump to 7.0.beta2
Yep, realized that after compiling for a few hours.
At some point we should all sit down and port everything that we need from GAP3 to GAP4/sage/separatelibrary. Although that could likely be quite a project for any one particular package...
This seems like an infeasible project for anyone not having many months spare time doing nothing but this while sitting next to someone knowing about the needs of the particular packages. (In short: this seems like an infeasible project.) Nevertheless, it needs to be started at some point anyway...
comment:24 Changed 3 years ago by
 Commit changed from 8509534a8900501aca69724f27920c064dc7e618 to 2fdaef455cf27e35c207ef217353e974e25288ae
Branch pushed to git repo; I updated commit sha1. New commits:
2fdaef4  Forgot the doctests for first_descent.

comment:25 Changed 3 years ago by
 Dependencies set to #19870
comment:26 Changed 3 years ago by
 Commit changed from 2fdaef455cf27e35c207ef217353e974e25288ae to 2e9535ad09e6aa2489848f51f95452d2cbfd5884
comment:27 Changed 3 years ago by
I have now merged in #19870. I also made the generators to CoxeterGroup
be dense matrices for the following reason. The one()
element is dense because the matrix_space()
is dense. By making the generators dense, it means the generators are not converted to dense matrices upon multiplication, resulting in over a 2x speedup to iteration and sparse matrix multiplication is 3x slower in rank 6. Moreover, on average there are 11/18 nonzero entries in E_{6}, so IMO they are not really sparse matrices.
comment:28 Changed 3 years ago by
 Commit changed from 2e9535ad09e6aa2489848f51f95452d2cbfd5884 to 525c753db854563f95180b29c4e52262db7af0ea
Branch pushed to git repo; I updated commit sha1. New commits:
989b6ea  Adding changes from #19821.

6358d53  Cythonized matrix_gp/group_element.py and simplified the class structure.

65ce465  Fixing modform_hecketriangle due to changes.

1110926  Fixing last doctests.

3efe9b6  Merge branch 'develop' into public/groups/cythonize_matrix_group_element19870

29f7253  Special case for dense matrices over ZZ and making sure the inverse is in the base ring.

9582f28  Merge branch 'public/combinat/speedup_coxeter_weyl_matrix_groups19821' into public/groups/cythonize_matrix_group_element19870

525c753  Merge branch 'public/groups/cythonize_matrix_group_element19870' into public/combinat/speedup_coxeter_weyl_matrix_groups19821

comment:29 Changed 3 years ago by
 Commit changed from 525c753db854563f95180b29c4e52262db7af0ea to 9250e99268d90500411c21df4a6afed78865ccb3
Branch pushed to git repo; I updated commit sha1. New commits:
8cac13c  Merge branch 'public/combinat/speedup_coxeter_weyl_matrix_groups19821' of trac.sagemath.org:sage into public/combinat/speedup_coxeter_weyl_matrix_groups19821

5333b0f  Implementing reviewer comments.

274f662  Merge branch 'public/groups/cythonize_matrix_group_element19870' of trac.sagemath.org:sage into public/combinat/speedup_coxeter_weyl_matrix_groups19821

9250e99  Fixing doctest and adding is_one.

comment:30 Changed 3 years ago by
 Commit changed from 9250e99268d90500411c21df4a6afed78865ccb3 to bea6b3c96b42255a8f9459b344c625fe22c337d7
Branch pushed to git repo; I updated commit sha1. New commits:
bea6b3c  Even better is_one by avoiding unnecessary coercions for matrices.

comment:31 Changed 3 years ago by
I fixed a trivial failing doctest and implemented a custom is_one()
, which is much faster (~30 microseconds to a little over 1) with the changes to the matrices (in particular, avoiding the coercion calls).
comment:32 Changed 3 years ago by
 Commit changed from bea6b3c96b42255a8f9459b344c625fe22c337d7 to 0ce02a6f0a713edfb93db05e8b0f26b793c7932a
Branch pushed to git repo; I updated commit sha1. New commits:
f38620a  Check identity rather than equality from coset_representative.

df18b13  Merge branch 'public/groups/cythonize_matrix_group_element19870' of trac.sagemath.org:sage into public/groups/cythonize_matrix_group_element19870

0ce02a6  Merge branch 'public/groups/cythonize_matrix_group_element19870' into public/combinat/speedup_coxeter_weyl_matrix_groups19821

comment:33 Changed 3 years ago by
 Dependencies #19870 deleted
 Milestone changed from sage7.0 to sage7.2
comment:34 Changed 3 years ago by
 Reviewers set to Frédéric Chapoton
 Status changed from needs_review to positive_review
ok, looks good enough to me. Let it be
comment:35 Changed 3 years ago by
 Branch changed from public/combinat/speedup_coxeter_weyl_matrix_groups19821 to 0ce02a6f0a713edfb93db05e8b0f26b793c7932a
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
Speedup has_right_descent() for Coxeter groups.
Speedup quantum_bruhat_graph by doing some things locally.
Changed libs.gap.element.GapElement.matrix() to avoid matrix constructor.
Use a specialized version of GapElement.matrix() to avoid some overhead.
Don't store matrix() of Weyl group elements and some cleanup.
Added length cache to quantum_bruhat_graph().
Get a little more speed by using the matrices as keys for the length cache.