#27444 closed enhancement (fixed)
Expose multithreaded fflas-ffpack features
Reported by: | gh-ZHG2017 | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.9 |
Component: | packages: standard | Keywords: | |
Cc: | cpernet | Merged in: | |
Authors: | Hongguang Zhu, Clément Pernet | Reviewers: | Luca De Feo |
Report Upstream: | N/A | Work issues: | |
Branch: | e432cb0 (Commits, GitHub, GitLab) | Commit: | |
Dependencies: | #26932 | Stopgaps: |
Description
Dense linear algebra mod small p uses fflas-ffpack over float or double. This library support multithreading for some routines:
- matrix-matrix multiplication
- PLUQ decomposition
- etc.
This ticket proposes to expose these parallel variants to Sage user.
Change History (35)
comment:1 Changed 3 years ago by
- Branch set to u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features
comment:2 Changed 3 years ago by
- Commit set to e7f4be4220242e4c70faf180912d6a580a30624b
comment:3 Changed 3 years ago by
- Commit changed from e7f4be4220242e4c70faf180912d6a580a30624b to 22e592e5cdae3c12bf45886aaa6194928c9e7c87
Branch pushed to git repo; I updated commit sha1. New commits:
22e592e | fixing indents and templated parallel helper
|
comment:4 Changed 3 years ago by
- Commit changed from 22e592e5cdae3c12bf45886aaa6194928c9e7c87 to eae88f5f92c812f68381846dbeba5fbbf38cd8ad
Branch pushed to git repo; I updated commit sha1. New commits:
eae88f5 | fgemm aggregated into SageMaths with no more compile time error but not yet find out how to run it with more threads
|
comment:5 Changed 3 years ago by
- Commit changed from eae88f5f92c812f68381846dbeba5fbbf38cd8ad to 01de1e16a01f54a6bcfed76e6c6b39e588e954a5
Branch pushed to git repo; I updated commit sha1. New commits:
01de1e1 | Compile and link flags added for enable openmp but no multithreading found
|
comment:6 Changed 3 years ago by
- Commit changed from 01de1e16a01f54a6bcfed76e6c6b39e588e954a5 to 86c203840a152fac507804c138348cca550c451e
Branch pushed to git repo; I updated commit sha1. New commits:
fff3992 | remove MMHelperAlgo
|
d81e3c1 | Merge branch 'u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features' of git://trac.sagemath.org/sage into t/27444/expose_multithreaded_fflas_ffpack_features
|
86c2038 | parallel fgemm working by patching fflas-ffpack
|
comment:7 Changed 3 years ago by
- Commit changed from 86c203840a152fac507804c138348cca550c451e to 449b103c8dadc063933a40cae67ee0f9a379cb2f
Branch pushed to git repo; I updated commit sha1. New commits:
449b103 | fix const error
|
comment:8 Changed 3 years ago by
- Commit changed from 449b103c8dadc063933a40cae67ee0f9a379cb2f to 397c10854ad651aad31f8b47043c04d289c7bc1b
Branch pushed to git repo; I updated commit sha1. New commits:
397c108 | fix error
|
comment:9 Changed 3 years ago by
For the record: matrix multiplication over finite field works in parallel. The following experiment is on a 12 cores Intel skylake-x server:
pernet@retourdest:~/soft/sage$ export OMP_NUM_THREADS=12;./sage ┌────────────────────────────────────────────────────────────────────┐ │ SageMath version 8.7.beta7, Release Date: 2019-03-10 │ │ Using Python 2.7.15. Type "help()" for help. │ └────────────────────────────────────────────────────────────────────┘ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ Warning: this is a prerelease version, and it may be unstable. ┃ ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ sage: a=random_matrix(GF(11),10000) sage: time b=a*a CPU times: user 23.3 s, sys: 405 ms, total: 23.7 s Wall time: 3.3 s
while on the same machine, on master we get:
sage: a=random_matrix(GF(11),10000) sage: time b=a*a CPU times: user 18.4 s, sys: 211 ms, total: 18.6 s Wall time: 18.7 s
The speed-up of 6 for a 12 core machine is below what a self-standing fflas-ffpack achieves. We need to investigate the reasons: alignment, strassen-winograd threshold, numa placement, etc.
comment:10 Changed 3 years ago by
- Milestone changed from sage-8.7 to sage-8.8
Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)
comment:11 Changed 3 years ago by
- Commit changed from 397c10854ad651aad31f8b47043c04d289c7bc1b to 58a5309f4c3da394afa1dae95cac918d9d940262
Branch pushed to git repo; I updated commit sha1. New commits:
58a5309 | Using Sage Parallelism() to let user choose number of threads for fgemm
|
comment:12 Changed 3 years ago by
- Commit changed from 58a5309f4c3da394afa1dae95cac918d9d940262 to 421c853598e7f226ff2d172559fa3596d67f6ad8
Branch pushed to git repo; I updated commit sha1. New commits:
421c853 | Multithreaded fgemv using SageMath Parallelism() seems to be working
|
comment:13 Changed 3 years ago by
- Branch changed from u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features to u/cpernet/expose_multithreaded_fflas_ffpack_features
comment:14 Changed 3 years ago by
- Branch changed from u/cpernet/expose_multithreaded_fflas_ffpack_features to u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features
comment:15 Changed 3 years ago by
- Milestone sage-8.8 deleted
As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).
comment:16 Changed 3 years ago by
- Commit changed from 421c853598e7f226ff2d172559fa3596d67f6ad8 to 2d552b9bd261147757b02b13461e7325b5bf1a9a
Branch pushed to git repo; I updated commit sha1. New commits:
bd8b2a0 | fix 32 compilation failure for linbox-rank
|
ddacbf0 | upgrade to linbox-1.6.3 fflas-ffpack-2.4.1 and givaro-4.1.1
|
f2d1f29 | upgrade to fflas-ffpack-2.4.2 fixing testsuite errors with echlon forms
|
7848156 | upgrade do fflas-ffpack 2.4.3 fixing an error in charpoly autotune
|
c391252 | Merge branch 't/26932/upgrade_to_givaro_4_1_0_fflas_ffpack_2_4_0_linbox_1_6_0' into t/27444/expose_multithreaded_fflas_ffpack_features
|
2d552b9 | pRank pDet and pReducedRowEchelonForm added with multithreading which also have been tested and cleaned up for code review
|
comment:17 Changed 3 years ago by
I confirm that the branch successfullly parallizes det, rank and echelonize. However, in the current version of the branch, the matrix product is no longer parallelized. Could you revert the call to pfgemm?
comment:18 Changed 3 years ago by
- Commit changed from 2d552b9bd261147757b02b13461e7325b5bf1a9a to 6ca42e23280f1562253274f9e0df04b45092304e
Branch pushed to git repo; I updated commit sha1. New commits:
6ca42e2 | Added back pfgemm with the same parallelism like pDet
|
comment:19 Changed 3 years ago by
- Dependencies set to 26932
comment:20 Changed 3 years ago by
- Dependencies changed from 26932 to #26932
comment:21 Changed 3 years ago by
- Status changed from new to needs_review
comment:22 Changed 3 years ago by
- Branch changed from u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features to u/cpernet/expose_multithreaded_fflas_ffpack_features
comment:23 Changed 3 years ago by
- Commit changed from 6ca42e23280f1562253274f9e0df04b45092304e to 371d63cc721ad867b05e902e428879b2671f5e61
comment:24 follow-up: ↓ 27 Changed 3 years ago by
I'm curious about these:
if m*n > 100000: sig_on()
Any reason not to always call sig_on()
?
comment:25 Changed 3 years ago by
Docs need updating.
In parallelism.py
:
EXAMPLES:
The number of processes is initialized to 1 (no parallelization) for each field (only tensor computations are implemented at the moment):
Also, it would be good to document in the matrix reference the availability of parallelism, like it is done in http://doc.sagemath.org/html/en/reference/tensor_free_modules/sage/tensor/modules/free_module_tensor.html#sage.tensor.modules.free_module_tensor.FreeModuleTensor.disp
comment:26 Changed 3 years ago by
Otherwise, I confirm there is no regression with Python 3. If you can fix the docs, this ticket is good for me.
comment:27 in reply to: ↑ 24 Changed 3 years ago by
Replying to defeo:
I'm curious about these:
if m*n > 100000: sig_on()Any reason not to always call
sig_on()
?
I guess the reason is to avoid the overhead of using the sigs when the call is with a tiny instance which runs almost instantly and will therefore have no chance of being interrupted.
comment:28 Changed 3 years ago by
- Commit changed from 371d63cc721ad867b05e902e428879b2671f5e61 to e432cb0e542e6c2d7d5e5495611cb42240b89c17
Branch pushed to git repo; I updated commit sha1. New commits:
e432cb0 | fixing the doc
|
comment:29 Changed 3 years ago by
@defeo: Doc fixed in the above commit. Should be good to go now.
comment:30 Changed 3 years ago by
- Reviewers set to Luca De Feo
- Status changed from needs_review to positive_review
The ticket has no authors?
comment:31 Changed 3 years ago by
comment:33 Changed 3 years ago by
- Branch changed from u/cpernet/expose_multithreaded_fflas_ffpack_features to e432cb0e542e6c2d7d5e5495611cb42240b89c17
- Resolution set to fixed
- Status changed from positive_review to closed
comment:34 Changed 3 years ago by
- Commit e432cb0e542e6c2d7d5e5495611cb42240b89c17 deleted
I don't think this ticket should have removed --disable-openmp
from the fflas_ffpack build without mention, because now it requires OpenMP support in any modules which link to it, and this doesn't seem to work properly.
Branch pushed to git repo; I updated commit sha1. New commits:
Add ParSeqHelper to Sage