Opened 4 years ago

Closed 3 years ago

Last modified 2 years ago

#27444 closed enhancement (fixed)

Expose multithreaded fflas-ffpack features

Reported by: gh-ZHG2017 Owned by:
Priority: major Milestone: sage-8.9
Component: packages: standard Keywords:
Cc: Clément Pernet Merged in:
Authors: Hongguang Zhu, Clément Pernet Reviewers: Luca De Feo
Report Upstream: N/A Work issues:
Branch: e432cb0 (Commits, GitHub, GitLab) Commit:
Dependencies: #26932 Stopgaps:

Status badges

Description

Dense linear algebra mod small p uses fflas-ffpack over float or double. This library support multithreading for some routines:

  • matrix-matrix multiplication
  • PLUQ decomposition
  • etc.

This ticket proposes to expose these parallel variants to Sage user.

Change History (35)

comment:1 Changed 4 years ago by gh-ZHG2017

Branch: u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features

comment:2 Changed 4 years ago by git

Commit: e7f4be4220242e4c70faf180912d6a580a30624b

Branch pushed to git repo; I updated commit sha1. New commits:

e7f4be4Add ParSeqHelper to Sage

comment:3 Changed 4 years ago by git

Commit: e7f4be4220242e4c70faf180912d6a580a30624b22e592e5cdae3c12bf45886aaa6194928c9e7c87

Branch pushed to git repo; I updated commit sha1. New commits:

22e592efixing indents and templated parallel helper

comment:4 Changed 4 years ago by git

Commit: 22e592e5cdae3c12bf45886aaa6194928c9e7c87eae88f5f92c812f68381846dbeba5fbbf38cd8ad

Branch pushed to git repo; I updated commit sha1. New commits:

eae88f5fgemm aggregated into SageMaths with no more compile time error but not yet find out how to run it with more threads

comment:5 Changed 4 years ago by git

Commit: eae88f5f92c812f68381846dbeba5fbbf38cd8ad01de1e16a01f54a6bcfed76e6c6b39e588e954a5

Branch pushed to git repo; I updated commit sha1. New commits:

01de1e1Compile and link flags added for enable openmp but no multithreading found

comment:6 Changed 4 years ago by git

Commit: 01de1e16a01f54a6bcfed76e6c6b39e588e954a586c203840a152fac507804c138348cca550c451e

Branch pushed to git repo; I updated commit sha1. New commits:

fff3992remove MMHelperAlgo
d81e3c1Merge branch 'u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features' of git://trac.sagemath.org/sage into t/27444/expose_multithreaded_fflas_ffpack_features
86c2038parallel fgemm working by patching fflas-ffpack

comment:7 Changed 4 years ago by git

Commit: 86c203840a152fac507804c138348cca550c451e449b103c8dadc063933a40cae67ee0f9a379cb2f

Branch pushed to git repo; I updated commit sha1. New commits:

449b103fix const error

comment:8 Changed 4 years ago by git

Commit: 449b103c8dadc063933a40cae67ee0f9a379cb2f397c10854ad651aad31f8b47043c04d289c7bc1b

Branch pushed to git repo; I updated commit sha1. New commits:

397c108fix error

comment:9 Changed 4 years ago by Clément Pernet

For the record: matrix multiplication over finite field works in parallel. The following experiment is on a 12 cores Intel skylake-x server:

pernet@retourdest:~/soft/sage$ export OMP_NUM_THREADS=12;./sage 
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 8.7.beta7, Release Date: 2019-03-10               │
│ Using Python 2.7.15. Type "help()" for help.                       │
└────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Warning: this is a prerelease version, and it may be unstable.     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
sage: a=random_matrix(GF(11),10000)
sage: time b=a*a
CPU times: user 23.3 s, sys: 405 ms, total: 23.7 s
Wall time: 3.3 s

while on the same machine, on master we get:

sage: a=random_matrix(GF(11),10000)
sage: time b=a*a
CPU times: user 18.4 s, sys: 211 ms, total: 18.6 s
Wall time: 18.7 s

The speed-up of 6 for a 12 core machine is below what a self-standing fflas-ffpack achieves. We need to investigate the reasons: alignment, strassen-winograd threshold, numa placement, etc.

comment:10 Changed 4 years ago by Erik Bray

Milestone: sage-8.7sage-8.8

Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)

comment:11 Changed 4 years ago by git

Commit: 397c10854ad651aad31f8b47043c04d289c7bc1b58a5309f4c3da394afa1dae95cac918d9d940262

Branch pushed to git repo; I updated commit sha1. New commits:

58a5309Using Sage Parallelism() to let user choose number of threads for fgemm

comment:12 Changed 4 years ago by git

Commit: 58a5309f4c3da394afa1dae95cac918d9d940262421c853598e7f226ff2d172559fa3596d67f6ad8

Branch pushed to git repo; I updated commit sha1. New commits:

421c853Multithreaded fgemv using SageMath Parallelism() seems to be working

comment:13 Changed 4 years ago by Clément Pernet

Branch: u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_featuresu/cpernet/expose_multithreaded_fflas_ffpack_features

comment:14 Changed 4 years ago by gh-ZHG2017

Branch: u/cpernet/expose_multithreaded_fflas_ffpack_featuresu/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features

comment:15 Changed 3 years ago by Erik Bray

Milestone: sage-8.8

As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).

comment:16 Changed 3 years ago by git

Commit: 421c853598e7f226ff2d172559fa3596d67f6ad82d552b9bd261147757b02b13461e7325b5bf1a9a

Branch pushed to git repo; I updated commit sha1. New commits:

bd8b2a0fix 32 compilation failure for linbox-rank
ddacbf0upgrade to linbox-1.6.3 fflas-ffpack-2.4.1 and givaro-4.1.1
f2d1f29upgrade to fflas-ffpack-2.4.2 fixing testsuite errors with echlon forms
7848156upgrade do fflas-ffpack 2.4.3 fixing an error in charpoly autotune
c391252Merge branch 't/26932/upgrade_to_givaro_4_1_0_fflas_ffpack_2_4_0_linbox_1_6_0' into t/27444/expose_multithreaded_fflas_ffpack_features
2d552b9pRank pDet and pReducedRowEchelonForm added with multithreading which also have been tested and cleaned up for code review

comment:17 Changed 3 years ago by Clément Pernet

I confirm that the branch successfullly parallizes det, rank and echelonize. However, in the current version of the branch, the matrix product is no longer parallelized. Could you revert the call to pfgemm?

comment:18 Changed 3 years ago by git

Commit: 2d552b9bd261147757b02b13461e7325b5bf1a9a6ca42e23280f1562253274f9e0df04b45092304e

Branch pushed to git repo; I updated commit sha1. New commits:

6ca42e2Added back pfgemm with the same parallelism like pDet

comment:19 Changed 3 years ago by Clément Pernet

Dependencies: 26932

comment:20 Changed 3 years ago by Clément Pernet

Dependencies: 26932#26932

comment:21 Changed 3 years ago by Clément Pernet

Status: newneeds_review

comment:22 Changed 3 years ago by Clément Pernet

Branch: u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_featuresu/cpernet/expose_multithreaded_fflas_ffpack_features

comment:23 Changed 3 years ago by Clément Pernet

Commit: 6ca42e23280f1562253274f9e0df04b45092304e371d63cc721ad867b05e902e428879b2671f5e61

I pushed a fix to a misplaced "noefd" algorithm specification. This seems to fix the broken doctests.


New commits:

b386197Merge branch 'develop' into t/27444/expose_multithreaded_fflas_ffpack_features
371d63cfix noefd default algorithm

comment:24 Changed 3 years ago by Luca De Feo

I'm curious about these:

if m*n > 100000: sig_on()

Any reason not to always call sig_on()?

comment:25 Changed 3 years ago by Luca De Feo

Docs need updating.

In parallelism.py:

EXAMPLES:

The number of processes is initialized to 1 (no parallelization) for each field (only tensor computations are implemented at the moment):

Also, it would be good to document in the matrix reference the availability of parallelism, like it is done in http://doc.sagemath.org/html/en/reference/tensor_free_modules/sage/tensor/modules/free_module_tensor.html#sage.tensor.modules.free_module_tensor.FreeModuleTensor.disp

comment:26 Changed 3 years ago by Luca De Feo

Otherwise, I confirm there is no regression with Python 3. If you can fix the docs, this ticket is good for me.

comment:27 in reply to:  24 Changed 3 years ago by Clément Pernet

Replying to defeo:

I'm curious about these:

if m*n > 100000: sig_on()

Any reason not to always call sig_on()?

I guess the reason is to avoid the overhead of using the sigs when the call is with a tiny instance which runs almost instantly and will therefore have no chance of being interrupted.

comment:28 Changed 3 years ago by git

Commit: 371d63cc721ad867b05e902e428879b2671f5e61e432cb0e542e6c2d7d5e5495611cb42240b89c17

Branch pushed to git repo; I updated commit sha1. New commits:

e432cb0fixing the doc

comment:29 Changed 3 years ago by Clément Pernet

@defeo: Doc fixed in the above commit. Should be good to go now.

comment:30 Changed 3 years ago by Luca De Feo

Reviewers: Luca De Feo
Status: needs_reviewpositive_review

The ticket has no authors?

comment:31 Changed 3 years ago by Clément Pernet

Authors: Hongguang Zhu, Clement Pernet

comment:32 Changed 3 years ago by Frédéric Chapoton

Milestone: sage-8.9

and no milestone ?

comment:33 Changed 3 years ago by Volker Braun

Branch: u/cpernet/expose_multithreaded_fflas_ffpack_featurese432cb0e542e6c2d7d5e5495611cb42240b89c17
Resolution: fixed
Status: positive_reviewclosed

comment:34 Changed 3 years ago by Erik Bray

Commit: e432cb0e542e6c2d7d5e5495611cb42240b89c17

I don't think this ticket should have removed --disable-openmp from the fflas_ffpack build without mention, because now it requires OpenMP support in any modules which link to it, and this doesn't seem to work properly.

comment:35 Changed 2 years ago by Matthias Köppe

Authors: Hongguang Zhu, Clement PernetHongguang Zhu, Clément Pernet
Note: See TracTickets for help on using tickets.