Opened 3 years ago

Closed 3 years ago

Last modified 2 years ago

#27444 closed enhancement (fixed)

Expose multithreaded fflas-ffpack features

Reported by: gh-ZHG2017 Owned by:
Priority: major Milestone: sage-8.9
Component: packages: standard Keywords:
Cc: cpernet Merged in:
Authors: Hongguang Zhu, Clément Pernet Reviewers: Luca De Feo
Report Upstream: N/A Work issues:
Branch: e432cb0 (Commits, GitHub, GitLab) Commit:
Dependencies: #26932 Stopgaps:

Status badges

Description

Dense linear algebra mod small p uses fflas-ffpack over float or double. This library support multithreading for some routines:

  • matrix-matrix multiplication
  • PLUQ decomposition
  • etc.

This ticket proposes to expose these parallel variants to Sage user.

Change History (35)

comment:1 Changed 3 years ago by gh-ZHG2017

  • Branch set to u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features

comment:2 Changed 3 years ago by git

  • Commit set to e7f4be4220242e4c70faf180912d6a580a30624b

Branch pushed to git repo; I updated commit sha1. New commits:

e7f4be4Add ParSeqHelper to Sage

comment:3 Changed 3 years ago by git

  • Commit changed from e7f4be4220242e4c70faf180912d6a580a30624b to 22e592e5cdae3c12bf45886aaa6194928c9e7c87

Branch pushed to git repo; I updated commit sha1. New commits:

22e592efixing indents and templated parallel helper

comment:4 Changed 3 years ago by git

  • Commit changed from 22e592e5cdae3c12bf45886aaa6194928c9e7c87 to eae88f5f92c812f68381846dbeba5fbbf38cd8ad

Branch pushed to git repo; I updated commit sha1. New commits:

eae88f5fgemm aggregated into SageMaths with no more compile time error but not yet find out how to run it with more threads

comment:5 Changed 3 years ago by git

  • Commit changed from eae88f5f92c812f68381846dbeba5fbbf38cd8ad to 01de1e16a01f54a6bcfed76e6c6b39e588e954a5

Branch pushed to git repo; I updated commit sha1. New commits:

01de1e1Compile and link flags added for enable openmp but no multithreading found

comment:6 Changed 3 years ago by git

  • Commit changed from 01de1e16a01f54a6bcfed76e6c6b39e588e954a5 to 86c203840a152fac507804c138348cca550c451e

Branch pushed to git repo; I updated commit sha1. New commits:

fff3992remove MMHelperAlgo
d81e3c1Merge branch 'u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features' of git://trac.sagemath.org/sage into t/27444/expose_multithreaded_fflas_ffpack_features
86c2038parallel fgemm working by patching fflas-ffpack

comment:7 Changed 3 years ago by git

  • Commit changed from 86c203840a152fac507804c138348cca550c451e to 449b103c8dadc063933a40cae67ee0f9a379cb2f

Branch pushed to git repo; I updated commit sha1. New commits:

449b103fix const error

comment:8 Changed 3 years ago by git

  • Commit changed from 449b103c8dadc063933a40cae67ee0f9a379cb2f to 397c10854ad651aad31f8b47043c04d289c7bc1b

Branch pushed to git repo; I updated commit sha1. New commits:

397c108fix error

comment:9 Changed 3 years ago by cpernet

For the record: matrix multiplication over finite field works in parallel. The following experiment is on a 12 cores Intel skylake-x server:

pernet@retourdest:~/soft/sage$ export OMP_NUM_THREADS=12;./sage 
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 8.7.beta7, Release Date: 2019-03-10               │
│ Using Python 2.7.15. Type "help()" for help.                       │
└────────────────────────────────────────────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Warning: this is a prerelease version, and it may be unstable.     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
sage: a=random_matrix(GF(11),10000)
sage: time b=a*a
CPU times: user 23.3 s, sys: 405 ms, total: 23.7 s
Wall time: 3.3 s

while on the same machine, on master we get:

sage: a=random_matrix(GF(11),10000)
sage: time b=a*a
CPU times: user 18.4 s, sys: 211 ms, total: 18.6 s
Wall time: 18.7 s

The speed-up of 6 for a 12 core machine is below what a self-standing fflas-ffpack achieves. We need to investigate the reasons: alignment, strassen-winograd threshold, numa placement, etc.

comment:10 Changed 3 years ago by embray

  • Milestone changed from sage-8.7 to sage-8.8

Ticket retargeted after milestone closed (if you don't believe this ticket is appropriate for the Sage 8.8 release please retarget manually)

comment:11 Changed 3 years ago by git

  • Commit changed from 397c10854ad651aad31f8b47043c04d289c7bc1b to 58a5309f4c3da394afa1dae95cac918d9d940262

Branch pushed to git repo; I updated commit sha1. New commits:

58a5309Using Sage Parallelism() to let user choose number of threads for fgemm

comment:12 Changed 3 years ago by git

  • Commit changed from 58a5309f4c3da394afa1dae95cac918d9d940262 to 421c853598e7f226ff2d172559fa3596d67f6ad8

Branch pushed to git repo; I updated commit sha1. New commits:

421c853Multithreaded fgemv using SageMath Parallelism() seems to be working

comment:13 Changed 3 years ago by cpernet

  • Branch changed from u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features to u/cpernet/expose_multithreaded_fflas_ffpack_features

comment:14 Changed 3 years ago by gh-ZHG2017

  • Branch changed from u/cpernet/expose_multithreaded_fflas_ffpack_features to u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features

comment:15 Changed 3 years ago by embray

  • Milestone sage-8.8 deleted

As the Sage-8.8 release milestone is pending, we should delete the sage-8.8 milestone for tickets that are not actively being worked on or that still require significant work to move forward. If you feel that this ticket should be included in the next Sage release at the soonest please set its milestone to the next release milestone (sage-8.9).

comment:16 Changed 3 years ago by git

  • Commit changed from 421c853598e7f226ff2d172559fa3596d67f6ad8 to 2d552b9bd261147757b02b13461e7325b5bf1a9a

Branch pushed to git repo; I updated commit sha1. New commits:

bd8b2a0fix 32 compilation failure for linbox-rank
ddacbf0upgrade to linbox-1.6.3 fflas-ffpack-2.4.1 and givaro-4.1.1
f2d1f29upgrade to fflas-ffpack-2.4.2 fixing testsuite errors with echlon forms
7848156upgrade do fflas-ffpack 2.4.3 fixing an error in charpoly autotune
c391252Merge branch 't/26932/upgrade_to_givaro_4_1_0_fflas_ffpack_2_4_0_linbox_1_6_0' into t/27444/expose_multithreaded_fflas_ffpack_features
2d552b9pRank pDet and pReducedRowEchelonForm added with multithreading which also have been tested and cleaned up for code review

comment:17 Changed 3 years ago by cpernet

I confirm that the branch successfullly parallizes det, rank and echelonize. However, in the current version of the branch, the matrix product is no longer parallelized. Could you revert the call to pfgemm?

comment:18 Changed 3 years ago by git

  • Commit changed from 2d552b9bd261147757b02b13461e7325b5bf1a9a to 6ca42e23280f1562253274f9e0df04b45092304e

Branch pushed to git repo; I updated commit sha1. New commits:

6ca42e2Added back pfgemm with the same parallelism like pDet

comment:19 Changed 3 years ago by cpernet

  • Dependencies set to 26932

comment:20 Changed 3 years ago by cpernet

  • Dependencies changed from 26932 to #26932

comment:21 Changed 3 years ago by cpernet

  • Status changed from new to needs_review

comment:22 Changed 3 years ago by cpernet

  • Branch changed from u/gh-ZHG2017/expose_multithreaded_fflas_ffpack_features to u/cpernet/expose_multithreaded_fflas_ffpack_features

comment:23 Changed 3 years ago by cpernet

  • Commit changed from 6ca42e23280f1562253274f9e0df04b45092304e to 371d63cc721ad867b05e902e428879b2671f5e61

I pushed a fix to a misplaced "noefd" algorithm specification. This seems to fix the broken doctests.


New commits:

b386197Merge branch 'develop' into t/27444/expose_multithreaded_fflas_ffpack_features
371d63cfix noefd default algorithm

comment:24 follow-up: Changed 3 years ago by defeo

I'm curious about these:

if m*n > 100000: sig_on()

Any reason not to always call sig_on()?

comment:25 Changed 3 years ago by defeo

Docs need updating.

In parallelism.py:

EXAMPLES:

The number of processes is initialized to 1 (no parallelization) for each field (only tensor computations are implemented at the moment):

Also, it would be good to document in the matrix reference the availability of parallelism, like it is done in http://doc.sagemath.org/html/en/reference/tensor_free_modules/sage/tensor/modules/free_module_tensor.html#sage.tensor.modules.free_module_tensor.FreeModuleTensor.disp

comment:26 Changed 3 years ago by defeo

Otherwise, I confirm there is no regression with Python 3. If you can fix the docs, this ticket is good for me.

comment:27 in reply to: ↑ 24 Changed 3 years ago by cpernet

Replying to defeo:

I'm curious about these:

if m*n > 100000: sig_on()

Any reason not to always call sig_on()?

I guess the reason is to avoid the overhead of using the sigs when the call is with a tiny instance which runs almost instantly and will therefore have no chance of being interrupted.

comment:28 Changed 3 years ago by git

  • Commit changed from 371d63cc721ad867b05e902e428879b2671f5e61 to e432cb0e542e6c2d7d5e5495611cb42240b89c17

Branch pushed to git repo; I updated commit sha1. New commits:

e432cb0fixing the doc

comment:29 Changed 3 years ago by cpernet

@defeo: Doc fixed in the above commit. Should be good to go now.

comment:30 Changed 3 years ago by defeo

  • Reviewers set to Luca De Feo
  • Status changed from needs_review to positive_review

The ticket has no authors?

comment:31 Changed 3 years ago by cpernet

  • Authors set to Hongguang Zhu, Clement Pernet

comment:32 Changed 3 years ago by chapoton

  • Milestone set to sage-8.9

and no milestone ?

comment:33 Changed 3 years ago by vbraun

  • Branch changed from u/cpernet/expose_multithreaded_fflas_ffpack_features to e432cb0e542e6c2d7d5e5495611cb42240b89c17
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:34 Changed 3 years ago by embray

  • Commit e432cb0e542e6c2d7d5e5495611cb42240b89c17 deleted

I don't think this ticket should have removed --disable-openmp from the fflas_ffpack build without mention, because now it requires OpenMP support in any modules which link to it, and this doesn't seem to work properly.

comment:35 Changed 2 years ago by mkoeppe

  • Authors changed from Hongguang Zhu, Clement Pernet to Hongguang Zhu, Clément Pernet
Note: See TracTickets for help on using tickets.