Opened 2 years ago

Closed 2 years ago

#29853 closed enhancement (fixed)

Do not go through Python for gf2e dense matrices in word_to_poly

Reported by: Travis Scrimshaw Owned by:
Priority: major Milestone: sage-9.3
Component: performance Keywords: gf2e, matrix
Cc: Martin Albrecht, David Roe, John Palmieri, Xavier Caruso Merged in:
Authors: Travis Scrimshaw Reviewers: Frédéric Chapoton
Report Upstream: N/A Work issues:
Branch: afbd42c (Commits, GitHub, GitLab) Commit: afbd42c8ba0a89ac067b21d1f44685e5fe70990b
Dependencies: Stopgaps:

Status badges

Description

This is called in the very low level get_unsafe, and there is no reason to go through the (Python) parent to get what is something ultimately done by the backend.

Change History (11)

comment:1 Changed 2 years ago by Matthias Köppe

Milestone: sage-9.2sage-9.3

comment:2 Changed 2 years ago by Travis Scrimshaw

Authors: Travis Scrimshaw
Branch: public/performance/gf2e_matrix_direct_get_unsafe-29853
Cc: Martin Albrecht David Roe John Palmieri Xavier Caruso added
Commit: 52b90ff6b2acef6dee806025a07976075dd7a572
Keywords: gf2e matrix added
Status: newneeds_review

With the branch:

sage: K.<a> = GF(2^4)
sage: A = random_matrix(K,100,100)
sage: %timeit [A[i,j] for i in range(100) for j in range(100)]
3.82 ms ± 24.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
sage: K.<a> = GF(2^13)
sage: A = random_matrix(K,100,100)
sage: %timeit [A[i,j] for i in range(100) for j in range(100)]
4.94 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

before:

5.35 ms ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^4
6.59 ms ± 27 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^13

Thus we are seeing a speedup of ~25% via __getitem__. Isolating it to a direct get_unsafe call:

sage: %%cython 
....: from sage.matrix.matrix cimport Matrix 
....: def test_unsafe(A, m, n): 
....:      return [(<Matrix> A).get_unsafe(i,j) for i in range(m) for j in range(n)] 

With branch:

sage: %timeit test_unsafe(A, 100, 100)
2.23 ms ± 52.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^4
3.09 ms ± 12.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^13

vs before:

3.4 ms ± 4.18 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^4
4.61 ms ± 46.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^13

Without the extra overhead, we see a speedup of ~35%.


New commits:

52b90ffMake an ABC for the caches to call fetch_int directly from gf2e matrices.

comment:3 Changed 2 years ago by Frédéric Chapoton

Status: needs_reviewneeds_work

triggers many failing doctests, see patchbots

comment:4 Changed 2 years ago by git

Commit: 52b90ff6b2acef6dee806025a07976075dd7a57274d0c3f357b060445e7071bff6e1ad37e3819b6e

Branch pushed to git repo; I updated commit sha1. New commits:

e302294Fixing doctest failures for large ints.
74d0c3fAdditional speedups for NTL by caching the degree and order in the cache.

comment:5 Changed 2 years ago by git

Commit: 74d0c3f357b060445e7071bff6e1ad37e3819b6eafbd42c8ba0a89ac067b21d1f44685e5fe70990b

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

afbd42cAdditional speedups for NTL by caching the degree and order in the cache.

comment:6 Changed 2 years ago by Travis Scrimshaw

This should fix the doctest failres. I also found some places that I could make some additional optimizations.

With the updated branch, I get additional major speed improvements:

673 µs ± 2.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)  # 2^4
1.33 ms ± 8.11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)  # 2^13

So it is roughly a 3x speedup versus the previous commit.

Running with NTL:

4.42 ms ± 11.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^4
4.94 ms ± 34.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^13

versus vanilla with NTL:

8.28 ms ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^4
9.03 ms ± 103 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # 2^13

comment:7 Changed 2 years ago by Frédéric Chapoton

needs review ? looks good and the bot is green

comment:8 Changed 2 years ago by Travis Scrimshaw

Status: needs_workneeds_review

Ah, yes, sorry. I forgot to set it back to needs review.

comment:9 Changed 2 years ago by Frédéric Chapoton

Reviewers: Frédéric Chapoton
Status: needs_reviewpositive_review

ok, let it be. Nice speedup, thanks for your work.

comment:10 Changed 2 years ago by Travis Scrimshaw

Thank you for the review.

comment:11 Changed 2 years ago by Volker Braun

Branch: public/performance/gf2e_matrix_direct_get_unsafe-29853afbd42c8ba0a89ac067b21d1f44685e5fe70990b
Resolution: fixed
Status: positive_reviewclosed
Note: See TracTickets for help on using tickets.