Ticket #13170 (closed enhancement: fixed)
Speedup the default nonzero test for matrices
| Reported by: | SimonKing | Owned by: | tbd |
|---|---|---|---|
| Priority: | major | Milestone: | sage-5.2 |
| Component: | performance | Keywords: | |
| Cc: | malb | Work issues: | |
| Report Upstream: | N/A | Reviewers: | Javier López Peña |
| Authors: | Simon King | Merged in: | sage-5.2.beta1 |
| Dependencies: | Stopgaps: |
Description
The default __nonzero__ method for matrices as defined in sage/matrix/matrix0.pyx is
def __nonzero__(self): z = self._base_ring(0) cdef Py_ssize_t i, j for i from 0 <= i < self._nrows: for j from 0 <= j < self._ncols: if self.get_unsafe(i,j) != z: return True return False
This default implementation is actually used, over some fields. But it turns out that the default could be considerably improved. Let's define three alternative methods, and compare with the default:
from sage.matrix.matrix0 cimport Matrix from sage.structure.element cimport Element def my_first_bool(Matrix M): cdef Py_ssize_t i, j for i from 0 <= i < M._nrows: for j from 0 <= j < M._ncols: if M.get_unsafe(i,j): return True return False def my_second_bool(Matrix M): cdef Py_ssize_t i, j # avoid some overhead: zero_element may be faster than ...(0) cdef Element z = M._base_ring.zero_element() for i from 0 <= i < M._nrows: for j from 0 <= j < M._ncols: if M.get_unsafe(i,j)!=z: return True return False def my_third_bool(Matrix M): cdef Py_ssize_t i, j cdef Element z = M._base_ring.zero_element() for i from 0 <= i < M._nrows: for j from 0 <= j < M._ncols: # Try to skip coercion if z._cmp_(M.get_unsafe(i,j)) is not 0: return True return False def default_bool(Matrix M): cdef Py_ssize_t i, j z = M._base_ring(0) for i from 0 <= i < M._nrows: for j from 0 <= j < M._ncols: if M.get_unsafe(i,j)!=z: return True return False
GF(2)
A 5- to 10-fold speed-up is possible:
sage: K = GF(2) sage: M = random_matrix(K,1000,1000) sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 625 loops, best of 3: 373 ns per loop sage: %timeit my_second_bool(M) 625 loops, best of 3: 653 ns per loop sage: %timeit my_third_bool(M) 625 loops, best of 3: 2.74 µs per loop sage: %timeit default_bool(M) 625 loops, best of 3: 4.33 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 4.94 µs per loop sage: M*=0 sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 25 loops, best of 3: 16.7 ms per loop sage: %timeit my_second_bool(M) 5 loops, best of 3: 73.2 ms per loop sage: %timeit my_third_bool(M) 5 loops, best of 3: 365 ms per loop sage: %timeit default_bool(M) 5 loops, best of 3: 75.9 ms per loop sage: %timeit bool(M) 5 loops, best of 3: 73.2 ms per loop
GF(4)
A 10- to 20-fold speed-up is possible:
sage: K = GF(4,'z') sage: M = random_matrix(K,1000,1000) sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 625 loops, best of 3: 499 ns per loop sage: %timeit my_second_bool(M) 625 loops, best of 3: 1.51 µs per loop sage: %timeit my_third_bool(M) 625 loops, best of 3: 2.36 µs per loop sage: %timeit default_bool(M) 625 loops, best of 3: 9.59 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 10.3 µs per loop sage: M*=0 sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 25 loops, best of 3: 30.1 ms per loop sage: %timeit my_second_bool(M) 5 loops, best of 3: 466 ms per loop sage: %timeit my_third_bool(M) 5 loops, best of 3: 845 ms per loop sage: %timeit default_bool(M) 5 loops, best of 3: 482 ms per loop sage: %timeit bool(M) 5 loops, best of 3: 473 ms per loop
GF(3)
Sage already uses a specialised implementation of __nonzero__:
sage: K = GF(3) sage: M = random_matrix(K,1000,1000) sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit default_bool(M) 625 loops, best of 3: 33.5 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 566 ns per loop sage: M*=0 sage: %timeit default_bool(M) # Interrupted after a few MINUTES!! sage: %timeit bool(M) 125 loops, best of 3: 4.23 ms per loop
GF(25)
A 20- to 40-fold speed-up is possible:
sage: K = GF(25,'z') sage: M = random_matrix(K,1000,1000) sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 625 loops, best of 3: 408 ns per loop sage: %timeit my_second_bool(M) 625 loops, best of 3: 2.44 µs per loop sage: %timeit my_third_bool(M) 625 loops, best of 3: 3.47 µs per loop sage: %timeit default_bool(M) 625 loops, best of 3: 20.6 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 22.1 µs per loop sage: M*=0 sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 25 loops, best of 3: 16.8 ms per loop sage: %timeit my_second_bool(M) 5 loops, best of 3: 456 ms per loop sage: %timeit my_third_bool(M) 5 loops, best of 3: 819 ms per loop sage: %timeit default_bool(M) 5 loops, best of 3: 446 ms per loop sage: %timeit bool(M) 5 loops, best of 3: 454 ms per loop
Polynomial rings
Over more exotic base rings, a 6- to 40-fold speed-up is possible
sage: K.<x,y> = QQ[] sage: M = random_matrix(K,100,100) sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 625 loops, best of 3: 374 ns per loop sage: %timeit my_second_bool(M) 625 loops, best of 3: 1.03 µs per loop sage: %timeit my_third_bool(M) 625 loops, best of 3: 758 ns per loop sage: %timeit default_bool(M) 625 loops, best of 3: 14.8 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 15.4 µs per loop sage: M*=0 sage: my_first_bool(M) == my_second_bool(M) == my_third_bool(M) == default_bool(M) == bool(M) True sage: %timeit my_first_bool(M) 625 loops, best of 3: 176 µs per loop sage: %timeit my_second_bool(M) 625 loops, best of 3: 764 µs per loop sage: %timeit my_third_bool(M) 125 loops, best of 3: 3.56 ms per loop sage: %timeit default_bool(M) 625 loops, best of 3: 825 µs per loop sage: %timeit bool(M) 625 loops, best of 3: 848 µs per loop
Conclusion
Of course, a custom __nonzero__ method using special C-functions of the matrix backend (as in the case of GF(3)) would be best. However, the above timings suggest to change the default implementation of __nonzero__ as I do in my patch.
Attachments
Change History
comment:1 Changed 11 months ago by SimonKing
- Status changed from new to needs_review
With the patch, the timings become
sage: K = GF(2) sage: M = random_matrix(K,1000,1000) sage: %timeit bool(M) 625 loops, best of 3: 547 ns per loop sage: M*=0 sage: %timeit bool(M) 25 loops, best of 3: 38 ms per loop sage: K = GF(3) sage: M = random_matrix(K,1000,1000) sage: %timeit bool(M) 625 loops, best of 3: 464 ns per loop sage: M*=0 sage: %timeit bool(M) 125 loops, best of 3: 3.61 ms per loop sage: K = GF(4,'z') sage: M = random_matrix(K,1000,1000) sage: %timeit bool(M) 625 loops, best of 3: 665 ns per loop sage: M*=0 sage: %timeit bool(M) 5 loops, best of 3: 67.9 ms per loop sage: K = GF(25,'z') sage: M = random_matrix(K,1000,1000) sage: %timeit bool(M) 625 loops, best of 3: 504 ns per loop sage: M*=0 sage: %timeit bool(M) 25 loops, best of 3: 38.6 ms per loop sage: K.<x,y> = QQ[] sage: M = random_matrix(K,100,100) sage: %timeit bool(M) 625 loops, best of 3: 500 ns per loop sage: M*=0 sage: %timeit bool(M) 625 loops, best of 3: 398 µs per loop
The speed-up is clear, though not as good as expected. Anyway: Needs review!
comment:2 Changed 11 months ago by jlopez
- Status changed from needs_review to positive_review
- Reviewers set to Javier López Peña
I made my own tests running the following code:
for i in [2, 3, 4, 5, 7, 9, 11, 16]: K = GF(i, "z") for j in [100,200,300,500,1000]: M = random_matrix(K, j, j) bool(M) M *= 0 bool(M)
without the patch:
5 loops, best of 3: 3.68 s per loop
with the patch:
5 loops, best of 3: 2.29 s per loop
So there is some speedup. Code looks harmless and tests pass. Positive review!

