Opened 2 years ago

Closed 2 years ago

#24122 closed enhancement (fixed)

Use echelonize instead of echelon_form in a few places

Reported by: tscrim Owned by:
Priority: major Milestone: sage-8.1
Component: performance Keywords:
Cc: Merged in:
Authors: Travis Scrimshaw Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: cfbb776 (Commits) Commit: cfbb77680533b3b2badc5177e3a00cea74c7ffa5
Dependencies: Stopgaps:

Description (last modified by tscrim)

We do not need to create another copy of the augmented matrix to keep memory usage down (including putting it in a cache) and because it is faster.

We do this change in __invert__, right_kernel_matrix, and _solve_right_nonsingular_square.

We also can take advantage of the sparse structure and our knowledge of the augmented matrix [I|A-1] when reconstructing the matrix.

Change History (14)

comment:1 Changed 2 years ago by tscrim

  • Authors set to Travis Scrimshaw
  • Branch set to public/linear_algebra/echelonize_in_invert-24122
  • Commit set to d37d983d761414451b5d21ff12b732c5285d159c
  • Status changed from new to needs_review

Bad branch not based off develop. Fixed.

Last edited 2 years ago by tscrim (previous) (diff)

comment:2 Changed 2 years ago by git

  • Commit changed from d37d983d761414451b5d21ff12b732c5285d159c to 3990da44ec263e5b6be38529ba56d92f0b761aa2

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

3990da4Do not create another matrix in __invert__.

comment:3 Changed 2 years ago by git

  • Commit changed from 3990da44ec263e5b6be38529ba56d92f0b761aa2 to 294150fc0e058372cc83be396b188c961536e7a5

Branch pushed to git repo; I updated commit sha1. New commits:

294150fDo different things for sparse and dense matrices from the augmented matrix.

comment:4 Changed 2 years ago by tscrim

  • Component changed from linear algebra to performance
  • Description modified (diff)

This last change nets me ~8% speedup:

sage: %timeit ~(matrix({(200-i,i): 1/i for i in range(1,200,1)}, sparse=True) + 1)
10 loops, best of 3: 159 ms per loop
sage: %timeit ~(matrix({(100-i,i): 1/i for i in range(1,100,1)}, sparse=True) + 1)
10 loops, best of 3: 28.6 ms per loop
sage: %timeit ~(matrix({(20-i,i): 1/i for i in range(1,20,1)}, sparse=True) + 1)
The slowest run took 24.10 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.27 ms per loop

versus

sage: %timeit ~(matrix({(200-i,i): 1/i for i in range(1,200,1)}, sparse=True) + 1)
10 loops, best of 3: 167 ms per loop
sage: %timeit ~(matrix({(100-i,i): 1/i for i in range(1,100,1)}, sparse=True) + 1)
10 loops, best of 3: 30.7 ms per loop
sage: %timeit ~(matrix({(20-i,i): 1/i for i in range(1,20,1)}, sparse=True) + 1)
1000 loops, best of 3: 1.36 ms per loop

comment:5 Changed 2 years ago by tscrim

Also versus develop, which this test is not so good for showing how the first change would improve things as the matrices are relatively tiny:

sage: %timeit ~(matrix({(200-i,i): 1/i for i in range(1,200,1)}, sparse=True) + 1)
10 loops, best of 3: 171 ms per loop
sage: %timeit ~(matrix({(100-i,i): 1/i for i in range(1,100,1)}, sparse=True) + 1)
10 loops, best of 3: 30.2 ms per loop
sage: %timeit ~(matrix({(20-i,i): 1/i for i in range(1,20,1)}, sparse=True) + 1)
The slowest run took 11.65 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.3 ms per loop

comment:6 Changed 2 years ago by tscrim

  • Description modified (diff)
  • Summary changed from Use echelonize instead of echelon_form in __invert__ to Use echelonize instead of echelon_form in a few places

comment:7 Changed 2 years ago by git

  • Commit changed from 294150fc0e058372cc83be396b188c961536e7a5 to d6e5a091afd875c1e0c1df0919f1e5b6d3bc553f

Branch pushed to git repo; I updated commit sha1. New commits:

d6e5a09Use echelonize in two other places.

comment:8 Changed 2 years ago by git

  • Commit changed from d6e5a091afd875c1e0c1df0919f1e5b6d3bc553f to cfbb77680533b3b2badc5177e3a00cea74c7ffa5

Branch pushed to git repo; I updated commit sha1. New commits:

cfbb776Merge branch 'develop' into public/linear_algebra/echelonize_in_invert-24122

comment:9 Changed 2 years ago by vdelecroix

This is a bit weird

+        for i in range(nrows):
+            del data[i,i]
+        data = {(r,c-nrows): data[r,c] for (r,c) in data}

You are modifying data and then override it. If you can not do the modification inplace, the following looks simpler

data = {(r,c-nrows): data[r,c] for (r,c) in data if c >= nrows}

comment:10 Changed 2 years ago by tscrim

Yes, it looks more simple, but it actually takes longer because it has to do the if check on every nonzero object. Whereas the version I have just removes objects from the dict. I suspect this is more of a micro improvement, but that could be a lot of extra checks for a really big sparse matrix (at least, I have a 62001 x 62001 sparse matrix with 1756951 non-zero entries).

Ideally I would like to modify the data in place and just update the indices, but hash tables are not good for doing that. One of these days, someone (me) should implement another data structure for sparse matrices...

comment:11 Changed 2 years ago by vdelecroix

  • Reviewers set to Vincent Delecroix

Oh, I see.

Note that for matrices over ZZ or QQ, there is a custom C datatype which is an array of sparse vectors. This is not ideal but already faster than dictionaries in most contexts.

comment:12 Changed 2 years ago by vdelecroix

  • Status changed from needs_review to positive_review

comment:13 Changed 2 years ago by tscrim

Thanks for the review.

comment:14 Changed 2 years ago by vbraun

  • Branch changed from public/linear_algebra/echelonize_in_invert-24122 to cfbb77680533b3b2badc5177e3a00cea74c7ffa5
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.