Opened 9 years ago
Closed 9 years ago
#10752 closed defect (fixed)
Matrix pivots are cached, should be immutable
Reported by: | rbeezer | Owned by: | jason, was |
---|---|---|---|
Priority: | minor | Milestone: | sage-4.7 |
Component: | linear algebra | Keywords: | matrix pivots |
Cc: | Merged in: | sage-4.7.alpha4 | |
Authors: | John Palmieri | Reviewers: | Rob Beezer |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Oops:
sage: A=matrix(ZZ, 2, range(4)) sage: p = A.pivots(); p [0, 1] sage: p.append(652) sage: A.pivots() [0, 1, 652]
I'd imagine returning a tuple rather than a list would be the best approach. Though how much chaos will ensue is not clear.
Attachments (1)
Change History (12)
comment:1 Changed 9 years ago by
- Keywords matrix pivots added
- Priority changed from major to minor
- Status changed from new to needs_review
comment:2 Changed 9 years ago by
Here is a new version of the patch. I think this deals with every instance where something involving the string "pivot" is fetched from the cache.
There are two other ways to proceed, if we don't like this approach:
- cache the pivots as tuples instead of lists, or
- add a keyword to the "pivots" method just like there currently is for "nonzero_positions". That is, model the definition after this:
def nonzero_positions(self, copy=True, column_order=False): """ Returns the sorted list of pairs (i,j) such that self[i,j] != 0. INPUT: - ``copy`` - (default: True) It is safe to change the resulting list (unless you give the option copy=False).
I think the question is, how much of a performance hit is there by copying the list, and are there times when we want to use the unsafe, but faster method. I don't know the answer to this. Would the first alternative (caching a tuple) be any better, or would the cached value need to be converted to a list frequently enough that it would slow things down?
comment:3 Changed 9 years ago by
Isolating changes at the cache-ing step is a great idea.
As an experiment, I changed all of the list() calls in the patch to tuple(). Here are the observations. In sage/matrix only,
sage -t /dev/devel/sage-main/sage/matrix/matrix_cyclo_dense.pyx # 2 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix_integer_dense_hnf.py # 11 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix_modn_dense.pyx # 1 doctests failed sage -t /dev/devel/sage-main/sage/matrix/strassen.pyx # 18 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix_integer_dense.pyx # 15 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix0.pyx # 6 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix2.pyx # 48 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed sage -t /dev/devel/sage-main/sage/matrix/matrix_modn_sparse.pyx # 2 doctests failed
About 17 of these end with:
File "/sage/dev/local/lib/python/site-packages/sage/matrix/matrix_integer_dense_hnf.py", line 607, in interleave_matrices w = cols1 + cols2 TypeError: can only concatenate tuple (not "list") to tuple
and this looks like the only failures of this type. (I didn't know you could concatenate two tuples.) Six, or so. are sorts of pivots, which could be replaced by sorted(), I'd guess. The rest seem to be output format (list versus tuple), or unrelated follow-on failures. So really, I don't think it is that bad.
I'll look more closely in the morning.
comment:4 Changed 9 years ago by
- Status changed from needs_review to needs_work
Right now I'm leaning away from the current patch and perhaps toward the tuple approach. If I add a keyword (which I called "safe" so as not to conflict with the "copy" function):
sage: A = random_matrix(QQ, 100) sage: timeit('A.pivots(safe=True)') 625 loops, best of 3: 6.38 µs per loop sage: timeit('A.pivots(safe=False)') 625 loops, best of 3: 3.13 µs per loop sage: timeit('A.pivots(safe=False)')
Whereas if I use tuples as output (via another modification of the current patch):
sage: A = random_matrix(QQ, 100) sage: timeit('A.pivots(use_tuple=False)') 625 loops, best of 3: 3.14 µs per loop sage: timeit('A.pivots(use_tuple=True)') 625 loops, best of 3: 3.55 µs per loop sage: timeit('A = random_matrix(QQ, 100); A.pivots(use_tuple=False)') 25 loops, best of 3: 22.2 ms per loop sage: timeit('A = random_matrix(QQ, 100); A.pivots(use_tuple=True)') 25 loops, best of 3: 22.2 ms per loop
I suppose using tuples may cause a slow-down in other places, but the "pivots" method gets slowed down by maybe 10%, as opposed to copying the list, which doubles the time.
I'm marking this as "needs work" because I'm not happy with the slowdown in the current patch.
comment:5 follow-up: ↓ 6 Changed 9 years ago by
I'm coming around to a preference for tuples, mostly with the idea that any future pivots that get returned may end up being consistent.
I'm happy to do some of the grunt work on fixing code and doctests, I did a bunch of that recently, so have some recent practice.
Will you be at Sage days today?
comment:6 in reply to: ↑ 5 ; follow-up: ↓ 7 Changed 9 years ago by
Replying to rbeezer:
I'm coming around to a preference for tuples, mostly with the idea that any future pivots that get returned may end up being consistent.
I'm happy to do some of the grunt work on fixing code and doctests, I did a bunch of that recently, so have some recent practice.
Will you be at Sage days today?
I was planning to, but things have come up, so I'll be working from home today. I'll work on a patch for this, and if I run out of energy, I'll pass it on to you. How does that sound?
comment:7 in reply to: ↑ 6 Changed 9 years ago by
Replying to jhpalmieri:
I was planning to, but things have come up, so I'll be working from home today. I'll work on a patch for this, and if I run out of energy, I'll pass it on to you. How does that sound?
Perfect. I'm here all week, all day (except for dinner Wednesday evening). So just holler when I can assist/extend.
(I gave your status report a few minutes ago.)
Changed 9 years ago by
comment:8 follow-up: ↓ 9 Changed 9 years ago by
- Status changed from needs_work to needs_review
Okay, here's a new patch. This should return tuples for pivots. Also, everywhere pivots get cached, tuples should be cached, not lists. (This way, when the cached value is retrieved, converting it to a tuple, just to be on the safe side, is really fast: try doing a = tuple(range(1000)); timeit('tuple(a)')
.)
All doctests pass on sage.math.
comment:9 in reply to: ↑ 8 Changed 9 years ago by
- Reviewers set to Rob Beezer
- Status changed from needs_review to positive_review
Replying to jhpalmieri:
try doing
a = tuple(range(1000)); timeit('tuple(a)')
.)
625 loops, best of 3: 193 ns per loop
Looks really good, John. Passes long tests and I also did a survey on the patched code with search_src('cache', 'pivot')
and didn't see anything that was missed (including row pivots).
Should improve the reliability of matrix code, or at least lessen the potential for new bugs. Positive review.
comment:10 Changed 9 years ago by
Some timing. With the old version:
sage: random_matrix(QQ, 3).pivots() [0, 1, 2] sage: timeit('random_matrix(QQ, 200).pivots()') 5 loops, best of 3: 85.1 ms per loop
With the new version:
sage: random_matrix(QQ, 3).pivots() (0, 1, 2) sage: timeit('random_matrix(QQ, 200).pivots()') 5 loops, best of 3: 85.7 ms per loop
comment:11 Changed 9 years ago by
- Merged in set to sage-4.7.alpha4
- Resolution set to fixed
- Status changed from positive_review to closed
Here's a patch; it seemed like the simplest thing to do. This way the "pivots" method returns a copy of the list of pivots, not the mutable list itself.