Opened 10 years ago

Closed 10 years ago

Matrix pivots are cached, should be immutable

Reported by: Owned by: rbeezer jason, was minor sage-4.7 linear algebra matrix pivots sage-4.7.alpha4 John Palmieri Rob Beezer N/A

Description

Oops:

```sage: A=matrix(ZZ, 2, range(4))
sage: p = A.pivots(); p
[0, 1]
sage: p.append(652)
sage: A.pivots()
[0, 1, 652]
```

I'd imagine returning a tuple rather than a list would be the best approach. Though how much chaos will ensue is not clear.

comment:1 Changed 10 years ago by jhpalmieri

• Authors set to John Palmieri
• Keywords matrix pivots added
• Priority changed from major to minor
• Status changed from new to needs_review

Here's a patch; it seemed like the simplest thing to do. This way the "pivots" method returns a copy of the list of pivots, not the mutable list itself.

comment:2 Changed 10 years ago by jhpalmieri

Here is a new version of the patch. I think this deals with every instance where something involving the string "pivot" is fetched from the cache.

There are two other ways to proceed, if we don't like this approach:

• cache the pivots as tuples instead of lists, or
• add a keyword to the "pivots" method just like there currently is for "nonzero_positions". That is, model the definition after this:
```    def nonzero_positions(self, copy=True, column_order=False):
"""
Returns the sorted list of pairs (i,j) such that self[i,j] != 0.

INPUT:

-  ``copy`` - (default: True) It is safe to change the
resulting list (unless you give the option copy=False).
```

I think the question is, how much of a performance hit is there by copying the list, and are there times when we want to use the unsafe, but faster method. I don't know the answer to this. Would the first alternative (caching a tuple) be any better, or would the cached value need to be converted to a list frequently enough that it would slow things down?

comment:3 Changed 10 years ago by rbeezer

Isolating changes at the cache-ing step is a great idea.

As an experiment, I changed all of the list() calls in the patch to tuple(). Here are the observations. In sage/matrix only,

```sage -t  /dev/devel/sage-main/sage/matrix/matrix_cyclo_dense.pyx # 2 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix_integer_dense_hnf.py # 11 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix_modn_dense.pyx # 1 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/strassen.pyx # 18 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix_integer_dense.pyx # 15 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix0.pyx # 6 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix2.pyx # 48 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed
sage -t  /dev/devel/sage-main/sage/matrix/matrix_modn_sparse.pyx # 2 doctests failed
```

About 17 of these end with:

```    File "/sage/dev/local/lib/python/site-packages/sage/matrix/matrix_integer_dense_hnf.py", line 607, in interleave_matrices
w = cols1 + cols2
TypeError: can only concatenate tuple (not "list") to tuple
```

and this looks like the only failures of this type. (I didn't know you could concatenate two tuples.) Six, or so. are sorts of pivots, which could be replaced by sorted(), I'd guess. The rest seem to be output format (list versus tuple), or unrelated follow-on failures. So really, I don't think it is that bad.

I'll look more closely in the morning.

comment:4 Changed 10 years ago by jhpalmieri

• Status changed from needs_review to needs_work

Right now I'm leaning away from the current patch and perhaps toward the tuple approach. If I add a keyword (which I called "safe" so as not to conflict with the "copy" function):

```sage: A = random_matrix(QQ, 100)
sage: timeit('A.pivots(safe=True)')
625 loops, best of 3: 6.38 µs per loop
sage: timeit('A.pivots(safe=False)')
625 loops, best of 3: 3.13 µs per loop
sage: timeit('A.pivots(safe=False)')
```

Whereas if I use tuples as output (via another modification of the current patch):

```sage: A = random_matrix(QQ, 100)
sage: timeit('A.pivots(use_tuple=False)')
625 loops, best of 3: 3.14 µs per loop
sage: timeit('A.pivots(use_tuple=True)')
625 loops, best of 3: 3.55 µs per loop

sage: timeit('A = random_matrix(QQ, 100); A.pivots(use_tuple=False)')
25 loops, best of 3: 22.2 ms per loop
sage: timeit('A = random_matrix(QQ, 100); A.pivots(use_tuple=True)')
25 loops, best of 3: 22.2 ms per loop
```

I suppose using tuples may cause a slow-down in other places, but the "pivots" method gets slowed down by maybe 10%, as opposed to copying the list, which doubles the time.

I'm marking this as "needs work" because I'm not happy with the slowdown in the current patch.

comment:5 follow-up: ↓ 6 Changed 10 years ago by rbeezer

I'm coming around to a preference for tuples, mostly with the idea that any future pivots that get returned may end up being consistent.

I'm happy to do some of the grunt work on fixing code and doctests, I did a bunch of that recently, so have some recent practice.

Will you be at Sage days today?

comment:6 in reply to: ↑ 5 ; follow-up: ↓ 7 Changed 10 years ago by jhpalmieri

I'm coming around to a preference for tuples, mostly with the idea that any future pivots that get returned may end up being consistent.

I'm happy to do some of the grunt work on fixing code and doctests, I did a bunch of that recently, so have some recent practice.

Will you be at Sage days today?

I was planning to, but things have come up, so I'll be working from home today. I'll work on a patch for this, and if I run out of energy, I'll pass it on to you. How does that sound?

comment:7 in reply to: ↑ 6 Changed 10 years ago by rbeezer

I was planning to, but things have come up, so I'll be working from home today. I'll work on a patch for this, and if I run out of energy, I'll pass it on to you. How does that sound?

Perfect. I'm here all week, all day (except for dinner Wednesday evening). So just holler when I can assist/extend.

(I gave your status report a few minutes ago.)

comment:8 follow-up: ↓ 9 Changed 10 years ago by jhpalmieri

• Status changed from needs_work to needs_review

Okay, here's a new patch. This should return tuples for pivots. Also, everywhere pivots get cached, tuples should be cached, not lists. (This way, when the cached value is retrieved, converting it to a tuple, just to be on the safe side, is really fast: try doing `a = tuple(range(1000)); timeit('tuple(a)')`.)

All doctests pass on sage.math.

comment:9 in reply to: ↑ 8 Changed 10 years ago by rbeezer

• Reviewers set to Rob Beezer
• Status changed from needs_review to positive_review

try doing `a = tuple(range(1000)); timeit('tuple(a)')`.)

```625 loops, best of 3: 193 ns per loop
```

Looks really good, John. Passes long tests and I also did a survey on the patched code with `search_src('cache', 'pivot')` and didn't see anything that was missed (including row pivots).

Should improve the reliability of matrix code, or at least lessen the potential for new bugs. Positive review.

comment:10 Changed 10 years ago by jhpalmieri

Some timing. With the old version:

```sage: random_matrix(QQ, 3).pivots()
[0, 1, 2]
sage: timeit('random_matrix(QQ, 200).pivots()')
5 loops, best of 3: 85.1 ms per loop
```

With the new version:

```sage: random_matrix(QQ, 3).pivots()
(0, 1, 2)
sage: timeit('random_matrix(QQ, 200).pivots()')
5 loops, best of 3: 85.7 ms per loop
```

comment:11 Changed 10 years ago by jdemeyer

• Merged in set to sage-4.7.alpha4
• Resolution set to fixed
• Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.