Opened 8 years ago

Closed 8 years ago

# Improve efficiency of minors() for BinaryMatroid, TernaryMatroid, QuaternaryMatroid

Reported by: Owned by: Rudi Rudi major sage-6.8 matroid theory chaoxu, Stefan, yomcat Rudi Pendavingh Travis Scrimshaw N/A a5fd2f5 a5fd2f59b7ad6a827e358bc294f560c6ad6309a8

### Description

The representing matrices of BinaryMatroid?, TernaryMatroid?, QuaternaryMatroid?, are bitpacket. Taking minors, involves constructing a submatrix of such a representing matrix. Since the rows are bitpacked, the relocation of columns in particular is relatively inefficient.

By allowing a submatrix in which the columns are allowed to be permuted, it is possible to reduce the number of column relocations to at most the number of deleted columns, and this will be far more efficient than the current implementation, especially if few columns are deleted.

### comment:1 Changed 8 years ago by Rudi

Still need to try this out. Hope to do this this week.

Since a cocircuit of a binary matroid M is typically going to have size about `r^*(M) /2` elements, I expect a speedup by a factor of up to `r^*(M)/2/|E(M)|` for the minor-taking in Chao's code in ticket #18539.

### comment:2 Changed 8 years ago by Rudi

Branch: → u/Rudi/improve_efficiency_of_minors___for_binarymatroid__ternarymatroid__quaternarymatroid

### comment:3 Changed 8 years ago by Rudi

Commit: → 7deb135d390d4a83ab1fff1742c0015b5b5194ef

First worked on BinaryMatroid? and BinaryMatrix?. The new code seems to be correct:

```sage: N = Matroid(MatrixSpace(GF(2), 5,12).random_element())
sage: for X in subsets(N.groundset()):
if not BasisMatroid(N\X).equals(BasisMatroid(N)\X):
print X
```

It's more efficient too. Without the patch:

```sage: M = Matroid(MatrixSpace(GF(2), 500,1000).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 3.69 s per loop
```

With the patch:

```sage: M = Matroid(MatrixSpace(GF(2), 500,1000).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 692 ms per loop
```

So, about 5 times faster on this example. This is not all due to the trick announced in the ticket. Replacing a list by a c-style array here and there did some good too. And just replacing the line

```C = [c for c in range(len(self._E)) if self._E[c] not in deletions | contractions]
```

in `_minor` by something that does not compute the union `deletions | contractions` for each c made the code 2 times faster right there.

New commits:

 ​7deb135 `Added new method for taking submatrix of a BinaryMatrix, used in BinaryMatroid._minor()`

### comment:4 Changed 8 years ago by git

Branch pushed to git repo; I updated commit sha1. New commits:

 ​3ea8ac5 `Improved Ternary and Quaternary along the same lines`

### comment:5 Changed 8 years ago by Rudi

Correctness:

```sage: N = Matroid(MatrixSpace(GF(2), 5,10).random_element())
sage: for X in subsets(N.groundset()):
for Y in subsets(N.groundset()-set(X)):
if not BasisMatroid(N.minor(X,Y)).equals(BasisMatroid(N).minor(X,Y)):
print X,Y
....:
sage: N = Matroid(MatrixSpace(GF(3), 5,10).random_element())
sage: for X in subsets(N.groundset()):
for Y in subsets(N.groundset()-set(X)):
if not BasisMatroid(N.minor(X,Y)).equals(BasisMatroid(N).minor(X,Y)):
print X,Y
....:
sage: N = Matroid(MatrixSpace(GF(4,x), 5,10).random_element())
sage: for X in subsets(N.groundset()):
for Y in subsets(N.groundset()-set(X)):
if not BasisMatroid(N.minor(X,Y)).equals(BasisMatroid(N).minor(X,Y)):
print X,Y
```

Efficiency before patch:

```sage: M = Matroid(MatrixSpace(GF(2), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 261 ms per loop
sage: M = Matroid(MatrixSpace(GF(3), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 7.13 s per loop
sage: M = Matroid(MatrixSpace(GF(4,x), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 3.95 s per loop
```

Efficiency after patch:

```sage: M = Matroid(MatrixSpace(GF(2), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 73.7 ms per loop
sage: M = Matroid(MatrixSpace(GF(3), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 86.7 ms per loop
sage: M = Matroid(MatrixSpace(GF(4,x), 200,400).random_element())
sage: B = M.basis()
sage: timeit("for e in B: M.delete(M._fundamental_cocircuit(B,e))")
5 loops, best of 3: 86.4 ms per loop
```

For TernaryMatrix? and QuaternaryMatrix?, there was no optimized submatrix-code to begin with, hence the 40 - 80 times speedup. I scaled down the size of the matrix a bit, the ternary benchmark simply wouldn't finish for 500x1000.

I will remove some lingering cython profile directives, and then I'm done.

### comment:6 Changed 8 years ago by git

Branch pushed to git repo; I updated commit sha1. New commits:

 ​20dc896 `Cleaned up profiler directives`

### comment:7 Changed 8 years ago by Rudi

Authors: → Rudi Pendavingh new → needs_review

### comment:8 Changed 8 years ago by git

Branch pushed to git repo; I updated commit sha1. New commits:

 ​a5fd2f5 `removed trailing whitespace`

### comment:9 Changed 8 years ago by Rudi

Before:

```sage: M = Matroid(MatrixSpace(GF(3), 500,1000).random_element())
sage: timeit("M.delete()")
5 loops, best of 3: 333 ms per loop
```

After:

```sage: M = Matroid(MatrixSpace(GF(3), 500,1000).random_element())
sage: timeit("M.delete()")
625 loops, best of 3: 655 µs per loop
```

### comment:10 follow-up:  11 Changed 8 years ago by tscrim

Reviewers: → Travis Scrimshaw

LGTM. I'm slightly unhappy with the fact that it looks like a lot of logic is duplicated across the multiple `matrix_from_rows_and_columns_reordered`, but I don't see a way to really combine the common logic. So positive review.

### comment:11 in reply to:  10 Changed 8 years ago by Rudi

Status: needs_review → positive_review

LGTM. I'm slightly unhappy with the fact that it looks like a lot of logic is duplicated across the multiple `matrix_from_rows_and_columns_reordered`, but I don't see a way to really combine the common logic. So positive review.