Opened 7 years ago

Closed 7 years ago

#16617 closed enhancement (fixed)

simple echelon matrix iterator

Reported by: vdelecroix Owned by:
Priority: major Milestone: sage-6.3
Component: linear algebra Keywords: matrix
Cc: ncohen Merged in:
Authors: Vincent Delecroix Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: 6908f39 (Commits, GitHub, GitLab) Commit: 6908f39941ec6a4b79373400beb20c03824957cb
Dependencies: Stopgaps:

Status badges

Description (last modified by vdelecroix)

The iteration over subspaces over a given vector space takes infinite time (mainly because of complicated constructor in classes). We create an external "echelon_matrix_iterator" to be used for faster iteration...

More than incredible:

sage: from sage.matrix.echelon_matrix import echelon_matrix_iterator
sage: timeit("for m in echelon_matrix_iterator(GF(3),3,5): pass")
125 loops, best of 3: 3.01 ms per loop
sage: timeit("for _ in VectorSpace(GF(3),5).subspaces(3): pass")
5 loops, best of 3: 346 ms per loop

And there even a "faster" mode which avoids copy

sage: timeit("for m in echelon_matrix_iterator(GF(3),3,5,copy=False): pass")
625 loops, best of 3: 392 µs per loop

Change History (16)

comment:1 Changed 7 years ago by vdelecroix

  • Branch set to u/vdelecroix/16617
  • Commit set to aff5e443d6cdf25e9c16b107b50b7bf200c9e37d
  • Status changed from new to needs_review

New commits:

aff5e44trac #16617: echelon matrix iterator

comment:2 Changed 7 years ago by git

  • Commit changed from aff5e443d6cdf25e9c16b107b50b7bf200c9e37d to f3cc9b7a9979e164565df53fa11dd6f8647a6ee9

Branch pushed to git repo; I updated commit sha1. New commits:

f3cc9b7trac #16617: cythonisation

comment:3 Changed 7 years ago by vdelecroix

  • Description modified (diff)

comment:4 follow-up: Changed 7 years ago by ncohen

  • Status changed from needs_review to needs_info

Helloooooooooooo !

First of all, I don't get why you wrote this in Cython instead of changing the old implementation... Does it change much to the performance ? I wonder if you wouldn't obtain the same crazy speedups simply caching m0 as you do and using itertools instead of VectorSpace.. Or is it that you want to aviod the final matrix multiplication ?

Details:

  • It seems that what you do is iterate on *reduced* echelon form
  • number of columnes
  • Could you add a doctest like that ?
sage: q=71
sage: F = GF(q)
sage: len(list(echelon_matrix_iterator(F, 1, 3, copy=False))) == q**2+q+1
True
sage: len(list(echelon_matrix_iterator(F, 2, 3, copy=False))) == q**2+q+1
True
  • I do not understand why you have to call del O_o

Nathann

comment:5 in reply to: ↑ 4 ; follow-up: Changed 7 years ago by vdelecroix

Replying to ncohen:

Helloooooooooooo !

First of all, I don't get why you wrote this in Cython instead of changing the old implementation... Does it change much to the performance ? I wonder if you wouldn't obtain the same crazy speedups simply caching m0 as you do and using itertools instead of VectorSpace.. Or is it that you want to aviod the final matrix multiplication ?

The main speed up is to get rid of VectorSpaces?. I do not want to use them, so I need a echelon_matrix_iterator. Whether it is cython or not is a matter of micro seconds, so I do not care that much (except that using cython allows to set the cache of matrices with the pivot that can be used later).

Details:

  • It seems that what you do is iterate on *reduced* echelon form

it is!

  • Could you add a doctest like that ?
sage: q=71
sage: F = GF(q)
sage: len(list(echelon_matrix_iterator(F, 1, 3, copy=False))) == q**2+q+1
True
sage: len(list(echelon_matrix_iterator(F, 2, 3, copy=False))) == q**2+q+1
True

Of course.

  • I do not understand why you have to call del O_o

This is a crazy thing in the implementation of itertools. If the only reference to the tuple is owned by the iterator then it is reused. Otherwise you create a fresh one. What you gain is creation time of a tuple...

sage: from itertools import product
sage: timeit("for p in product([0,1,2,3],[3,4,5,6],[1,2,3,4],[1,2,3,4]): pass")
625 loops, best of 3: 21.9 µs per loop
sage: timeit("for p in product([0,1,2,3],[3,4,5,6],[1,2,3,4],[1,2,3,4]): del p")
625 loops, best of 3: 7.81 µs per loop

Vincent

comment:6 in reply to: ↑ 5 ; follow-up: Changed 7 years ago by ncohen

Yo !

The main speed up is to get rid of VectorSpaces?. I do not want to use them, so I need a echelon_matrix_iterator. Whether it is cython or not is a matter of micro seconds, so I do not care that much (except that using cython allows to set the cache of matrices with the pivot that can be used later).

I wondered if it was worth creating a file or if it was better to just change the code right where it was.

This is a crazy thing in the implementation of itertools. If the only reference to the tuple is owned by the iterator then it is reused. Otherwise you create a fresh one. What you gain is creation time of a tuple...

sage: from itertools import product
sage: timeit("for p in product([0,1,2,3],[3,4,5,6],[1,2,3,4],[1,2,3,4]): pass")
625 loops, best of 3: 21.9 µs per loop
sage: timeit("for p in product([0,1,2,3],[3,4,5,6],[1,2,3,4],[1,2,3,4]): del p")
625 loops, best of 3: 7.81 µs per loop

What the hell is that ? O_o

Nathann

comment:7 in reply to: ↑ 6 ; follow-up: Changed 7 years ago by vdelecroix

Replying to ncohen:

Yo !

The main speed up is to get rid of VectorSpaces?. I do not want to use them, so I need a echelon_matrix_iterator. Whether it is cython or not is a matter of micro seconds, so I do not care that much (except that using cython allows to set the cache of matrices with the pivot that can be used later).

I wondered if it was worth creating a file or if it was better to just change the code right where it was.

where it was the code was written to output vector spaces and I do not want to deal with them. So it was the only way to go. What I want to avoid is to create a vector space for each output (this is where most of the time is spent).

Last edited 7 years ago by vdelecroix (previous) (diff)

comment:8 Changed 7 years ago by git

  • Commit changed from f3cc9b7a9979e164565df53fa11dd6f8647a6ee9 to 1d5ca5308d23fe938f0ca32b73c6f675a78948e8

Branch pushed to git repo; I updated commit sha1. New commits:

1d5ca53trac #16617: doctest + remove part of the cache

comment:9 in reply to: ↑ 7 Changed 7 years ago by ncohen

where it was the code was written to output vector spaces and I do not want to deal with them. So it was the only way to go. What I want to avoid is to create a vector space for each output (this is where most of the time is spent).

Oh okayyyyyyyyyy !!!

Nathann

comment:10 follow-up: Changed 7 years ago by ncohen

File "matrix/echelon_matrix.pyx", line 92, in sage.matrix.echelon_matrix.echelon_matrix_iterator
Failed example:
    all(a.is_immutable() and a.echelon_form() is a for a in it)
Expected:
    True
Got:
    False

Also, shouldn't the function be "reduced" echelon too ?

Nathann

comment:11 Changed 7 years ago by git

  • Commit changed from 1d5ca5308d23fe938f0ca32b73c6f675a78948e8 to 6908f39941ec6a4b79373400beb20c03824957cb

Branch pushed to git repo; I updated commit sha1. New commits:

6908f39trac #16617: change the name + doctest

comment:12 in reply to: ↑ 10 Changed 7 years ago by vdelecroix

Replying to ncohen:

File "matrix/echelon_matrix.pyx", line 92, in sage.matrix.echelon_matrix.echelon_matrix_iterator
Failed example:
    all(a.is_immutable() and a.echelon_form() is a for a in it)
Expected:
    True
Got:
    False

Seems that the code is not uniform among the matrices... for some implementation the code of .echelon_form starts with

def echelon_form(self):
    if self.fetch('in_echelon_form'): return self

but it seems to be not the case for matrices over finite field.

Also, shouldn't the function be "reduced" echelon too ?

done.

Vincent

(PS: and after the follow up #16683, changing the implementation to use gray code will dramatically improve the timings as we will avoid copying the matrix m0 over and over)

comment:13 Changed 7 years ago by vdelecroix

(and another: #16684)

comment:14 Changed 7 years ago by ncohen

  • Reviewers set to Nathann Cohen
  • Status changed from needs_info to positive_review

THeeeeeeeeeeeeeen it's good to go !

Nathann

comment:15 Changed 7 years ago by vdelecroix

Thanks!!

Vincent

comment:16 Changed 7 years ago by vbraun

  • Branch changed from u/vdelecroix/16617 to 6908f39941ec6a4b79373400beb20c03824957cb
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.