Opened 6 years ago
Last modified 4 years ago
#18231 needs_work defect
constructing matrices is very slow
Reported by:  vdelecroix  Owned by:  

Priority:  major  Milestone:  sage6.7 
Component:  linear algebra  Keywords:  
Cc:  ncohen  Merged in:  
Authors:  Vincent Delecroix  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  u/vdelecroix/18231 (Commits, GitHub, GitLab)  Commit:  742de47c66db9df0eee3f6200a0b8d2764ab743e 
Dependencies:  #19026  Stopgaps: 
Description (last modified by )
The problem is in the code which looks like the following
def test1(): K = QuadraticField(2) for _ in range(100): i = randint(2,3) j = randint(2,3) m = matrix(K, i, j, range(i*j))
It causes the various matrix spaces to rebuilt almost each time! If we compare with
def test2(): K = QuadraticField(2) M = {(i,j): MatrixSpace(K,i,j) for i in range(2,4) for j in range(2,4)} for _ in range(100): i = randint(2,3) j = randint(2,3) m = M[i,j](range(i*j))
then we got
sage: %timeit test1() 10 loops, best of 3: 55 ms per loop sage: %timeit test2() 100 loops, best of 3: 6.49 ms per loop
This is one of the reasons why the creation of Polyhedron is so slow (see #18213). (This has now improved since the creation of the ticket)
In the branch we:
 replace some old Cython code in
matrix_generic_dense
andmatrix0
and do some optimization  move the function
_get_matrix_class
inMatrixSpace
out of the class (as it depends only on the base ring and the sparseness)  In the class
MatrixSpace
the argument__matrix_class
is now_matrix_class
. That way, it is possible in other part of Sage to optimize the creation of matrices by calling directlysage: M = MatrixSpace(ZZ,3,2) sage: M._matrix_class([1,5,3,1,2,3], coerce=False, copy=False)
 we simplify the main element constructor of matrix, namely
MatrixSpace.matrix
and gain a significant speedup.
follow up: #18258
Change History (18)
comment:1 Changed 6 years ago by
 Branch set to u/vdelecroix/18231
 Commit set to c1c7034ed54c7493a7fedbf6fa9798f23946902c
 Status changed from new to needs_review
comment:2 Changed 6 years ago by
 Commit changed from c1c7034ed54c7493a7fedbf6fa9798f23946902c to da310db9b3b4238121b1a1ad2c3d5e36a5e846b5
Branch pushed to git repo; I updated commit sha1. New commits:
da310db  Trac 18231: typo

comment:3 followup: ↓ 4 Changed 6 years ago by
 Status changed from needs_review to needs_work
Hello Vincent,
In this branch you create functions, remove others, and change a lot of code. There is still a lot of guessing to do on the reviewer's part.
It would be really cool if you could be more verbose about what you do. Really.
I began to read the first commit, and I have several more specific questions:
 You modify a function which takes a
MonoidElement
as input, which you immeditely cast withcdef Matrix right = _right
. Either it is a matrix and it should not readMonoidElement
in the first place, or it is not aMatrix
and you cannot cast it?..
 This is cool and everything to gain miliseconds, but that's not a good way to write public code
 matrix.Matrix.__init__(self, parent) + # the four lines below avoid the following call + # matrix.Matrix.__init__(self, parent) + self._parent = parent + res._ncols = parent.ncols() + res._nrows = parent.nrows() + R = res._base_ring = parent.base_ring()
If the main constructor does not do what you want ty to change it. What if somebody adds a new parameter to matrix? The code has to be copy/pasted in every single function that did not want to call__init__
?
 Why?
cdef set_unsafe(self, Py_ssize_t i, Py_ssize_t j, value):  Py_DECREF(<object>PyList_GET_ITEM(self._entries, i*self._ncols + j))  Py_INCREF(value)  PyList_SET_ITEM(self._entries, i*self._ncols + j, value) + self._entries[i*self._ncols + j] = value cdef get_unsafe(self, Py_ssize_t i, Py_ssize_t j):  return <object>PyList_GET_ITEM(self._entries, i*self._ncols + j) + return self._entries[i*self._ncols + j]
Nathann
comment:4 in reply to: ↑ 3 ; followup: ↓ 7 Changed 6 years ago by
 Description modified (diff)
 Status changed from needs_work to needs_review
Hello Nathann,
Thanks for having a look.
Replying to ncohen:
In this branch you create functions, remove others, and change a lot of code. There is still a lot of guessing to do on the reviewer's part.
It would be really cool if you could be more verbose about what you do. Really.
I made a big effort to separate the commit at least! It is now in the description of the ticket.
I began to read the first commit, and I have several more specific questions:
 You modify a function which takes a
MonoidElement
as input, which you immeditely cast withcdef Matrix right = _right
. Either it is a matrix and it should not readMonoidElement
in the first place, or it is not aMatrix
and you cannot cast it?..
No choice. This is a cdef function defined in structure.element.Element
and you can not change the signature in children classes. I thought it was more readable this way.
 This is cool and everything to gain miliseconds, but that's not a good way to write public code If the main constructor does not do what you want ty to change it. What if somebody adds a new parameter to matrix? The code has to be copy/pasted in every single function that did not want to call
__init__
?
All right. But then there is something wrong in the docstring of matrix0.Matrix
def __init__(...): ... Subclasses of ``Matrix`` can safely skip calling ``Matrix.__init__`` provided they take care of initializing these attributes themselves. ...
 Why?
All this was just old Cython code at the time Cython was not able to understand l[i]
when l
is a cdef list
.
comment:5 Changed 6 years ago by
 Commit changed from da310db9b3b4238121b1a1ad2c3d5e36a5e846b5 to 88525b82a790bb4903ab19757acc7a3acd01da72
Branch pushed to git repo; I updated commit sha1. New commits:
88525b8  Trac 18231: fix documentation

comment:6 Changed 6 years ago by
 Commit changed from 88525b82a790bb4903ab19757acc7a3acd01da72 to 715f49ecab8ae8bb66725be77da51ce6eb8f8738
Branch pushed to git repo; I updated commit sha1. New commits:
715f49e  Trac 18231: fix doctests

comment:7 in reply to: ↑ 4 Changed 6 years ago by
Yoooooo !
I made a big effort to separate the commit at least! It is now in the description of the ticket.
Thanks
No choice. This is a cdef function defined in
structure.element.Element
and you can not change the signature in children classes. I thought it was more readable this way.
Oh, okay!
All right. But then there is something wrong in the docstring of
matrix0.Matrix
Hmmmm.... Can you at least observe the difference? All that this __init__
function does is precisely what you do. Can you measure this function call when your function copies lists around? O_o
All this was just old Cython code at the time Cython was not able to understand
l[i]
whenl
is acdef list
.
Okay!
Nathann
comment:8 Changed 6 years ago by
Okay, I give up. Sorry, I just don't know these classes enough, I barely ever used them, I would do a terrible job as a reviewer.
Nathann
comment:9 Changed 6 years ago by
 Description modified (diff)
comment:10 Changed 6 years ago by
 Status changed from needs_review to needs_work
needs rebase, does not apply
comment:11 Changed 6 years ago by
 Commit changed from 715f49ecab8ae8bb66725be77da51ce6eb8f8738 to d4fc6cd87dc244113d7662b95a3d1c4f10258a36
Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:
2d55497  Trac 18231: constructor + __matrix_class > _matrix_class

5ac117f  Trac 18231: removed unused import

fcd06fa  Trac 18231: get rid of a one year old deprecation

b7bf1aa  Trac 18231: remove a useless function

12c9d0e  Trac 18231: set .__call__() as an alias of .matrix()

d59a9db  Trac 18231: rewrite .matrix()

46d8b6b  Trac 18231: fix dict_to_list

aa7ddb2  Trac 18231: typo

03fe3b1  Trac 18231: fix documentation

d4fc6cd  Trac 18231: fix doctests

comment:12 Changed 6 years ago by
 Status changed from needs_work to needs_review
Whether to see if patchbot is happy.
comment:13 Changed 6 years ago by
 Dependencies set to #18231
 Status changed from needs_review to needs_work
comment:14 Changed 6 years ago by
 Dependencies changed from #18231 to #19026
comment:15 Changed 6 years ago by
 Commit changed from d4fc6cd87dc244113d7662b95a3d1c4f10258a36 to 742de47c66db9df0eee3f6200a0b8d2764ab743e
Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:
268cfc3  Trac 18231: move _get_matrix_class out of the class

66e5068  Trac 18231: constructor + __matrix_class > _matrix_class

c9a82f2  Trac 18231: removed unused import

321d073  Trac 18231: get rid of a one year old deprecation

6111c37  Trac 18231: remove a useless function

cb6b82d  Trac 18231: set .__call__() as an alias of .matrix()

2dd0d09  Trac 18231: rewrite .matrix()

00dd4ac  Trac 18231: fix dict_to_list

07ba681  Trac 18231: fix documentation

742de47  Trac 18231: fix doctests

comment:16 Changed 6 years ago by
 Status changed from needs_work to needs_review
I moved the first commit to another ticket (#19026).
comment:18 Changed 4 years ago by
 Description modified (diff)
I only did some cleanup in
matrix_space.py
andmatrix_generic_dense.pyx
. And there is a substantial gain in speed! The tiny modification in #18213 toNumberField.__cmp__
actually makes a much bigger difference.I did not solve the problem of the matrix space being rebuilt every time.
New commits:
Trac 18231: better code in matrices
Trac 18231: move _get_matrix_class out of the class
Trac 18231: constructor + __matrix_class > _matrix_class
Trac 18231: removed unused import
Trac 18231: get rid of a one year old deprecation
Trac 18231: remove a useless function
Trac 18231: set .__call__() as an alias of .matrix()
Trac 18231: rewrite .matrix()
Trac 18231: fix dict_to_list