Opened 4 years ago
Closed 4 years ago
#23352 closed defect (fixed)
Fix random matrix_gfpn_dense
Reported by:  SimonKing  Owned by:  

Priority:  major  Milestone:  sage8.1 
Component:  packages: optional  Keywords:  meataxe random, days88, IMA coding sprints 
Cc:  Merged in:  
Authors:  Simon King  Reviewers:  Jeroen Demeyer, Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  39431d7 (Commits, GitHub, GitLab)  Commit:  39431d7e98a19de1c3b4f0af97440922eed10f44 
Dependencies:  Stopgaps: 
Description (last modified by )
I just noticed that, when meataxe is installed and its Sage wrapper is used for matrices over GF(p^n)
with p
odd and p^n<255
, then at least in some cases the random matrices aren't sufficiently random:
sage: random_matrix(GF(9,'x'), 5) [ x 2*x + 1 x + 2 x + 2 0] [ x + 1 x x + 2 x + 1 0] [ x + 1 2 x + 1 2 0] [ x 2*x x x + 1 0] [2*x + 1 0 2*x + 1 2*x + 2 0] sage: random_matrix(GF(9,'x'), 5) [ 2 x + 1 x + 1 2*x + 2 0] [ x 2*x + 2 x + 2 2*x 0] [ x + 2 2*x 0 2*x + 2 0] [ 2*x 2 2 1 0] [ x + 2 2 2*x 1 0] sage: random_matrix(GF(9,'x'), 5) [ 2 2*x + 1 x 2*x + 1 0] [2*x + 2 0 2*x + 2 x 0] [ 2 x 2*x + 1 x + 2 0] [ 0 2*x + 1 x x 0] [ 1 1 2 x + 1 0] sage: random_matrix(GF(9,'x'), 5) [ x x + 1 0 0 0] [ 1 x + 1 0 2*x + 1 0] [ 2 0 x x + 2 0] [ 2 0 x + 1 x + 2 0] [ x x + 1 2*x 1 0]
So, the last column always is zero, which shouldn't be the case.
Change History (35)
comment:1 Changed 4 years ago by
 Branch set to u/SimonKing/fix_random_matrix_gfpn_dense
comment:2 followup: ↓ 20 Changed 4 years ago by
 Commit set to 3297b211937dd687bd8b0fbf267ac7b52e171022
 Status changed from new to needs_review
comment:3 Changed 4 years ago by
 Dependencies set to #21437
 Status changed from needs_review to needs_work
comment:4 Changed 4 years ago by
 Commit changed from 3297b211937dd687bd8b0fbf267ac7b52e171022 to 6e332268227e5f8abedce6259bb32f8765108df0
Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:
291f4dc  Some minor tweaks for performance

03ce5c0  Fix bad INPUT::/OUTPUT:: blocks

5ed2269  Fix one doctest

d4dc794  Merge branch 'develop' into t/21437/fixmtxmatrices

413e4a9  Remove deprecated Cython syntax, use check_malloc

d943f93  Remove rawMatrix function

c929fcc  Remove unused MTX* environment variables

b70bc5f  Change some str into bytes

04e7986  Improve use of sig_on/off/check

6e33226  Properly fill the last few columns of random meataxe matrices

comment:5 Changed 4 years ago by
 Status changed from needs_work to needs_review
Don't be scared. All but the last commit belong to #21437. Yes, I still believe that git commits *belong* to a ticket...
Anyway, I think it is logical to put a ticket that fixes random meataxe matrices on top of a ticket that fixes other aspects of meataxe matrices.
comment:6 followup: ↓ 11 Changed 4 years ago by
 Description modified (diff)
 Summary changed from Fix random matrix_gfpn_dense to Some bug fixes for matrix_gfpn_dense
Jeroen suggested on #21437 to distribute different aspects of the work in a new way. I think this ticket fits best for the topic "bug fixes". So, I'll rename the ticket a couple of commits. The tests should pass with the branch that I'll forcepush in a minute.
comment:7 Changed 4 years ago by
 Commit changed from 6e332268227e5f8abedce6259bb32f8765108df0 to 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
b1aef56  Add MeatAxe patch that ensures correct value of FfCurrentRowSizeIo

a44a58a  Fix reading mtxmatrix from nonexistent file

91ac93f  Make pickling machine independent

225afe1  Properly fill the last few columns of random meataxe matrices

comment:8 Changed 4 years ago by
Needs review...
comment:9 Changed 4 years ago by
 Commit changed from 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0 to bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9
Branch pushed to git repo; I updated commit sha1. New commits:
bd1f3f1  Fix ticket number in deprecation warning

comment:10 Changed 4 years ago by
 Dependencies #21437 deleted
comment:11 in reply to: ↑ 6 ; followup: ↓ 12 Changed 4 years ago by
comment:12 in reply to: ↑ 11 Changed 4 years ago by
Replying to jdemeyer:
Replying to SimonKing:
Jeroen suggested on #21437 to distribute different aspects of the work in a new way.
That's not quite what I said. I proposed to split up #21437. Adding more stuff to existing tickets (like this one) is actually the opposite of what I intented.
Do you prefer that I rebase again and split this ticket into three: One for the new MeatAxe patch (that avoids frequent calls to a C function and thus gives a speedup), one for the pickling and one for random matrices?
comment:13 followup: ↓ 14 Changed 4 years ago by
Yes, that would be a good idea.
comment:14 in reply to: ↑ 13 ; followup: ↓ 16 Changed 4 years ago by
Replying to jdemeyer:
Yes, that would be a good idea.
The first one is #23410.
The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?
comment:15 Changed 4 years ago by
 Description modified (diff)
 Status changed from needs_review to needs_work
 Summary changed from Some bug fixes for matrix_gfpn_dense to Fix random matrix_gfpn_dense
comment:16 in reply to: ↑ 14 Changed 4 years ago by
Replying to SimonKing:
The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?
The ticket that is about sanitizing sig_on/off/check and about proper Cython code style has to depend on the other tickets. But I suggest to keep the other tickets separate as much as possible, and will soon push branches accordingly.
comment:17 Changed 4 years ago by
 Commit changed from bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9 to bbd6a61761e4c53bd90cef5d9685648dc9012879
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
bbd6a61  Properly fill last column of random matrix_gfpn_dense

comment:18 Changed 4 years ago by
 Status changed from needs_work to needs_review
comment:19 Changed 4 years ago by
In the splitup process, I have already updated #23410, #23411 and #23352, making them small and independent.
About the other parts, I plan this:
 We have #21437 for proper initialisation. I plan to try to make this standalone as well, but only if it cleanly merges with the aforementioned tickets. If it doesn't merge, I'll put it on top of the other tickets.
 We have #23400 for code cleanup. The cleanup will also concern parts of code that are touched by the other tickets. So, it will use them as dependency.
 We have #23399 for some additions (new methods). I very much doubt that it can stay separate, but I'll try. Very likely #23399 will be on top of all other tickets.
Jeroen, do you support my plan?
comment:20 in reply to: ↑ 2 ; followup: ↓ 21 Changed 4 years ago by
Replying to SimonKing:
The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.
The problem was: In the old code, the LOWEST bits of the last byte have been filled.
If I'm reading the documentation at http://www.math.rwthaachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:
Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with q^{m}≤256. Then, m elements k_{0},...k_{n1} are packed into one byte as k_{0}+k_{1}q+k_{2}q^{2}+...
comment:21 in reply to: ↑ 20 Changed 4 years ago by
Replying to jdemeyer:
If I'm reading the documentation at http://www.math.rwthaachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:
Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with q^{m}≤256. Then, m elements k_{0},...k_{n1} are packed into one byte as k_{0}+k_{1}q+k_{2}q^{2}+...
Yes, that's what I had in mind, too, and is why I wrote the old code in the way I did. Anyway, I am now using meataxe's FfInsert
, which should do the right thing.
Let's test: If I would set the last byte inconsistent with meataxe's conventions, then multiplying a matrix with the transposed matrix (these are all using meataxe functions) would give a wrong result.
sage: MS = MatrixSpace(GF(9,'x'),1,5) sage: def sanity_check(M): ....: assert (M*M.transpose())[0,0] == add(e*e for e in M.list()) ....: sage: while(1): ....: sanity_check(MS.random_element()) ....:
works without error. Note that in the setting above, the new code is called. Thus, I believe my code is correct and indeed the meataxe documentation provides a wrong statement.
comment:22 Changed 4 years ago by
While fixing the randomize
method, I'd like to speed it up, by cdefining the randint function.
comment:23 Changed 4 years ago by
 Commit changed from bbd6a61761e4c53bd90cef5d9685648dc9012879 to 71abd443f2690b8fa20dfc3d177dec03155a7264
Branch pushed to git repo; I updated commit sha1. New commits:
71abd44  Speedup meataxe matrix randomisation

comment:24 Changed 4 years ago by
Before the latest commit I got
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 96.98 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 5.37 µs per loop
With the latest commit, I get
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 142.72 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.4 µs per loop
I think 35% gain is worth it. Also, since I'm touching the code anyway, I change to using range() in Cython.
comment:25 Changed 4 years ago by
PS: Since I am changing the code anyway, I guess I could also change the use of sig_on/off in that part of the code...
comment:26 Changed 4 years ago by
 Commit changed from 71abd443f2690b8fa20dfc3d177dec03155a7264 to e68140e57f5eb45d7482bd50533ce2d58f99fad6
Branch pushed to git repo; I updated commit sha1. New commits:
e68140e  Replace sig_on/off by sig_check in meataxe matrix randomisation

comment:27 followup: ↓ 28 Changed 4 years ago by
After replacing sig_on/off by sig_check, I get
sage: MS = MatrixSpace(GF(9,'x'),5,17) sage: %timeit MS.random_element() The slowest run took 146.35 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.05 µs per loop
That's further 10% on top of the previous 35%.
That's slightly surprising to me. Is sig_on/off really resulting in a massive slowdown? Then why is it used? After all, another disadvantage is that it doesn't play well with Python code.
comment:28 in reply to: ↑ 27 Changed 4 years ago by
Replying to SimonKing:
Is sig_on/off really resulting in a massive slowdown?
It shouldn't result in a massive slowdown. However, if you are using it in a very tight loop, the time taken by sig_on()
is noticable.
comment:29 Changed 4 years ago by
I should also mention that sig_on()
is essentially just a call to sigsetjmp()
. This is a function implemented by libc
from the OS. So the time taken by sig_on()
could depend significantly on the system.
comment:30 Changed 4 years ago by
Can the review please be finished?
comment:31 followup: ↓ 33 Changed 4 years ago by
 Status changed from needs_review to needs_work
The branch does not apply on latest beta.
comment:32 Changed 4 years ago by
 Commit changed from e68140e57f5eb45d7482bd50533ce2d58f99fad6 to 39431d7e98a19de1c3b4f0af97440922eed10f44
Branch pushed to git repo; I updated commit sha1. New commits:
39431d7  Merge branch 'develop' into t/23352/fix_random_matrix_gfpn_dense

comment:33 in reply to: ↑ 31 Changed 4 years ago by
 Status changed from needs_work to needs_review
comment:34 Changed 4 years ago by
 Keywords days88 IMA coding sprints added
 Milestone changed from sage8.0 to sage8.1
 Reviewers set to Jeroen Demeyer, Travis Scrimshaw
 Status changed from needs_review to positive_review
Let it be so.
comment:35 Changed 4 years ago by
 Branch changed from u/SimonKing/fix_random_matrix_gfpn_dense to 39431d7e98a19de1c3b4f0af97440922eed10f44
 Resolution set to fixed
 Status changed from positive_review to closed
The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.
The problem was: In the old code, the LOWEST bits of the last byte have been filled.
I fixed it by explicitly assigning random numbers to the last few entries (instead of trying to fill a whole block of entries).
New commits:
Properly fill the last few columns of random meataxe matrices