Opened 3 years ago

Closed 3 years ago

#23352 closed defect (fixed)

Fix random matrix_gfpn_dense

Reported by: SimonKing Owned by:
Priority: major Milestone: sage-8.1
Component: packages: optional Keywords: meataxe random, days88, IMA coding sprints
Cc: Merged in:
Authors: Simon King Reviewers: Jeroen Demeyer, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 39431d7 (Commits) Commit: 39431d7e98a19de1c3b4f0af97440922eed10f44
Dependencies: Stopgaps:

Description (last modified by SimonKing)

I just noticed that, when meataxe is installed and its Sage wrapper is used for matrices over GF(p^n) with p odd and p^n<255, then at least in some cases the random matrices aren't sufficiently random:

sage: random_matrix(GF(9,'x'), 5)
[      x 2*x + 1   x + 2   x + 2       0]
[  x + 1       x   x + 2   x + 1       0]
[  x + 1       2   x + 1       2       0]
[      x     2*x       x   x + 1       0]
[2*x + 1       0 2*x + 1 2*x + 2       0]
sage: random_matrix(GF(9,'x'), 5)
[      2   x + 1   x + 1 2*x + 2       0]
[      x 2*x + 2   x + 2     2*x       0]
[  x + 2     2*x       0 2*x + 2       0]
[    2*x       2       2       1       0]
[  x + 2       2     2*x       1       0]
sage: random_matrix(GF(9,'x'), 5)
[      2 2*x + 1       x 2*x + 1       0]
[2*x + 2       0 2*x + 2       x       0]
[      2       x 2*x + 1   x + 2       0]
[      0 2*x + 1       x       x       0]
[      1       1       2   x + 1       0]
sage: random_matrix(GF(9,'x'), 5)
[      x   x + 1       0       0       0]
[      1   x + 1       0 2*x + 1       0]
[      2       0       x   x + 2       0]
[      2       0   x + 1   x + 2       0]
[      x   x + 1     2*x       1       0]

So, the last column always is zero, which shouldn't be the case.

Change History (35)

comment:1 Changed 3 years ago by SimonKing

  • Branch set to u/SimonKing/fix_random_matrix_gfpn_dense

comment:2 follow-up: Changed 3 years ago by SimonKing

  • Authors set to Simon King
  • Commit set to 3297b211937dd687bd8b0fbf267ac7b52e171022
  • Status changed from new to needs_review

The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.

The problem was: In the old code, the LOWEST bits of the last byte have been filled.

I fixed it by explicitly assigning random numbers to the last few entries (instead of trying to fill a whole block of entries).


New commits:

3297b21Properly fill the last few columns of random meataxe matrices

comment:3 Changed 3 years ago by SimonKing

  • Dependencies set to #21437
  • Status changed from needs_review to needs_work

Sooner or later this ticket needs to be merged with #21437. There is a merge conflict, and git mergetool doesn't work reasonably (namely: It shows me numerous conflicts, although the diff from this ticket is very small).

Therefore I will *rebase* this ticket on top of #21437 and force-push it.

comment:4 Changed 3 years ago by git

  • Commit changed from 3297b211937dd687bd8b0fbf267ac7b52e171022 to 6e332268227e5f8abedce6259bb32f8765108df0

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

291f4dcSome minor tweaks for performance
03ce5c0Fix bad INPUT::/OUTPUT:: blocks
5ed2269Fix one doctest
d4dc794Merge branch 'develop' into t/21437/fix-mtx-matrices
413e4a9Remove deprecated Cython syntax, use check_malloc
d943f93Remove rawMatrix function
c929fccRemove unused MTX* environment variables
b70bc5fChange some str into bytes
04e7986Improve use of sig_on/off/check
6e33226Properly fill the last few columns of random meataxe matrices

comment:5 Changed 3 years ago by SimonKing

  • Status changed from needs_work to needs_review

Don't be scared. All but the last commit belong to #21437. Yes, I still believe that git commits *belong* to a ticket...

Anyway, I think it is logical to put a ticket that fixes random meataxe matrices on top of a ticket that fixes other aspects of meataxe matrices.

Last edited 3 years ago by SimonKing (previous) (diff)

comment:6 follow-up: Changed 3 years ago by SimonKing

  • Description modified (diff)
  • Summary changed from Fix random matrix_gfpn_dense to Some bug fixes for matrix_gfpn_dense

Jeroen suggested on #21437 to distribute different aspects of the work in a new way. I think this ticket fits best for the topic "bug fixes". So, I'll rename the ticket a couple of commits. The tests should pass with the branch that I'll force-push in a minute.

comment:7 Changed 3 years ago by git

  • Commit changed from 6e332268227e5f8abedce6259bb32f8765108df0 to 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b1aef56Add MeatAxe patch that ensures correct value of FfCurrentRowSizeIo
a44a58aFix reading mtx-matrix from non-existent file
91ac93fMake pickling machine independent
225afe1Properly fill the last few columns of random meataxe matrices

comment:8 Changed 3 years ago by SimonKing

Needs review...

comment:9 Changed 3 years ago by git

  • Commit changed from 225afe1afa8dfa4ab700d3d91d3dca956e2b6ce0 to bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9

Branch pushed to git repo; I updated commit sha1. New commits:

bd1f3f1Fix ticket number in deprecation warning

comment:10 Changed 3 years ago by SimonKing

  • Dependencies #21437 deleted

comment:11 in reply to: ↑ 6 ; follow-up: Changed 3 years ago by jdemeyer

Replying to SimonKing:

Jeroen suggested on #21437 to distribute different aspects of the work in a new way.

That's not quite what I said. I proposed to split up #21437. Adding more stuff to existing tickets (like this one) is actually the opposite of what I intented.

comment:12 in reply to: ↑ 11 Changed 3 years ago by SimonKing

Replying to jdemeyer:

Replying to SimonKing:

Jeroen suggested on #21437 to distribute different aspects of the work in a new way.

That's not quite what I said. I proposed to split up #21437. Adding more stuff to existing tickets (like this one) is actually the opposite of what I intented.

Do you prefer that I rebase again and split this ticket into three: One for the new MeatAxe patch (that avoids frequent calls to a C function and thus gives a speedup), one for the pickling and one for random matrices?

comment:13 follow-up: Changed 3 years ago by jdemeyer

Yes, that would be a good idea.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 3 years ago by SimonKing

Replying to jdemeyer:

Yes, that would be a good idea.

The first one is #23410.

The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?

comment:15 Changed 3 years ago by SimonKing

  • Description modified (diff)
  • Status changed from needs_review to needs_work
  • Summary changed from Some bug fixes for matrix_gfpn_dense to Fix random matrix_gfpn_dense

comment:16 in reply to: ↑ 14 Changed 3 years ago by SimonKing

Replying to SimonKing:

The patches to be distributed are, as much as I see, independent. Would it be better to let the new tickets build upon each other (i.e., the git history of one ticket is included in the git history of the other tickets), or should they stay separate (i.e., each ticket provides a patch, and then they are merged only when needed)?

The ticket that is about sanitizing sig_on/off/check and about proper Cython code style has to depend on the other tickets. But I suggest to keep the other tickets separate as much as possible, and will soon push branches accordingly.

comment:17 Changed 3 years ago by git

  • Commit changed from bd1f3f13c7a5a5a2af51dd2b19f976515b0f3fd9 to bbd6a61761e4c53bd90cef5d9685648dc9012879

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

bbd6a61Properly fill last column of random matrix_gfpn_dense

comment:18 Changed 3 years ago by SimonKing

  • Status changed from needs_work to needs_review

comment:19 Changed 3 years ago by SimonKing

In the splitup process, I have already updated #23410, #23411 and #23352, making them small and independent.

About the other parts, I plan this:

  • We have #21437 for proper initialisation. I plan to try to make this stand-alone as well, but only if it cleanly merges with the afore-mentioned tickets. If it doesn't merge, I'll put it on top of the other tickets.
  • We have #23400 for code cleanup. The cleanup will also concern parts of code that are touched by the other tickets. So, it will use them as dependency.
  • We have #23399 for some additions (new methods). I very much doubt that it can stay separate, but I'll try. Very likely #23399 will be on top of all other tickets.

Jeroen, do you support my plan?

comment:20 in reply to: ↑ 2 ; follow-up: Changed 3 years ago by jdemeyer

Replying to SimonKing:

The matrix entries are densely packed in meataxe. Hence, in some situations, the last byte of a row in memory is only partly filled with data. And in fact the HIGHEST bits of the last byte have to be filled.

The problem was: In the old code, the LOWEST bits of the last byte have been filled.

If I'm reading the documentation at http://www.math.rwth-aachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:

Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with qm≤256. Then, m elements k0,...kn-1 are packed into one byte as k0+k1q+k2q2+...

comment:21 in reply to: ↑ 20 Changed 3 years ago by SimonKing

Replying to jdemeyer:

If I'm reading the documentation at http://www.math.rwth-aachen.de/~MTX/doc24/group__ff.html correctly, it should be the LOWEST bytes:

Packing of field elements is used for field orders less than 17. Let q be the field order and m the largest natural number with qm≤256. Then, m elements k0,...kn-1 are packed into one byte as k0+k1q+k2q2+...

Yes, that's what I had in mind, too, and is why I wrote the old code in the way I did. Anyway, I am now using meataxe's FfInsert, which should do the right thing.

Let's test: If I would set the last byte inconsistent with meataxe's conventions, then multiplying a matrix with the transposed matrix (these are all using meataxe functions) would give a wrong result.

sage: MS = MatrixSpace(GF(9,'x'),1,5)
sage: def sanity_check(M):
....:     assert (M*M.transpose())[0,0] == add(e*e for e in M.list())
....:     
sage: while(1):
....:     sanity_check(MS.random_element())
....:     

works without error. Note that in the setting above, the new code is called. Thus, I believe my code is correct and indeed the meataxe documentation provides a wrong statement.

Last edited 3 years ago by SimonKing (previous) (diff)

comment:22 Changed 3 years ago by SimonKing

While fixing the randomize method, I'd like to speed it up, by cdefining the randint function.

comment:23 Changed 3 years ago by git

  • Commit changed from bbd6a61761e4c53bd90cef5d9685648dc9012879 to 71abd443f2690b8fa20dfc3d177dec03155a7264

Branch pushed to git repo; I updated commit sha1. New commits:

71abd44Speedup meataxe matrix randomisation

comment:24 Changed 3 years ago by SimonKing

Before the latest commit I got

sage: MS = MatrixSpace(GF(9,'x'),5,17)
sage: %timeit MS.random_element()
The slowest run took 96.98 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.37 µs per loop

With the latest commit, I get

sage: MS = MatrixSpace(GF(9,'x'),5,17)
sage: %timeit MS.random_element()
The slowest run took 142.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.4 µs per loop

I think 35% gain is worth it. Also, since I'm touching the code anyway, I change to using range() in Cython.

comment:25 Changed 3 years ago by SimonKing

PS: Since I am changing the code anyway, I guess I could also change the use of sig_on/off in that part of the code...

comment:26 Changed 3 years ago by git

  • Commit changed from 71abd443f2690b8fa20dfc3d177dec03155a7264 to e68140e57f5eb45d7482bd50533ce2d58f99fad6

Branch pushed to git repo; I updated commit sha1. New commits:

e68140eReplace sig_on/off by sig_check in meataxe matrix randomisation

comment:27 follow-up: Changed 3 years ago by SimonKing

After replacing sig_on/off by sig_check, I get

sage: MS = MatrixSpace(GF(9,'x'),5,17)
sage: %timeit MS.random_element()
The slowest run took 146.35 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.05 µs per loop

That's further 10% on top of the previous 35%.

That's slightly surprising to me. Is sig_on/off really resulting in a massive slow-down? Then why is it used? After all, another disadvantage is that it doesn't play well with Python code.

comment:28 in reply to: ↑ 27 Changed 3 years ago by jdemeyer

Replying to SimonKing:

Is sig_on/off really resulting in a massive slow-down?

It shouldn't result in a massive slowdown. However, if you are using it in a very tight loop, the time taken by sig_on() is noticable.

comment:29 Changed 3 years ago by jdemeyer

I should also mention that sig_on() is essentially just a call to sigsetjmp(). This is a function implemented by libc from the OS. So the time taken by sig_on() could depend significantly on the system.

comment:30 Changed 3 years ago by SimonKing

Can the review please be finished?

comment:31 follow-up: Changed 3 years ago by chapoton

  • Status changed from needs_review to needs_work

The branch does not apply on latest beta.

comment:32 Changed 3 years ago by git

  • Commit changed from e68140e57f5eb45d7482bd50533ce2d58f99fad6 to 39431d7e98a19de1c3b4f0af97440922eed10f44

Branch pushed to git repo; I updated commit sha1. New commits:

39431d7Merge branch 'develop' into t/23352/fix_random_matrix_gfpn_dense

comment:33 in reply to: ↑ 31 Changed 3 years ago by SimonKing

  • Status changed from needs_work to needs_review

Replying to chapoton:

The branch does not apply on latest beta.

Thanks. Fixed now...

comment:34 Changed 3 years ago by tscrim

  • Keywords days88 IMA coding sprints added
  • Milestone changed from sage-8.0 to sage-8.1
  • Reviewers set to Jeroen Demeyer, Travis Scrimshaw
  • Status changed from needs_review to positive_review

Let it be so.

comment:35 Changed 3 years ago by vbraun

  • Branch changed from u/SimonKing/fix_random_matrix_gfpn_dense to 39431d7e98a19de1c3b4f0af97440922eed10f44
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.