Ticket #3724 (closed enhancement: fixed)

Opened 20 months ago

Last modified 19 months ago

faster hashs for Matrix_mod2_dense

Reported by: malb Owned by: malb
Priority: major Milestone: sage-3.1.2
Component: linear algebra Keywords: m4ri, hash, matrix
Cc: SimonKing Author(s):
Report Upstream: Reviewer(s):
Merged in: Work issues:

Description

Simon King requested faster hashing for matrices over GF(2). This patch implements it, but depends on #3324 and an updated M4RI.

Attachments

m4ri_hash.patch Download (2.1 KB) - added by malb 20 months ago.
implements faster hashing
m4ri_hash2.patch Download (1.0 KB) - added by malb 19 months ago.
address review

Change History

Changed 20 months ago by malb

implements faster hashing

  Changed 20 months ago by malb

  • summary changed from [with patch, depends on other ticket] faster hashs for Matrix_mod2_dense to [with patch, needs review] faster hashs for Matrix_mod2_dense

it seems, the patch is independent of the PNG fix.

  Changed 19 months ago by malb

Simon, can you review my patch?

follow-up: ↓ 6   Changed 19 months ago by SimonKing

  • summary changed from [with patch, needs review] faster hashs for Matrix_mod2_dense to [with patch, with positive review] faster hashs for Matrix_mod2_dense

The patch applies to sage-3.1.1

Consider a little test:

sage: M=MatrixSpace(GF(2),10000,10000).random_element()
sage: M.set_immutable()
sage: time M.__hash__()

Without the patch, i had to interrupt sage after few minutes since it ate pretty much of my computer's memory.

With the patch, we get

sage: time M.__hash__()
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s
-2570088162955604276

Well done, Martin! I give a positive review.

The patch contains a doc-test. One problem for me: Typing M.__hash__?, i don't see them. What is the user supposed to do in order to see the doc-string of the respective hash method?

I know that the following should be another ticket. But here are two items on my wish list:

  1. Can similar things be done with the hash of matrices over GF(p), p>2?
  2. Please also improve pickling of matrices! Example:
    sage: M=MatrixSpace(GF(2),1000,1000).random_element()
    sage: timeit('x=loads(dumps(M))')
    5 loops, best of 3: 2.33 s per loop
    
    which is somehow slowish.

  Changed 19 months ago by mabshoff

  • summary changed from [with patch, with positive review] faster hashs for Matrix_mod2_dense to [with patch, positive review] faster hashs for Matrix_mod2_dense

Simon,

please open a thread on sage-devel. I assume the discussion will conclude that speed up is warranted and then we can open tickets for the new issues.

Cheers,

Michael

  Changed 19 months ago by mabshoff

  • status changed from new to closed
  • resolution set to fixed

Merged in Sage 3.1.2.alpha1

in reply to: ↑ 3   Changed 19 months ago by malb

Replying to SimonKing:

The patch contains a doc-test. One problem for me: Typing M.__hash__?, i don't see them. What is the user supposed to do in order to see the doc-string of the respective hash method?

That could be a bug for introspection (or however that thingy is called). Could you open a ticket?

1. Can similar things be done with the hash of matrices over GF(p), p>2?

Yes it can, please open a Trac ticket and I'll do it as soon as I find some time.

2. Please also improve pickling of matrices! Example: {{{ sage: M=MatrixSpace?(GF(2),1000,1000).random_element() sage: timeit('x=loads(dumps(M))') 5 loops, best of 3: 2.33 s per loop }}} which is somehow slowish.

That is #3324 which is blocked by a problem on OSX 10.4 and libpng.

  Changed 19 months ago by SimonKing

  • status changed from closed to reopened
  • resolution fixed deleted
  • summary changed from [with patch, positive review] faster hashs for Matrix_mod2_dense to [with patch, positive and negative review] faster hashs for Matrix_mod2_dense

Sorry, but i think i should re-open the ticket:

sage: M = MatrixSpace(GF(2),10000,10000).random_element()
sage: M.is_mutable()
True
sage: M.__hash__()
354973654957594997

A mutable object is not allowed to have a hash, AFAIK. On the other hand, the hash-value seems not to be cached:

sage: M[0,0]
1
sage: M[0,0]=0
sage: M.__hash__()
-8868398381897180811

So, the hash value has changed (i.e., was re-computed) by changing the matrix...

sage: N=copy(M)
sage: N.__hash__()
-8868398381897180811

... and has not changed by copying the matrix.

By consequence, it may be that everything is alright.

However, I re-open the ticket, because I think this should be addressed -- either by a new patch raising an exception when M is mutable, or by the assertion that "mutable objects have no hash" is a rule but no law, and that it is fine since the hash is not cached.

Cheers

Simon

Changed 19 months ago by malb

address review

  Changed 19 months ago by malb

You're right, good catch. The attached patch addresses that issue.

  Changed 19 months ago by mabshoff

  • summary changed from [with patch, positive and negative review] faster hashs for Matrix_mod2_dense to [with patch, positive] faster hashs for Matrix_mod2_dense

m4ri_hash2.patch looks good to me. Positive review.

  Changed 19 months ago by mabshoff

  • status changed from reopened to closed
  • resolution set to fixed

Merged m4ri_hash2.patch in Sage 3.1.2.alpha1

  Changed 19 months ago by malb

  • status changed from closed to reopened
  • resolution fixed deleted
  • summary changed from [with patch, positive] faster hashs for Matrix_mod2_dense to faster hashs for Matrix_mod2_dense

The new hash-ing method does not obey Sage's hashing rules. See Robert's comment at #3956. Obeying this rule however would come with a considerable speed penalty compared to m4ri_hash.patch.

  Changed 19 months ago by mabshoff

Martin,

can we move the latest issue to a new ticket? As is this ticket is getting rather messy, i.e. in HISTORY.txt as well as here.

Cheers,

Michael

  Changed 19 months ago by malb

  • status changed from reopened to closed
  • resolution set to fixed

your wish is my command.

Note: See TracTickets for help on using tickets.