Opened 5 years ago
Closed 3 years ago
#21316 closed defect (duplicate)
fix hash of (dense) matrices
Reported by:  dkrenn  Owned by:  

Priority:  major  Milestone:  sageduplicate/invalid/wontfix 
Component:  misc  Keywords:  
Cc:  Merged in:  
Authors:  Reviewers:  
Report Upstream:  N/A  Work issues:  adapt doctests 
Branch:  u/dkrenn/hashmatrix (Commits, GitHub, GitLab)  Commit:  1e1263f56137d00f56eb828a7cf592b4f9649d2f 
Dependencies:  Stopgaps: 
Description
It took me only 4 guesses to find two matrices with the same hash:
sage: N = Matrix([[1,1],[2,2]]); N.set_immutable(); hash(N) 3 sage: N = Matrix([[0,0],[0,1]]); N.set_immutable(); hash(N) 3
This should not be that easy.
In particular (looking at the code in sage.matrix.matric_dense
), the hash value does not depend on entry 1,1 of the matrix (multiplication by zero in the hash code):
sage: N = Matrix([[3,2],[0,0]]); N.set_immutable(); hash(N) 2 sage: N = Matrix([[443,2],[0,0]]); N.set_immutable(); hash(N) 2 sage: N = Matrix([[2,2],[0,0]]); N.set_immutable(); hash(N) 2
Change History (7)
comment:1 Changed 5 years ago by
comment:2 Changed 4 years ago by
Noncryptographic hashes don't have to be resilient to collision attacks, but should indeed give good hash value spread for the input space  which this implementation doesn't.
I can confirm that this implementation causes massive performance degradation when matrices are used as dictionary keys, for instance.
See tuple hash algorithm and explanations why XORing hash values doesn't perform very well.
While the current implementation does try to 'save' itself from the disadvantages of XORing hash values by multiplying by the index, it still performs very poorly.
I tried the solution dkrenn suggested and it does indeed work much better. If we don't care about the performance cost of creating a tuple (and a list), seeing as the hash value is cached, it's a nice solution.
comment:3 Changed 4 years ago by
 Milestone changed from sage7.4 to sage8.1
comment:4 Changed 4 years ago by
 Branch set to u/dkrenn/hashmatrix
comment:5 Changed 4 years ago by
 Commit set to 1e1263f56137d00f56eb828a7cf592b4f9649d2f
 Work issues set to adapt doctests
New commits:
1e1263f  new hash function for matrix_dense

comment:6 Changed 3 years ago by
Looks like major work was put into this issue in #19050, rendering this ticket a duplicate.
comment:7 Changed 3 years ago by
 Milestone changed from sage8.1 to sageduplicate/invalid/wontfix
 Resolution set to duplicate
 Status changed from new to closed
Duplicate of #10950.
From
sage.matrix.matrix_dense
:What is a good caching strategy? Simply
h = hash(tuple(self._list()))
?