Opened 4 years ago
Closed 4 years ago
#24501 closed defect (fixed)
Hash of Integer broken on Python 3
Reported by:  embray  Owned by:  

Priority:  major  Milestone:  sage8.2 
Component:  python3  Keywords:  
Cc:  Merged in:  
Authors:  Jeroen Demeyer  Reviewers:  Erik Bray 
Report Upstream:  N/A  Work issues:  
Branch:  248cdd8 (Commits, GitHub, GitLab)  Commit:  248cdd8d0c021056c33c62814cbd04e382ec8f3e 
Dependencies:  Stopgaps: 
Description
Currently Integer.__hash__
is implemented with mpz_pythonhash
, which is designed to return the same hash value for builtin int
and long
s.
In Python 3 something changed subtly about the hash algorithm such that this no longer holds:
sage: hash(int(2^61)) 1 sage: hash(2^61) 2305843009213693952
I haven't looked into what the difference is yet. But I suspect this is the source of a number of errors I've seen.
Change History (11)
comment:1 Changed 4 years ago by
comment:2 Changed 4 years ago by
I seeessentially instead of reduction modulo ULONG_MAX
it's modulo a prime _PyHASH_MODULUS
.
comment:4 Changed 4 years ago by
 Branch set to u/jdemeyer/hash_of_integer_broken_on_python_3
comment:5 Changed 4 years ago by
 Commit set to 19efd2db979069166ced3cf57c497d2fdd07858f
 Status changed from new to needs_review
New commits:
19efd2d  Fix hashing of Integer on Python 3

comment:6 followup: ↓ 7 Changed 4 years ago by
Ah, you beat me to it. I was doing roughly the same thing but with separate Python 2 and 3 branches, so your version is more "generic" I suppose.
I wonder if we an find some way to get away with not having to calculate modulus
and limb_bits
every time since they are constant, but it's just some bit shifts and maybe the compiler can optimize that out anyways.
comment:7 in reply to: ↑ 6 Changed 4 years ago by
Replying to embray:
I wonder if we an find some way to get away with not having to calculate
modulus
andlimb_bits
every time since they are constant, but it's just some bit shifts and maybe the compiler can optimize that out anyways.
Constant folding is so trivial that it is safe to assume that the compiler knows what to do.
comment:8 Changed 4 years ago by
It seems like you're defining limb_bits
twiceonce using the trick in cdef extern
and then again in the function body.
comment:9 Changed 4 years ago by
 Commit changed from 19efd2db979069166ced3cf57c497d2fdd07858f to 248cdd8d0c021056c33c62814cbd04e382ec8f3e
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
248cdd8  Fix hashing of Integer on Python 3

comment:10 Changed 4 years ago by
 Reviewers set to Erik Bray
 Status changed from needs_review to positive_review
Thanks for the quick work on thislooks good to me!
comment:11 Changed 4 years ago by
 Branch changed from u/jdemeyer/hash_of_integer_broken_on_python_3 to 248cdd8d0c021056c33c62814cbd04e382ec8f3e
 Resolution set to fixed
 Status changed from positive_review to closed
Relevant: https://bugs.python.org/issue8188