Opened 3 years ago

Closed 3 years ago

#27344 closed enhancement (fixed)

py3: fix some doctests about hash for real and complex double

Reported by: chapoton Owned by:
Priority: major Milestone: sage-8.7
Component: python3 Keywords:
Cc: Merged in:
Authors: Frédéric Chapoton Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: b7fd196 (Commits, GitHub, GitLab) Commit: b7fd19663e0d51569a020059044291394717b40c
Dependencies: Stopgaps:

Status badges

Description


Change History (16)

comment:1 Changed 3 years ago by chapoton

  • Branch set to u/chapoton/27344
  • Commit set to f625fd11e80ec57ba521e4b04631672d52abbbf4
  • Status changed from new to needs_review

New commits:

f625fd1py3: fix some doctests about hash for complex and real double

comment:2 Changed 3 years ago by jdemeyer

  • Reviewers set to Jeroen Demeyer
  • Status changed from needs_review to positive_review

comment:3 Changed 3 years ago by chapoton

Thanks a lot, Jeroen. By the way, would you please give your opinion on #27065 ?

comment:4 Changed 3 years ago by embray

  • Status changed from positive_review to needs_work

I don't understand these tests. They seem kind of pointless? If we must have doctests for __hash__ methods can they at least prove that they maintain whatever invariant the hash is meant to maintain (which in this case is trivial, but still more meaningful than testing that hash(X) doesn't crash).

comment:5 Changed 3 years ago by chapoton

Ahem.. <grumbl>Time lost will never come back.</grumbl>

So, Erik, please tell us exactly what you would like to see as doctests here. Or better, just commit something that suits you and pass on both py2 and py3.

comment:6 Changed 3 years ago by embray

I'm sorry, I'd be frustrated too, but in the same vein I don't think we should waste time writing pointless doctests that prove nothing.

I personally have no idea what the intent is here for these classes implementing __hash__ or if there's even any reason for them to do so at all. Perhaps we should find that out.

comment:7 Changed 3 years ago by embray

Just looking at the code I have no reason--it's not documented anywhere--why RDF has to be hashable at all. I'm sure there's a reason, but it would be nice if that were actually explained somewhere. Furthermore I have no idea, even if the class is intended to be used as a singleton, why its hash it set to a (seemingly) arbitrary integer, nor why the old test expected that value to be equivalent to hash(str(RDF)) modulo 2^32. I'd want to know that before just kind of sweeping things under the rug.

comment:8 follow-up: Changed 3 years ago by embray

The commit where the original test for RDF was written to its previous form (I'll look at CDF separately) just says "Clean up the fallout from the RDF/CDF hashing patch" which doesn't really answer anything.

In particular I don't get the "mod 2^32" part, although it does appear to hold accurate on Python 2. Though it seems part of the point is that hash(RDF) and hash(str(RDF)) are not the same:

sage: {RDF: 1, str(RDF): 2}
{Real Double Field: 1, 'Real Double Field': 2}

ISTM part of the point of giving it a fixed value was at some point primarily for some performance purpose, but it's not clear what that purpose was exactly or what benchmark this was targeting.

comment:9 follow-up: Changed 3 years ago by jdemeyer

It's clear to me that the only thing that matters is that hash(RDF) works and that it gives a constant (within one Python session at least) value. So there is nothing wrong with return 561162115 as implementation.

comment:10 in reply to: ↑ 8 Changed 3 years ago by embray

Replying to embray:

In particular I don't get the "mod 2^32" part

I think the reason for this, at least, was that on 32-bit Python the str hashing function overflows, whereas on 64-bit it isn't as likely to overflow for a short string, but either way between 32-bit and 64-bit the hash values wind up the same modulo 2!32.

That still raises the question of why there was a desire for hash(RDF) == hash(str(RDF)) especially since that does not hold on 64-bit.

comment:11 in reply to: ↑ 9 Changed 3 years ago by embray

Replying to jdemeyer:

It's clear to me that the only thing that matters is that hash(RDF) works and that it gives a constant (within one Python session at least) value. So there is nothing wrong with return 561162115 as implementation.

I think then it suffices to demonstrate that hash(RDF) == hash(RealDoubleField_class()) and to mention that this class is intended for use as a singleton so any instance of it should be equivalent from a hashing perspective.

comment:12 Changed 3 years ago by git

  • Commit changed from f625fd11e80ec57ba521e4b04631672d52abbbf4 to 9cc15134a4c35af3b5c61d1fb3e7cfff9f70442d

Branch pushed to git repo; I updated commit sha1. New commits:

9cc1513better hash tests

comment:13 Changed 3 years ago by git

  • Commit changed from 9cc15134a4c35af3b5c61d1fb3e7cfff9f70442d to b7fd19663e0d51569a020059044291394717b40c

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

b7fd196py3: fix some doctests about hash for complex and real double

comment:14 Changed 3 years ago by chapoton

  • Status changed from needs_work to needs_review

like that ?

comment:15 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:16 Changed 3 years ago by vbraun

  • Branch changed from u/chapoton/27344 to b7fd19663e0d51569a020059044291394717b40c
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.