Opened 4 years ago

Last modified 4 years ago

#25946 closed defect

py3: fixes for sage.schemes.hyperelliptic_curves — at Version 20

Reported by: embray Owned by:
Priority: major Milestone: sage-8.4
Component: python3 Keywords: sagedays@icerm
Cc: bhutz, cremona, roed Merged in:
Authors: Erik Bray Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: public/25946 (Commits, GitHub, GitLab) Commit: 78ae9220be1f5279f1766ec78f597bb4634b69e2
Dependencies: Stopgaps:

Status badges

Description (last modified by chapoton)

The main thing needed here was a __hash__. Or removing __eq__ and __ne__

Part of #24551

Change History (20)

comment:1 Changed 4 years ago by embray

  • Status changed from new to needs_review

comment:2 Changed 4 years ago by tscrim

  • Keywords sagedays@icerm added
  • Reviewers set to Travis Scrimshaw
  • Status changed from needs_review to needs_work

You have too much data in your __hash__:

sage: P.<x> = QQ[]
sage: f0 = 4*x^5 - 30*x^3 + 45*x - 22
sage: C0 = HyperellipticCurve(f0)
sage: P.<x> = RR[]
sage: f1 = 4*x^5 - 30*x^3 + 45*x - 22
sage: C1 = HyperellipticCurve(f1)
sage: C1 == C0
True
sage: hash(C1) == hash(C0)
False

So you definitely have to remove _PP from the hash, and I would probably remove both the __class__ as subclasses are not used in the __eq__ comparison.

comment:3 Changed 4 years ago by embray

I'm not 100% sure about that. I'd need to double-check what the intention is here. On Python 2 it was inheriting the flimsy default __hash__ from CategoryObject that just hashes its __repr__. So for this case I was including essentially the same data in the hash that would be needed to differentiate the reprs of two of these objects.

But maybe that's not necessary. It just wasn't clear what the intent was (if any) due to the default hash...

comment:4 Changed 4 years ago by embray

  • Status changed from needs_work to needs_info

comment:5 Changed 4 years ago by tscrim

I completely agree the repr hash is bad. I think the data used for equality should be what is used for the __hash__, not more. From what you're saying, it seems like we are actually fixing a bug if we only hash self._hyperelliptic_polynomials.

My understanding behind the default hash is that every object has a repr and generally two objects that compare equal have equal repr output.

I don't use/know this code, so we probably have to ask someone who understands it better to confirm if you're worried enough.

comment:6 Changed 4 years ago by embray

I could certainly try a simpler repr to start with and see what happens. If that works, then I agree with you we should just use what __eq__ compares.

comment:7 Changed 4 years ago by tscrim

Sounds good. Thanks.

comment:8 Changed 4 years ago by chapoton

any progress here ?

comment:9 Changed 4 years ago by embray

I mean, clearly not. I'd prefer if someone who knew this code were to chime in, but otherwise I'll get to it if/when I feel like taking the time to understand this code better.

comment:10 Changed 4 years ago by embray

To be clear, I also agree with everything Travis wrote on this ticket; I'm just not in a rush to act on it because I don't feel qualified to make that judgment call w.r.t. this code.

comment:11 Changed 4 years ago by tscrim

  • Cc bhutz added

Ben, do you know about this code or know who we should cc?

comment:12 Changed 4 years ago by bhutz

  • Cc cremona added

I'm not familiar with this code. John may know.

comment:13 Changed 4 years ago by cremona

I don't know or use the code (except perhaps occasionally) and so I do not really understand the issue here. I don't think that I have ever written such a has function so do not know what properties it is supposed to have.

What exactly was the problem which this patch is intended to fix?

comment:14 follow-up: Changed 4 years ago by alexjbest

I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named x even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other? For contrast

sage: P.<x,y>= RR[]
sage: C0 = Curve(y - x^2  - 1)
sage: C0
Affine Plane Curve over Real Field with 53 bits of precision defined by -x^2 + y - 1.00000000000000
sage: P.<x,y>= QQ[]
sage: C1 = Curve(y - x^2  - 1)
sage: C1
Affine Plane Curve over Rational Field defined by -x^2 + y - 1
sage: C0 == C1
False

comment:15 in reply to: ↑ 14 Changed 4 years ago by embray

Replying to alexjbest:

I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named x even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other?

I had the same doubt before. There's some inconsistency throughout Sage about when some objects defined over different fields are considered equal under ==. In some cases it's quite deliberate, in other cases it's not clear until and unless the original author chimes in. If the author isn't even sure I would lean against calling them ==, much less having the same hash.

comment:16 Changed 4 years ago by cremona

I agree with Alex: to be equal the fields of definition should equal and the defining polynomials identical. This is the case with elliptic curves:

sage: E = EllipticCurve([0,1])
sage: E
Elliptic Curve defined by y^2 = x^3 + 1 over Rational Field
sage: K.<i> = NumberField(x^2+1)
sage: EK = E.change_ring(K)
sage: EK
Elliptic Curve defined by y^2 = x^3 + 1 over Number Field in i with defining polynomial x^2 + 1
sage: E == EK
False
sage: hash(E)
8741953973726
sage: hash(EK)
8741951786400

I don't know what this hash function actually does: it calls hash_by_id, whatever that is. I also don't know how to find out what function is being called by E==EK either!

comment:17 Changed 4 years ago by tscrim

EllipticCurve is a UniqueFactory, so the input (i.e., in part, the field) uniquely identifies the curve. So the hash and == is obtained by using id as there is a unique such object in memory at a given time. Changing the hyperelliptic to suppose the UniqueRepresentation-type behavior would be a very invasive and complicated surgery I think.

So the short-term fix IMO would be deciding an appropriate notion of __eq__, and then changing the __hash__ to match. From what John is saying, we should also introduce a check that the defining fields are equal.

Also, from git blame, "Nick Alexander" and "tornaria" comes up a lot and David Kohel is listed in the copyright.

comment:18 Changed 4 years ago by chapoton

  • Branch changed from u/embray/python3/sage-schemes-hyperelliptic_curves/misc to public/25946
  • Commit changed from cb04ceb814b625383bce962fed09e39e7fa04184 to 78ae9220be1f5279f1766ec78f597bb4634b69e2
  • Status changed from needs_info to needs_review

I propose a solution : simply remove __eq__, __ne__ and __hash__ here.

Then they are all provided by the general setting of projective subschemes, which seems to be a good idea. This will compare ambient spaces and defining ideals, which amount to compare base field and polynomials.


New commits:

2726198Merge branch 'u/embray/python3/sage-schemes-hyperelliptic_curves/misc' of ssh://trac.sagemath.org:22/sage into 8.4.b0
78ae922py3: comparison and hash cleanup for hyperellliptic curves

comment:19 Changed 4 years ago by cremona

That sounds very sensible to me -- I do not know why the person who implemented these special functions thought it was necessary to override those of the parent class.

As far as I am concerned this is good to merge assuming that tests all pass.

comment:20 Changed 4 years ago by chapoton

  • Description modified (diff)
Note: See TracTickets for help on using tickets.