Opened 17 months ago
Closed 16 months ago
#25946 closed defect (fixed)
py3: fixes for sage.schemes.hyperelliptic_curves
Reported by:  embray  Owned by:  

Priority:  major  Milestone:  sage8.4 
Component:  python3  Keywords:  sagedays@icerm 
Cc:  bhutz, cremona, roed  Merged in:  
Authors:  Erik Bray  Reviewers:  Travis Scrimshaw, David Roe 
Report Upstream:  N/A  Work issues:  
Branch:  f8687f7 (Commits)  Commit:  f8687f7d8800aede97e36d4da30ae79da895b5f7 
Dependencies:  Stopgaps: 
Description (last modified by )
The main thing needed here was a __hash__
.
Or removing __eq__
and __ne__
Part of #24551
Change History (51)
comment:1 Changed 17 months ago by
 Status changed from new to needs_review
comment:2 Changed 17 months ago by
 Keywords sagedays@icerm added
 Reviewers set to Travis Scrimshaw
 Status changed from needs_review to needs_work
comment:3 Changed 17 months ago by
I'm not 100% sure about that. I'd need to doublecheck what the intention is here. On Python 2 it was inheriting the flimsy default __hash__
from CategoryObject
that just hashes its __repr__
. So for this case I was including essentially the same data in the hash that would be needed to differentiate the reprs of two of these objects.
But maybe that's not necessary. It just wasn't clear what the intent was (if any) due to the default hash...
comment:4 Changed 17 months ago by
 Status changed from needs_work to needs_info
comment:5 Changed 17 months ago by
I completely agree the repr
hash is bad. I think the data used for equality should be what is used for the __hash__
, not more. From what you're saying, it seems like we are actually fixing a bug if we only hash self._hyperelliptic_polynomials
.
My understanding behind the default hash is that every object has a repr
and generally two objects that compare equal have equal repr
output.
I don't use/know this code, so we probably have to ask someone who understands it better to confirm if you're worried enough.
comment:6 Changed 17 months ago by
I could certainly try a simpler repr to start with and see what happens. If that works, then I agree with you we should just use what __eq__
compares.
comment:7 Changed 17 months ago by
Sounds good. Thanks.
comment:8 Changed 16 months ago by
any progress here ?
comment:9 Changed 16 months ago by
I mean, clearly not. I'd prefer if someone who knew this code were to chime in, but otherwise I'll get to it if/when I feel like taking the time to understand this code better.
comment:10 Changed 16 months ago by
To be clear, I also agree with everything Travis wrote on this ticket; I'm just not in a rush to act on it because I don't feel qualified to make that judgment call w.r.t. this code.
comment:11 Changed 16 months ago by
 Cc bhutz added
Ben, do you know about this code or know who we should cc?
comment:12 Changed 16 months ago by
 Cc cremona added
I'm not familiar with this code. John may know.
comment:13 Changed 16 months ago by
I don't know or use the code (except perhaps occasionally) and so I do not really understand the issue here. I don't think that I have ever written such a has function so do not know what properties it is supposed to have.
What exactly was the problem which this patch is intended to fix?
comment:14 followup: ↓ 15 Changed 16 months ago by
I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named x
even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other? For contrast
sage: P.<x,y>= RR[] sage: C0 = Curve(y  x^2  1) sage: C0 Affine Plane Curve over Real Field with 53 bits of precision defined by x^2 + y  1.00000000000000 sage: P.<x,y>= QQ[] sage: C1 = Curve(y  x^2  1) sage: C1 Affine Plane Curve over Rational Field defined by x^2 + y  1 sage: C0 == C1 False
comment:15 in reply to: ↑ 14 Changed 16 months ago by
Replying to alexjbest:
I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named
x
even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other?
I had the same doubt before. There's some inconsistency throughout Sage about when some objects defined over different fields are considered equal under ==
. In some cases it's quite deliberate, in other cases it's not clear until and unless the original author chimes in. If the author isn't even sure I would lean against calling them ==
, much less having the same hash.
comment:16 Changed 16 months ago by
I agree with Alex: to be equal the fields of definition should equal and the defining polynomials identical. This is the case with elliptic curves:
sage: E = EllipticCurve([0,1]) sage: E Elliptic Curve defined by y^2 = x^3 + 1 over Rational Field sage: K.<i> = NumberField(x^2+1) sage: EK = E.change_ring(K) sage: EK Elliptic Curve defined by y^2 = x^3 + 1 over Number Field in i with defining polynomial x^2 + 1 sage: E == EK False sage: hash(E) 8741953973726 sage: hash(EK) 8741951786400
I don't know what this hash function actually does: it calls hash_by_id, whatever that is. I also don't know how to find out what function is being called by E==EK either!
comment:17 Changed 16 months ago by
EllipticCurve
is a UniqueFactory
, so the input (i.e., in part, the field) uniquely identifies the curve. So the hash
and ==
is obtained by using id
as there is a unique such object in memory at a given time. Changing the hyperelliptic to suppose the UniqueRepresentation
type behavior would be a very invasive and complicated surgery I think.
So the shortterm fix IMO would be deciding an appropriate notion of __eq__
, and then changing the __hash__
to match. From what John is saying, we should also introduce a check that the defining fields are equal.
Also, from git blame
, "Nick Alexander" and "tornaria" comes up a lot and David Kohel is listed in the copyright.
comment:18 Changed 16 months ago by
 Branch changed from u/embray/python3/sageschemeshyperelliptic_curves/misc to public/25946
 Commit changed from cb04ceb814b625383bce962fed09e39e7fa04184 to 78ae9220be1f5279f1766ec78f597bb4634b69e2
 Status changed from needs_info to needs_review
I propose a solution : simply remove __eq__
, __ne__
and __hash__
here.
Then they are all provided by the general setting of projective subschemes, which seems to be a good idea. This will compare ambient spaces and defining ideals, which amount to compare base field and polynomials.
New commits:
2726198  Merge branch 'u/embray/python3/sageschemeshyperelliptic_curves/misc' of ssh://trac.sagemath.org:22/sage into 8.4.b0

78ae922  py3: comparison and hash cleanup for hyperellliptic curves

comment:19 Changed 16 months ago by
That sounds very sensible to me  I do not know why the person who implemented these special functions thought it was necessary to override those of the parent class.
As far as I am concerned this is good to merge assuming that tests all pass.
comment:20 Changed 16 months ago by
 Description modified (diff)
comment:21 Changed 16 months ago by
failing doctests in hyperelliptic_padic_field.py (sigh)
comment:22 Changed 16 months ago by
This seems like a reasonable approach, but it seems that you exposed a poor assumption that was made somewhere else. I don't know what a sage.rings.padics.padic_ZZ_pX_CR_element.pAdicZZpXCRElement
is, or why it isn't hashable. Perhaps it should be?
If it definitely shouldn't be hashable (e.g. it is not immutable) then maybe the problem is in MPolynomialIdeal.__richcmp__
's assumption that the generators of an ideal are definitely hashable (and if not, it should do something else to compare gens).
comment:23 Changed 16 months ago by
There is a _cache_key
method on these objects inside src/sage/rings/padics/padic_ZZ_pX_CR_element.pyx
But its doc claims that there is no reasonable hash..
comment:24 Changed 16 months ago by
 Cc roed added
Adding David Roe in CC since he may be able to throw light on the padic ring issue. Note that padic rings (as with the reals) are not exact, so comparison between elements is not so easy.
comment:25 Changed 16 months ago by
Incidentally, MPolynomialIdeal.__richcmp__
was last touched by #23920. The idea to use sets to compare gens was from Travis: https://trac.sagemath.org/ticket/23920#comment:9
Perhaps there should be a try/except around this and attempt direct comparison if comparing by sets raises a TypeError
. I'm not sure what the impact is of computing a Groebner basis, or if it can be avoided in some other way as well.
comment:26 Changed 16 months ago by
 Commit changed from 78ae9220be1f5279f1766ec78f597bb4634b69e2 to 243172790b441f1feb1f9dfa3cfdff9810f6705d
Branch pushed to git repo; I updated commit sha1. New commits:
2431727  adding the hash

comment:27 followup: ↓ 28 Changed 16 months ago by
adding hash from the existing _cache_key seems to fix the failing doctests.. But is this a correct thing to do ?
comment:28 in reply to: ↑ 27 Changed 16 months ago by
Replying to chapoton:
adding hash from the existing _cache_key seems to fix the failing doctests.. But is this a correct thing to do ?
No, I don't believe so, and the I think the explanation in the _cache_key
docs is fairly clear why. Instead: https://trac.sagemath.org/ticket/25946#comment:25
comment:29 Changed 16 months ago by
 Commit changed from 243172790b441f1feb1f9dfa3cfdff9810f6705d to 9f9399ca5bdf787d8cb61f1174f71822b1ea241d
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
d5573ef  py3: add a __hash__ for HyperellipticCurve_generic

f29c001  py3: need explicit cast to int for range

3771812  py3: comparison and hash cleanup for hyperellliptic curves

9f9399c  tiny change in the comparison of ideals inside polynomial rings

comment:30 Changed 16 months ago by
ok. So here is my next proposal : when comparing ideals generated by a single polynomial, just compare this generator (no need to wrap by set).
comment:31 followup: ↓ 33 Changed 16 months ago by
True. Is that case common enough to merit a special case? Are there cases of ideals of rings over these types of padics whose ideal would be generated by more than one polynomials (and hence would still crash on the existing code)?
What I'm saying is just, why not:
try: same_gens = set(self.gens()) == set(other.gens()) except TypeError: same_gens = False if same_gens: # Do whatever we do currently else: # Do whatever is necessary to compare the gens (construct a GB, etc).
comment:32 Changed 16 months ago by
 Commit changed from 9f9399ca5bdf787d8cb61f1174f71822b1ea241d to ecc4b361b04690708a037dc03c0975a5d312530f
Branch pushed to git repo; I updated commit sha1. New commits:
ecc4b36  change again the comparison of ideals

comment:33 in reply to: ↑ 31 Changed 16 months ago by
Replying to embray:
True. Is that case common enough to merit a special case? Are there cases of ideals of rings over these types of padics whose ideal would be generated by more than one polynomials (and hence would still crash on the existing code)?
Certainly: in ZZ_p[X] the ideal generated by p and X. But the number of gens will always be small, so why not sort the gens, however many there are?
This is an example of a common problem in Sage and other systems. If I define two ideals (or other structures) in different ways and ask whether they are equal then the test may be very expensive. But here we just test whether they have been presented in the same way (or with trivial changes such as changing the order of the generators).
What I'm saying is just, why not:
try: same_gens = set(self.gens()) == set(other.gens()) except TypeError: same_gens = False if same_gens: # Do whatever we do currently else: # Do whatever is necessary to compare the gens (construct a GB, etc).
New commits:
ecc4b36  change again the comparison of ideals

comment:34 followup: ↓ 35 Changed 16 months ago by
Here is another tentative : first compare without set() then compare with set()
comment:35 in reply to: ↑ 34 Changed 16 months ago by
Replying to chapoton:
Here is another tentative : first compare without set() then compare with set()
That could still crash. I don't know the math well enough to construct an interestingexample case, but say you have two copies of the same ideal, but their generators just happen to be in a different order, as in:
sage: R.<x,y> = QQ[] sage: R.ideal(x, y).gens() [x, y] sage: R.ideal(y, x).gens()
This is the sort of case, as John wrote, that the set()
call is intended for in the first place. But if the elements x
and y
are not hashable, then the first test will still fail, and then the second test will be evaluated and will crash.
IMO this comparison of gens() is just a special case shortcut which is fine asis, but it should account for the possibility that some elements just aren't hashable, that's all.
comment:36 followup: ↓ 39 Changed 16 months ago by
Does hashable imply not sortable? As I said, the lists of gens will be short normally, so something like testing sorted(s_gens)==sorted(o_gens) might work when the set() constructor does not? There will still be silly trivial cases such as first giving {x,y} as gens then {x,x,y} and for sure someone will then complain. But even as it is, we are not returning equality of the ideals (x,y) and (x+y,y). We are never going to catch all trivial cases (and different users' idea of a trivial case will differ) so just comparing the list of gens with no processing at all is safest and most easily explained.
comment:37 followup: ↓ 40 Changed 16 months ago by
 Commit changed from ecc4b361b04690708a037dc03c0975a5d312530f to d47fbb3f350c6e5ce4d2fdc809a3849699bebdd0
Branch pushed to git repo; I updated commit sha1. New commits:
d47fbb3  wrap with try

comment:38 Changed 16 months ago by
ok, I have wrapped with "try / except".
comment:39 in reply to: ↑ 36 Changed 16 months ago by
Replying to cremona:
Does hashable imply not sortable? As I said, the lists of gens will be short normally, so something like testing sorted(s_gens)==sorted(o_gens) might work when the set() constructor does not? There will still be silly trivial cases such as first giving {x,y} as gens then {x,x,y} and for sure someone will then complain. But even as it is, we are not returning equality of the ideals (x,y) and (x+y,y). We are never going to catch all trivial cases (and different users' idea of a trivial case will differ) so just comparing the list of gens with no processing at all is safest and most easily explained.
I don't think sorting really helps here, because as I found out over in #25948, some elements can be "sorted" but the ordering is meaningless and unpredictable.
comment:40 in reply to: ↑ 37 ; followup: ↓ 41 Changed 16 months ago by
comment:41 in reply to: ↑ 40 Changed 16 months ago by
Replying to embray:
Replying to git:
Branch pushed to git repo; I updated commit sha1. New commits:
d47fbb3 wrap with try
You could still get rid of
(s_gens == o_gens)
. I don't think it's that useful.
Yes it is. It will handle the precise case that we are dealing with in this ticket, hyperelliptic curves over padics.
comment:42 Changed 16 months ago by
Ok, but I think as written you don't need the same_gens
variable. I only wrote it that way in my pseudocode because I wasn't directly looking at the real code at the time, and didn't consider using a try/except/else. I don't think we need to worry about rich_to_bool
raising a TypeError
.
comment:43 Changed 16 months ago by
 Commit changed from d47fbb3f350c6e5ce4d2fdc809a3849699bebdd0 to b129fc96811a9db8c454a395180e6ca2a8a1b5fb
Branch pushed to git repo; I updated commit sha1. New commits:
b129fc9  fix

comment:44 Changed 16 months ago by
comment:45 Changed 16 months ago by
Yeah, though it would be nice to squash all these commits before we're done here.
comment:46 Changed 16 months ago by
 Commit changed from b129fc96811a9db8c454a395180e6ca2a8a1b5fb to f8687f7d8800aede97e36d4da30ae79da895b5f7
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
f8687f7  py3: comparison cleanup for hyperelliptic curves

comment:47 Changed 16 months ago by
done
comment:48 Changed 16 months ago by
Thanks! Well, I'm happy, so you can set positive review if you want, or wait and see if anyone smarter than me wants to weigh on, though it seems like this approach meshes with what the other experts have said so far.
comment:49 Changed 16 months ago by
I am happy (and making no claims to be cleverer than anyone else ;))
I would never have thought of using rich_to_bool() though. Not clever enough!
comment:50 Changed 16 months ago by
 Reviewers changed from Travis Scrimshaw to Travis Scrimshaw, David Roe
 Status changed from needs_review to positive_review
Seems good to me too!
comment:51 Changed 16 months ago by
 Branch changed from public/25946 to f8687f7d8800aede97e36d4da30ae79da895b5f7
 Resolution set to fixed
 Status changed from positive_review to closed
You have too much data in your
__hash__
:So you definitely have to remove
_PP
from the hash, and I would probably remove both the__class__
as subclasses are not used in the__eq__
comparison.