Opened 12 years ago

Closed 6 years ago

#10130 closed enhancement (wontfix)

Revamp __hash__, __cmp__ and __richcmp__

Reported by: David Roe Owned by: Robert Bradshaw
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: coercion Keywords:
Cc: Robert Bradshaw, Nicolas M. Thiéry Merged in:
Authors: Reviewers: Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by Jeroen Demeyer)

There are a number of confusing and non-optimal features of the way Sage currently handles hashing and comparison.

  • Because of Python, if you override one of these functions, you must redefine the others as well. This is easy to forget and confuses new users. For Cython classes, this can be fixed in #18329.
  • hashes of parents are used extensively in the coercion framework, so speed is quite important. But since parents are usually written in Python, the current model will always have at least a dictionary lookup (for example, in finding a cached self.__hash). And often parents don't override the default hashing code and they fall back to slow __repr__ methods. Fixed by #715???

Attachments (1)

10130_basic.patch (2.5 KB) - added by David Roe 12 years ago.
A few hash changes. WARNING: changes parent.pxd, so rebuilding will be slow.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 12 years ago by David Roe

I don't know the best way to resolve these issues, but here are some ideas.

  • Have cpdef'd functions _hash_, _cmp_ and _richcmp_ that users will override.
  • For parents, add an attribute "cdef public long __hash". I don't know if this should be set to self._hash_() during the parent's __init__ method, or set to -1 and then reset upon the first call to __hash__.
  • For SageObject, have the default behavior of _hash_() be to return hash(repr(self)). For parents, have the default value of hash depend on self.construction()?
  • For elements, don't cache the hash by default.
  • Have the results of ==, <, etc, depend on _richcmp_. By default, have _richcmp_ determined by a total ordering implemented by _cmp_. So if you want to have partial ordering, or a different partial and total ordering, you need to implement both _cmp_ and _richcmp_.
  • For parents, _cmp_ checks first for pointer equality. If non-equal, it compares the hashes. If those are nonequal, returns the cmp of the hashes. Otherwise, returns the cmp of the pointers.
  • Elements would have no default _cmp_, but would instead raise an error.

comment:2 Changed 12 years ago by David Roe

Also, in Python 2.6, you can set A.hash = None, to indicate that an object is not hashable. We should probably do this for the various immutable types in Sage, rather than raising a type error. I don't know if this works currently (I can't do it from the command line on a mutable matrix for example).

Changed 12 years ago by David Roe

Attachment: 10130_basic.patch added

A few hash changes. WARNING: changes parent.pxd, so rebuilding will be slow.

comment:3 Changed 12 years ago by David Roe

As an example of the speed gains possible, this patch resolves #9886.

comment:4 Changed 12 years ago by Nicolas M. Thiéry

Just a quick note: for parents that use UniqueRepresentation?, the hash is given by the id of the object; one can't be much faster (except that UniqueRepresentation? is not Cython'ed yet):

   F = CombinatorialFreeModule(QQ,QQ)
   sage: F.__hash__()
   98169392
   sage: id(F)
   98169392

Other than that, I am all for improvements in this direction, because at this stage I am still confused about what I should do when I want to implement a partially ordered set in Python. Besides it would be best if it was possible to implement such partial orders in categories (remember that element methods defined in categories are overridden by methods defined in concrete classes like Element).

comment:5 Changed 10 years ago by Travis Scrimshaw

Cc: Nicolas M. Thiéry added

Nicolas T. and I somewhat came across the second point in working on #13605. In particular, I have changed Partition to inherit from Element, use the rich comparisons, and removed the __cmp__() method. Thus calling things like

sage: sorted(Partitions(5), cmp)

raise the "BUG - comparsion not implemented" exception. This holds for any subclass of Element which does not implement a __cmp__() and does not fall back on the rich comparisons as regular python objects would.

comment:6 Changed 9 years ago by Jeroen Demeyer

Milestone: sage-5.11sage-5.12

comment:7 Changed 9 years ago by For batch modifications

Milestone: sage-6.1sage-6.2

comment:8 Changed 8 years ago by For batch modifications

Milestone: sage-6.2sage-6.3

comment:9 Changed 8 years ago by For batch modifications

Milestone: sage-6.3sage-6.4

comment:10 Changed 8 years ago by Marc Mezzarobba

Related: #17890.

comment:11 Changed 8 years ago by Jeroen Demeyer

The second item is fixed by #17890 and I guess the third is fixed #715. The first is pure Python behaviour, which I don't see an obvious solution for...

comment:12 Changed 8 years ago by Jeroen Demeyer

Description: modified (diff)

comment:13 Changed 7 years ago by Jeroen Demeyer

Description: modified (diff)

comment:14 Changed 6 years ago by Jeroen Demeyer

Milestone: sage-6.4sage-duplicate/invalid/wontfix
Reviewers: Jeroen Demeyer
Status: newneeds_review

This is a too general ticket. And most of this is actually fixed.

comment:15 Changed 6 years ago by Jeroen Demeyer

Status: needs_reviewpositive_review

comment:16 Changed 6 years ago by Erik Bray

Resolution: wontfix
Status: positive_reviewclosed

Determined to be invalid/duplicate/wontfix (closing as "wontfix" as a catch-all resolution).

Note: See TracTickets for help on using tickets.