Opened 5 years ago
Closed 5 years ago
#23102 closed enhancement (fixed)
Support __richcmp__ in Python classes
Reported by: | jdemeyer | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.0 |
Component: | python3 | Keywords: | |
Cc: | chapoton, tscrim, embray | Merged in: | |
Authors: | Jeroen Demeyer | Reviewers: | Travis Scrimshaw, Erik Bray |
Report Upstream: | N/A | Work issues: | |
Branch: | 4d1021b (Commits, GitHub, GitLab) | Commit: | 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251 |
Dependencies: | #23103 | Stopgaps: |
Description (last modified by )
Sage has a lot of functionality to deal with "rich comparisons" using __richcmp__
in Cython or _richcmp_
in Python or Cython for subclasses of Element
. However, we are still missing rich comparison support for Python classes which are not subclasses from Element
.
A good example of this is RealSet
in src/sage/sets/real_set.py
. Currently, comparison is implemented using __cmp__
. Changing this to Python 3 comparisons would normally require 6 methods (__eq__
, __ne__
, __lt__
, __le__
, __gt__
, __ge__
). It would be convenient to support this with just one __richcmp__
method like in Cython.
This ticket proposes:
- Add a decorator
richcmp_method
which adds support for__richcmp__
to a Python class.
- Implement
RealSet.__richcmp__
using this.
Attachments (1)
Change History (51)
comment:1 Changed 5 years ago by
- Description modified (diff)
comment:2 follow-up: ↓ 5 Changed 5 years ago by
comment:3 Changed 5 years ago by
- Cc tscrim added
comment:4 Changed 5 years ago by
- Branch set to u/jdemeyer/ticket/23102
comment:5 in reply to: ↑ 2 Changed 5 years ago by
- Commit set to 47f97ec88f373697faafeb84579becb064213dc1
Replying to tscrim:
What about using
@total_ordering
to reduce the number of rich comparison methods?
@total_ordering
reduces the number of rich comparison methods from 6 to 2. This ticket reduces it from 6 to 1.
@total_ordering
is less efficient (if you only have==
and<
, you might need two calls to determine<=
)
New commits:
256bf19 | Move richcmp stuff to new file richcmp.pyx
|
b591b78 | Support __richcmp__ in Python classes
|
47f97ec | Implement RealSet comparisons using __richcmp__
|
comment:6 Changed 5 years ago by
- Dependencies set to #23103
- Status changed from new to needs_review
comment:7 Changed 5 years ago by
- Description modified (diff)
comment:8 follow-up: ↓ 9 Changed 5 years ago by
Note that from #22828, there is
def richcmp_by_eq_and_lt(left, right, op):
in
src/sage/rings/asymptotic/misc.py
that should also be moved to the common new place.
comment:9 in reply to: ↑ 8 Changed 5 years ago by
comment:10 follow-up: ↓ 11 Changed 5 years ago by
Typo in richcmp_method
: mtehod
. Have you run some timings to compare how this works versus direct implementation versus @total_ordering
in Python classes? I can do this, but not for 2-3 days because I will be traveling.
comment:11 in reply to: ↑ 10 Changed 5 years ago by
Replying to tscrim:
Have you run some timings to compare how this works versus direct implementation versus
@total_ordering
in Python classes? I can do this, but not for 2-3 days because I will be traveling.
I just did that, see cmp-timing.py
First of all, for a plain equality check (@total_ordering
is not involved in this), the timing is almost the same for __eq__
and __richcmp__
, the latter being just slightly slower:
sage: load("cmp-timing.py") sage: timeit('A1 == A2', number=1000000, repeat=100) 1000000 loops, best of 100: 281 ns per loop sage: timeit('B1 == B2', number=1000000, repeat=100) 1000000 loops, best of 100: 293 ns per loop
Now, comparisons involving @total_ordering
take almost twice or three times as much time:
sage: load("cmp-timing.py") sage: timeit('A1 > A2', number=1000000, repeat=20) # calls A2.__lt__(A1) 1000000 loops, best of 20: 557 ns per loop sage: timeit('B1 > B2', number=1000000, repeat=20) 1000000 loops, best of 20: 292 ns per loop
sage: load("cmp-timing.py") sage: timeit('A2 <= A1', number=1000000, repeat=20) # Calls A2.__lt__(A1) and A2.__eq__(A1) 1000000 loops, best of 20: 820 ns per loop sage: timeit('B2 <= B1', number=1000000, repeat=20) 1000000 loops, best of 20: 293 ns per loop
As one might expect, the time is roughly linear in the number of Python method calls.
comment:12 Changed 5 years ago by
- Reviewers set to Travis Scrimshaw
Thank you for running those timings. That is more than enough to convince me. If tests pass, you can set a positive review on my behalf. Sleepy time for me now. :)
comment:13 Changed 5 years ago by
- Commit changed from 47f97ec88f373697faafeb84579becb064213dc1 to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
Branch pushed to git repo; I updated commit sha1. New commits:
acfd0aa | Typo
|
comment:14 Changed 5 years ago by
- Status changed from needs_review to positive_review
comment:15 Changed 5 years ago by
- Branch changed from u/jdemeyer/ticket/23102 to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
- Resolution set to fixed
- Status changed from positive_review to closed
comment:16 Changed 5 years ago by
- Commit acfd0aaea4de51b64c21fdd1a7a8d6e9174448de deleted
- Resolution fixed deleted
- Status changed from closed to new
This happened randomly:
sage -t --long src/sage/sets/real_set.py ********************************************************************** File "src/sage/sets/real_set.py", line 701, in sage.sets.real_set.RealSet.__richcmp__ Failed example: cmp(I1, I2) Expected: 1 Got: -1 ********************************************************************** File "src/sage/sets/real_set.py", line 703, in sage.sets.real_set.RealSet.__richcmp__ Failed example: sorted([I1, I2]) Expected: [(0, 5], (1, 3]] Got: [(1, 3], (0, 5]] ********************************************************************** 1 item had failures: 2 of 6 in sage.sets.real_set.RealSet.__richcmp__ [203 tests, 2 failures, 0.55 s]
comment:17 Changed 5 years ago by
Confirmed. It seems that the category framework gets in the way. Commenting out this line fixes the problem
Parent.__init__(self, category = Sets())
comment:18 follow-up: ↓ 19 Changed 5 years ago by
I really would not want to go backwards in category initalization for parents. Although the category methods should not be interfering. I don't see how they would cause problems...
comment:19 in reply to: ↑ 18 ; follow-up: ↓ 21 Changed 5 years ago by
Replying to tscrim:
I really would not want to go backwards in category initalization for parents.
Sorry, I didn't say that we should remove that line. Just that removing that line fixes this specific problem that __richcmp__
doesn't work.
comment:20 follow-up: ↓ 22 Changed 5 years ago by
OK, the real problem is that the richcmp_method
decorator doesn't work with subclassing. It only works on the class where it is defined, not on subclasses. Given that the category framework creates a RealSet_with_category
subclass, this breaks.
comment:21 in reply to: ↑ 19 Changed 5 years ago by
Replying to jdemeyer:
Replying to tscrim:
I really would not want to go backwards in category initalization for parents.
Sorry, I didn't say that we should remove that line. Just that removing that line fixes this specific problem that
__richcmp__
doesn't work.
I didn't mean to imply you were. I just wanted to make it clear that we didn't want to do this in case someone else came along and saw this.
comment:22 in reply to: ↑ 20 Changed 5 years ago by
Replying to jdemeyer:
OK, the real problem is that the
richcmp_method
decorator doesn't work with subclassing. It only works on the class where it is defined, not on subclasses. Given that the category framework creates aRealSet_with_category
subclass, this breaks.
Perhaps the way to go would be to bind __eq__
and friends to a particular class that compare with the __richcmp__
?
comment:23 Changed 5 years ago by
OK, I think the problem is that my trick doesn't add slot wrappers (attributes like __eq__
on the class). I'll try to add those wrappers and see if that works.
comment:24 Changed 5 years ago by
- Branch changed from acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
comment:25 Changed 5 years ago by
- Commit set to f52d5c341e965716f799530424cf242c6a5547c2
- Status changed from new to needs_review
comment:26 Changed 5 years ago by
- Cc embray added
- Status changed from needs_review to positive_review
If I understand correctly what is going on, then yes, this should work in general. So I'm going to set it to a positive review since it works AFAIK.
Erik, I'm cc-ing you in case you know something about the inner workings of Python and can confirm the code looks proper. If you don't, then no need to respond.
comment:27 Changed 5 years ago by
Of course, none of this slot stuff is documented anywhere in the Python docs... the only way to figure it out is to read the Python source code.
comment:28 follow-up: ↓ 43 Changed 5 years ago by
- Status changed from positive_review to needs_info
Why
if type(slotwrapper).__name__ != 'wrapper_descriptor':
and not explicitly something like
if type(slotwrapper) != PyWrapperDescr_Type:
?
comment:29 Changed 5 years ago by
One other minor concern: If a class has both __richcmp__
and, say, __lt__
, this will just silently override the __lt__
. For consistency with Cython, I'd say having both should be an outright error (with Cython it will just raise a compiler error if you try to define __lt__
in a cdef
class). So maybe it should raise a RuntimeError
if a class tries to define both.
But that also raises the question of what to do with subclasses. If you subclass a class with __richcmp__
and add an __lt__
in the subclass, the subclass's __lt__
should take over. I think that's what will happen anyways.
But what if you subclass a class that has __lt__
and add __richcmp__
in the subclass?
comment:30 Changed 5 years ago by
But what if you subclass a class that has
__lt__
and add__richcmp__
in the subclass?
To answer my own question, this wouldn't be a concern I guess since this will just override the parent class's __lt__
and that's to be expected.
It would be good to have a test of subclassing and overriding one of the comparison operators though.
comment:31 Changed 5 years ago by
- Branch changed from u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to f52d5c341e965716f799530424cf242c6a5547c2
- Resolution set to fixed
- Status changed from needs_info to closed
comment:32 Changed 5 years ago by
- Commit f52d5c341e965716f799530424cf242c6a5547c2 deleted
- Resolution fixed deleted
- Status changed from closed to new
ready to merge == positive review != let somebody else look at it first, Travis
comment:33 Changed 5 years ago by
So, is it merged or not? I think Erik's comments make sense, but they could easily be addressed in a follow-up ticket.
comment:34 Changed 5 years ago by
Not merged
comment:35 Changed 5 years ago by
They could definitely be addressed in a followup ticket (if indeed my comments need addressing at all).
comment:36 Changed 5 years ago by
As an aside, I wonder if there's ever been a proposal to actually support __richcmp__
in user-defined classes (in a manner similar to this, such that it would automatically implment the individual comparison methods). And if it has been proposed I wonder why it was rejected. I'm going to search around a bit for answers...
comment:37 Changed 5 years ago by
A close reading of PEP-207 reveals:
The choice of six special methods was made over a single (e.g.
__richcmp__
) method to allow the dispatching on the opcode to be performed at the level of the C implementation rather than the user-defined method.
which makes perfect sense. But, if as Sage demonstrates, there's a usecase for a single user-defined __richcmp__
I don't see why it shouldn't be an option...
comment:38 Changed 5 years ago by
- Branch changed from f52d5c341e965716f799530424cf242c6a5547c2 to u/jdemeyer/ticket/23102
- Commit set to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
- Status changed from new to needs_info
comment:39 Changed 5 years ago by
I'll work on it right now.
comment:40 Changed 5 years ago by
- Commit changed from acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to 86e9fcf87e6d8fea38a73edc3ea6830f6fc17657
comment:41 Changed 5 years ago by
Travis: you put back the wrong branch. This is the original branch, rebased to 8.0.beta11.
comment:42 Changed 5 years ago by
Ah, sorry; I had thought u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
was just a trac formatting error.
comment:43 in reply to: ↑ 28 ; follow-up: ↓ 46 Changed 5 years ago by
Replying to embray:
if type(slotwrapper) != PyWrapperDescr_Type:
Good point. It didn't occur to me to use the C/API for this. Python doesn't expose this type anywhere AFAIK, but the C/API does.
comment:44 Changed 5 years ago by
- Commit changed from 86e9fcf87e6d8fea38a73edc3ea6830f6fc17657 to 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251
Branch pushed to git repo; I updated commit sha1. New commits:
4d1021b | Further fixes and tests for richcmp_method
|
comment:45 Changed 5 years ago by
- Reviewers changed from Travis Scrimshaw to Travis Scrimshaw, Erik Bray
- Status changed from needs_info to needs_review
I think this addresses all Erik's concerns.
comment:46 in reply to: ↑ 43 Changed 5 years ago by
Replying to jdemeyer:
Replying to embray:
if type(slotwrapper) != PyWrapperDescr_Type:Good point. It didn't occur to me to use the C/API for this. Python doesn't expose this type anywhere AFAIK, but the C/API does.
You can get at it in Python, for example, with type(bytes.__lt__)
, but in Cython you might as well just use the type directly since it's exposed by the C API.
comment:47 Changed 5 years ago by
is this ready to go ?
comment:48 Changed 5 years ago by
Erik?
comment:49 Changed 5 years ago by
- Status changed from needs_review to positive_review
Yep, perfect--thanks.
comment:50 Changed 5 years ago by
- Branch changed from u/jdemeyer/ticket/23102 to 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251
- Resolution set to fixed
- Status changed from positive_review to closed
What about using
@total_ordering
to reduce the number of rich comparison methods? I know this is an abuse of terminology, but it does mean we only need to define__eq__
and__lt__
. However, I generally do like the idea of having a consistent API.