Opened 3 years ago

Closed 3 years ago

#23102 closed enhancement (fixed)

Support __richcmp__ in Python classes

Reported by: jdemeyer Owned by:
Priority: major Milestone: sage-8.0
Component: python3 Keywords:
Cc: chapoton, tscrim, embray Merged in:
Authors: Jeroen Demeyer Reviewers: Travis Scrimshaw, Erik Bray
Report Upstream: N/A Work issues:
Branch: 4d1021b (Commits) Commit: 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251
Dependencies: #23103 Stopgaps:

Description (last modified by jdemeyer)

Sage has a lot of functionality to deal with "rich comparisons" using __richcmp__ in Cython or _richcmp_ in Python or Cython for subclasses of Element. However, we are still missing rich comparison support for Python classes which are not subclasses from Element.

A good example of this is RealSet in src/sage/sets/real_set.py. Currently, comparison is implemented using __cmp__. Changing this to Python 3 comparisons would normally require 6 methods (__eq__, __ne__, __lt__, __le__, __gt__, __ge__). It would be convenient to support this with just one __richcmp__ method like in Cython.

This ticket proposes:

  1. Add a decorator richcmp_method which adds support for __richcmp__ to a Python class.
  1. Implement RealSet.__richcmp__ using this.

Attachments (1)

cmp-timing.py (550 bytes) - added by jdemeyer 3 years ago.
Small script to test timing

Download all attachments as: .zip

Change History (51)

comment:1 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:2 follow-up: Changed 3 years ago by tscrim

What about using @total_ordering to reduce the number of rich comparison methods? I know this is an abuse of terminology, but it does mean we only need to define __eq__ and __lt__. However, I generally do like the idea of having a consistent API.

comment:3 Changed 3 years ago by tscrim

  • Cc tscrim added

comment:4 Changed 3 years ago by jdemeyer

  • Branch set to u/jdemeyer/ticket/23102

comment:5 in reply to: ↑ 2 Changed 3 years ago by jdemeyer

  • Commit set to 47f97ec88f373697faafeb84579becb064213dc1

Replying to tscrim:

What about using @total_ordering to reduce the number of rich comparison methods?

  1. @total_ordering reduces the number of rich comparison methods from 6 to 2. This ticket reduces it from 6 to 1.
  1. @total_ordering is less efficient (if you only have == and <, you might need two calls to determine <=)

New commits:

256bf19Move richcmp stuff to new file richcmp.pyx
b591b78Support __richcmp__ in Python classes
47f97ecImplement RealSet comparisons using __richcmp__

comment:6 Changed 3 years ago by jdemeyer

  • Dependencies set to #23103
  • Status changed from new to needs_review

comment:7 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:8 follow-up: Changed 3 years ago by chapoton

Note that from #22828, there is

def richcmp_by_eq_and_lt(left, right, op):

in

src/sage/rings/asymptotic/misc.py

that should also be moved to the common new place.

comment:9 in reply to: ↑ 8 Changed 3 years ago by jdemeyer

Replying to chapoton:

Note that from #22828, there is

def richcmp_by_eq_and_lt(left, right, op):

in

src/sage/rings/asymptotic/misc.py

that should also be moved to the common new place.

Makes sense. I created #23109 for that.

comment:10 follow-up: Changed 3 years ago by tscrim

Typo in richcmp_method: mtehod. Have you run some timings to compare how this works versus direct implementation versus @total_ordering in Python classes? I can do this, but not for 2-3 days because I will be traveling.

comment:11 in reply to: ↑ 10 Changed 3 years ago by jdemeyer

Replying to tscrim:

Have you run some timings to compare how this works versus direct implementation versus @total_ordering in Python classes? I can do this, but not for 2-3 days because I will be traveling.

I just did that, see cmp-timing.py

First of all, for a plain equality check (@total_ordering is not involved in this), the timing is almost the same for __eq__ and __richcmp__, the latter being just slightly slower:

sage: load("cmp-timing.py")
sage: timeit('A1 == A2', number=1000000, repeat=100)
1000000 loops, best of 100: 281 ns per loop
sage: timeit('B1 == B2', number=1000000, repeat=100)
1000000 loops, best of 100: 293 ns per loop

Now, comparisons involving @total_ordering take almost twice or three times as much time:

sage: load("cmp-timing.py")
sage: timeit('A1 > A2', number=1000000, repeat=20)   # calls A2.__lt__(A1)
1000000 loops, best of 20: 557 ns per loop
sage: timeit('B1 > B2', number=1000000, repeat=20)
1000000 loops, best of 20: 292 ns per loop
sage: load("cmp-timing.py")
sage: timeit('A2 <= A1', number=1000000, repeat=20)  # Calls A2.__lt__(A1) and A2.__eq__(A1)
1000000 loops, best of 20: 820 ns per loop
sage: timeit('B2 <= B1', number=1000000, repeat=20)
1000000 loops, best of 20: 293 ns per loop

As one might expect, the time is roughly linear in the number of Python method calls.

Changed 3 years ago by jdemeyer

Small script to test timing

comment:12 Changed 3 years ago by tscrim

  • Reviewers set to Travis Scrimshaw

Thank you for running those timings. That is more than enough to convince me. If tests pass, you can set a positive review on my behalf. Sleepy time for me now. :)

comment:13 Changed 3 years ago by git

  • Commit changed from 47f97ec88f373697faafeb84579becb064213dc1 to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de

Branch pushed to git repo; I updated commit sha1. New commits:

acfd0aaTypo

comment:14 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:15 Changed 3 years ago by vbraun

  • Branch changed from u/jdemeyer/ticket/23102 to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:16 Changed 3 years ago by vbraun

  • Commit acfd0aaea4de51b64c21fdd1a7a8d6e9174448de deleted
  • Resolution fixed deleted
  • Status changed from closed to new

This happened randomly:

sage -t --long src/sage/sets/real_set.py
**********************************************************************
File "src/sage/sets/real_set.py", line 701, in sage.sets.real_set.RealSet.__richcmp__
Failed example:
    cmp(I1, I2)
Expected:
    1
Got:
    -1
**********************************************************************
File "src/sage/sets/real_set.py", line 703, in sage.sets.real_set.RealSet.__richcmp__
Failed example:
    sorted([I1, I2])
Expected:
    [(0, 5], (1, 3]]
Got:
    [(1, 3], (0, 5]]
**********************************************************************
1 item had failures:
   2 of   6 in sage.sets.real_set.RealSet.__richcmp__
    [203 tests, 2 failures, 0.55 s]

comment:17 Changed 3 years ago by jdemeyer

Confirmed. It seems that the category framework gets in the way. Commenting out this line fixes the problem

Parent.__init__(self, category = Sets())

comment:18 follow-up: Changed 3 years ago by tscrim

I really would not want to go backwards in category initalization for parents. Although the category methods should not be interfering. I don't see how they would cause problems...

comment:19 in reply to: ↑ 18 ; follow-up: Changed 3 years ago by jdemeyer

Replying to tscrim:

I really would not want to go backwards in category initalization for parents.

Sorry, I didn't say that we should remove that line. Just that removing that line fixes this specific problem that __richcmp__ doesn't work.

comment:20 follow-up: Changed 3 years ago by jdemeyer

OK, the real problem is that the richcmp_method decorator doesn't work with subclassing. It only works on the class where it is defined, not on subclasses. Given that the category framework creates a RealSet_with_category subclass, this breaks.

comment:21 in reply to: ↑ 19 Changed 3 years ago by tscrim

Replying to jdemeyer:

Replying to tscrim:

I really would not want to go backwards in category initalization for parents.

Sorry, I didn't say that we should remove that line. Just that removing that line fixes this specific problem that __richcmp__ doesn't work.

I didn't mean to imply you were. I just wanted to make it clear that we didn't want to do this in case someone else came along and saw this.

comment:22 in reply to: ↑ 20 Changed 3 years ago by tscrim

Replying to jdemeyer:

OK, the real problem is that the richcmp_method decorator doesn't work with subclassing. It only works on the class where it is defined, not on subclasses. Given that the category framework creates a RealSet_with_category subclass, this breaks.

Perhaps the way to go would be to bind __eq__ and friends to a particular class that compare with the __richcmp__?

comment:23 Changed 3 years ago by jdemeyer

OK, I think the problem is that my trick doesn't add slot wrappers (attributes like __eq__ on the class). I'll try to add those wrappers and see if that works.

comment:24 Changed 3 years ago by jdemeyer

  • Branch changed from acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de

comment:25 Changed 3 years ago by jdemeyer

  • Commit set to f52d5c341e965716f799530424cf242c6a5547c2
  • Status changed from new to needs_review

New commits:

b591b78Support __richcmp__ in Python classes
47f97ecImplement RealSet comparisons using __richcmp__
acfd0aaTypo
1569ddaMerge tag '8.0.beta10' into t/23102/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
f52d5c3Install slot wrappers for rich comparison methods

comment:26 Changed 3 years ago by tscrim

  • Cc embray added
  • Status changed from needs_review to positive_review

If I understand correctly what is going on, then yes, this should work in general. So I'm going to set it to a positive review since it works AFAIK.

Erik, I'm cc-ing you in case you know something about the inner workings of Python and can confirm the code looks proper. If you don't, then no need to respond.

comment:27 Changed 3 years ago by jdemeyer

Of course, none of this slot stuff is documented anywhere in the Python docs... the only way to figure it out is to read the Python source code.

comment:28 follow-up: Changed 3 years ago by embray

  • Status changed from positive_review to needs_info

Why

if type(slotwrapper).__name__ != 'wrapper_descriptor':

and not explicitly something like

if type(slotwrapper) != PyWrapperDescr_Type:

?

comment:29 Changed 3 years ago by embray

One other minor concern: If a class has both __richcmp__ and, say, __lt__, this will just silently override the __lt__. For consistency with Cython, I'd say having both should be an outright error (with Cython it will just raise a compiler error if you try to define __lt__ in a cdef class). So maybe it should raise a RuntimeError if a class tries to define both.

But that also raises the question of what to do with subclasses. If you subclass a class with __richcmp__ and add an __lt__ in the subclass, the subclass's __lt__ should take over. I think that's what will happen anyways.

But what if you subclass a class that has __lt__ and add __richcmp__ in the subclass?

Last edited 3 years ago by embray (previous) (diff)

comment:30 Changed 3 years ago by embray

But what if you subclass a class that has __lt__ and add __richcmp__ in the subclass?

To answer my own question, this wouldn't be a concern I guess since this will just override the parent class's __lt__ and that's to be expected.

It would be good to have a test of subclassing and overriding one of the comparison operators though.

Last edited 3 years ago by embray (previous) (diff)

comment:31 Changed 3 years ago by vbraun

  • Branch changed from u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to f52d5c341e965716f799530424cf242c6a5547c2
  • Resolution set to fixed
  • Status changed from needs_info to closed

comment:32 Changed 3 years ago by vbraun

  • Commit f52d5c341e965716f799530424cf242c6a5547c2 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

ready to merge == positive review != let somebody else look at it first, Travis

comment:33 Changed 3 years ago by jdemeyer

So, is it merged or not? I think Erik's comments make sense, but they could easily be addressed in a follow-up ticket.

comment:34 Changed 3 years ago by vbraun

Not merged

comment:35 Changed 3 years ago by embray

They could definitely be addressed in a followup ticket (if indeed my comments need addressing at all).

comment:36 Changed 3 years ago by embray

As an aside, I wonder if there's ever been a proposal to actually support __richcmp__ in user-defined classes (in a manner similar to this, such that it would automatically implment the individual comparison methods). And if it has been proposed I wonder why it was rejected. I'm going to search around a bit for answers...

comment:37 Changed 3 years ago by embray

A close reading of PEP-207 reveals:

The choice of six special methods was made over a single (e.g. __richcmp__) method to allow the dispatching on the opcode to be performed at the level of the C implementation rather than the user-defined method.

which makes perfect sense. But, if as Sage demonstrates, there's a usecase for a single user-defined __richcmp__ I don't see why it shouldn't be an option...

comment:38 Changed 3 years ago by tscrim

  • Branch changed from f52d5c341e965716f799530424cf242c6a5547c2 to u/jdemeyer/ticket/23102
  • Commit set to acfd0aaea4de51b64c21fdd1a7a8d6e9174448de
  • Status changed from new to needs_info

As Jeroen and Erik said, I felt it could be merged into Sage and further changes could be handled on followup tickets.


New commits:

b591b78Support __richcmp__ in Python classes
47f97ecImplement RealSet comparisons using __richcmp__
acfd0aaTypo

comment:39 Changed 3 years ago by jdemeyer

I'll work on it right now.

comment:40 Changed 3 years ago by git

  • Commit changed from acfd0aaea4de51b64c21fdd1a7a8d6e9174448de to 86e9fcf87e6d8fea38a73edc3ea6830f6fc17657

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

49b6f76Support __richcmp__ in Python classes
d343d9dImplement RealSet comparisons using __richcmp__
3aac3ccTypo
86e9fcfInstall slot wrappers for rich comparison methods

comment:41 Changed 3 years ago by jdemeyer

Travis: you put back the wrong branch. This is the original branch, rebased to 8.0.beta11.

comment:42 Changed 3 years ago by tscrim

Ah, sorry; I had thought u/jdemeyer/acfd0aaea4de51b64c21fdd1a7a8d6e9174448de was just a trac formatting error.

comment:43 in reply to: ↑ 28 ; follow-up: Changed 3 years ago by jdemeyer

Replying to embray:

if type(slotwrapper) != PyWrapperDescr_Type:

Good point. It didn't occur to me to use the C/API for this. Python doesn't expose this type anywhere AFAIK, but the C/API does.

comment:44 Changed 3 years ago by git

  • Commit changed from 86e9fcf87e6d8fea38a73edc3ea6830f6fc17657 to 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251

Branch pushed to git repo; I updated commit sha1. New commits:

4d1021bFurther fixes and tests for richcmp_method

comment:45 Changed 3 years ago by jdemeyer

  • Reviewers changed from Travis Scrimshaw to Travis Scrimshaw, Erik Bray
  • Status changed from needs_info to needs_review

I think this addresses all Erik's concerns.

comment:46 in reply to: ↑ 43 Changed 3 years ago by embray

Replying to jdemeyer:

Replying to embray:

if type(slotwrapper) != PyWrapperDescr_Type:

Good point. It didn't occur to me to use the C/API for this. Python doesn't expose this type anywhere AFAIK, but the C/API does.

You can get at it in Python, for example, with type(bytes.__lt__), but in Cython you might as well just use the type directly since it's exposed by the C API.

comment:47 Changed 3 years ago by chapoton

is this ready to go ?

comment:48 Changed 3 years ago by jdemeyer

Erik?

comment:49 Changed 3 years ago by embray

  • Status changed from needs_review to positive_review

Yep, perfect--thanks.

comment:50 Changed 3 years ago by vbraun

  • Branch changed from u/jdemeyer/ticket/23102 to 4d1021b1e2ccd1e811549aeb1e451b9a08cc6251
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.