Opened 22 months ago

Last modified 4 weeks ago

#26769 needs_info enhancement

Add an issequence utility to check for list, tuple, and other compatible objects — at Version 6

Reported by: embray Owned by:
Priority: major Milestone: sage-9.2
Component: misc Keywords: python3
Cc: chapoton, jdemeyer, tscrim Merged in:
Authors: Erik Bray Reviewers:
Report Upstream: N/A Work issues:
Branch: u/embray/misc/issequence (Commits) Commit: 1d7066c898c4885ff33e5385b20d5892993cbc98
Dependencies: Stopgaps:

Description (last modified by embray)

Adds an issequence() function that can serve as a more generic replacement for isinstance(x, (list, tuple)) and also works with a broader range of similar types (e.g. xrange, Sage vectors, NumPy arrays). Note that issequence(x) is also True for str and bytes, so when using it to replace a (list, tuple) check, make sure that any sequence-like types needing separate handling are handled first.

For the very common case of (list, tuple) this is faster than isinstance, but also has the benefit of being more generic, while not quite as generic as the much slower isinstance(x, Sequence).
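As a rough illustration of the intended semantics only, here is a pure-Python approximation. It is not the code on the branch (which works at the C-API level); the name issequence_sketch is hypothetical, and the explicit dict exclusion is an assumption that mirrors what PySequence_Check does in CPython 3.

```python
def issequence_sketch(obj):
    """Approximate, pure-Python stand-in for the check described above."""
    t = type(obj)
    # Fast path: exact list or tuple. Subclasses fall through to the
    # generic path below.
    if t is list or t is tuple:
        return True
    # Assumption: dicts are rejected outright, as PySequence_Check does.
    if isinstance(obj, dict):
        return False
    # Generic path: the type must support indexing ...
    if not hasattr(t, "__getitem__"):
        return False
    # ... and must report a length without raising.
    try:
        len(obj)
    except Exception:
        return False
    return True


print(issequence_sketch([1, 2]), issequence_sketch(range(3)))  # True True
print(issequence_sketch("abc"), issequence_sketch({"a": 1}))   # True False
```

Note that str and bytes pass, which is why callers that treat strings differently need to test for them before calling a check like this.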

#24804 demonstrates an example usage which allows the constructor in question to also accept a Python 3 range object (an example which appears in the doctests that did not otherwise work). This will likely be useful elsewhere but we can find those examples as we come to them.

Change History (6)

comment:1 Changed 22 months ago by embray

  • Status changed from new to needs_review

comment:2 follow-up: Changed 22 months ago by jdemeyer

  1. What's the difference between this and isinstance(x, Sequence)? I feel like that should be explained better.
  2. There is no point in adding an empty .pyx file (unless I'm missing something).

comment:3 follow-up: Changed 22 months ago by jdemeyer

  1. Is it possible for a subclass of list/tuple not to be a sequence? Just asking because you use FOO_CheckExact as opposed to FOO_Check.

comment:4 in reply to: ↑ 2 Changed 22 months ago by embray

Replying to jdemeyer:

  1. What's the difference between this and isinstance(x, Sequence)? I feel like that should be explained better.

These are technically completely different things. The C-API level "sequence protocol" is defined by implementing some or all of the PySequenceMethods. The way PySequence_Check is implemented, all it cares about is that sq_item be implemented at a minimum.

This is different from the Python level collections.abc "Sequence" ABC which requires that both __getitem__ and __len__ are implemented.
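A small sketch of how the ABC behaves in practice may help here. The Indexable class is made up for illustration. Because collections.abc.Sequence has no structural check, merely defining __getitem__ and __len__ is not enough for isinstance to succeed; the class has to inherit from Sequence or be registered with it.

```python
from collections.abc import Sequence


class Indexable:
    """Hypothetical class: list-like behaviour, but no link to the ABC."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, n):
        return self._items[n]

    def __len__(self):
        return len(self._items)


x = Indexable([10, 20, 30])

# The built-in sequence types are all registered with the ABC.
builtins_ok = all(isinstance(v, Sequence) for v in ([], (), range(3), "", b""))

# Our class works like a list but is not recognised by the ABC ...
before = isinstance(x, Sequence)

# ... until it is registered explicitly.
Sequence.register(Indexable)
after = isinstance(x, Sequence)

print(builtins_ok, before, after)  # True False True
```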

This inconsistent overloading of terms is irritating and confusing, but it is what it is. So this interface defines my own sort of middle ground, which is closer to the ABC: it requires both the C-level sequence interface (which Python classes that implement __getitem__ have) and that PySequence_Length works and returns a non-negative integer. There is an assumption then that obj[n] will work for every n in range(len(obj)). Unfortunately, for types defined at the Python level there is no obvious way to explicitly distinguish between a "sequence" and a "mapping".

A class that implements a custom __getitem__ may work as one or the other, or both, so the best we can do is check for __getitem__ and __len__ and hope it works "like a list", which is how I'm defining "sequence" in this case.
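To make the ambiguity concrete, here is a minimal sketch with two made-up classes, Squares and Ages. Both define __getitem__ and __len__, so a purely structural check (the hypothetical looks_indexable below) cannot tell them apart, yet only one of them behaves "like a list" when indexed from 0 to len(obj) - 1.

```python
class Squares:
    """List-like: valid indices are 0 .. len(self) - 1."""

    def __len__(self):
        return 3

    def __getitem__(self, n):
        if not 0 <= n < 3:
            raise IndexError(n)
        return n * n


class Ages:
    """Mapping-like: items are looked up by name, not by position."""

    def __init__(self):
        self._data = {"ann": 31, "bob": 27}

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        return self._data[key]


def looks_indexable(obj):
    # Structural check only: both classes above pass it.
    t = type(obj)
    return hasattr(t, "__getitem__") and hasattr(t, "__len__")


def index_all(obj):
    # Relies on the assumption that obj[n] works for n in range(len(obj)).
    return [obj[n] for n in range(len(obj))]


print(looks_indexable(Squares()), looks_indexable(Ages()))  # True True
print(index_all(Squares()))                                 # [0, 1, 4]
# index_all(Ages()) raises KeyError, because 0 is not one of its keys.
```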

We special case tuple and list here since they are going to be quite common and are known to be the two most fundamental sequence types built into Python (as opposed to dict, which is not considered a sequence).

  2. There is no point in adding an empty .pyx file (unless I'm missing something)

If you don't have the .pyx file Cython won't actually generate a Python module that can be used by Python. You can still cimport cpdef functions defined in a .pxd file from other Cython code, but you can't use it in Python :/

Last edited 22 months ago by embray

comment:5 in reply to: ↑ 3 Changed 22 months ago by embray

Replying to jdemeyer:

  1. Is it possible for a subclass of list/tuple not to be a sequence? Just asking because you use FOO_CheckExact as opposed to FOO_Check.

From the C API perspective a subclass will still technically be a sequence, but there's no guarantee that it isn't broken in some way, and it's still necessary in those cases to go through the slower but more generic PySequence_ITEM calls, as opposed to PyList_GET_ITEM when accessing items, for example.

Likewise in issequence, subclasses have to go through the slightly slower path of calling PySequence_Length anyway, so using FOO_Check instead of FOO_CheckExact would only add overhead to the more common "exact" case while gaining nothing.
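The same distinction can be sketched at the Python level, where type(obj) is list plays roughly the role of PyList_CheckExact and isinstance(obj, list) that of PyList_Check. OddList is a contrived, hypothetical subclass whose overrides disagree with the underlying list storage, which is the kind of case where only the generic protocol calls can be trusted.

```python
class OddList(list):
    """Contrived list subclass with overridden __len__ and __getitem__."""

    def __len__(self):
        return 0

    def __getitem__(self, n):
        raise IndexError("always empty from the outside")


def is_exact_list(obj):
    # Roughly analogous to PyList_CheckExact.
    return type(obj) is list


def is_list(obj):
    # Roughly analogous to PyList_Check (accepts subclasses).
    return isinstance(obj, list)


plain = [1, 2, 3]
odd = OddList([1, 2, 3])

print(is_exact_list(plain), is_list(plain))  # True True
print(is_exact_list(odd), is_list(odd))      # False True

# The subclass reports one thing through its own methods ...
print(len(odd))                              # 0
# ... while the underlying list storage says another.
print(list.__len__(odd))                     # 3
```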

comment:6 Changed 22 months ago by embray

  • Description modified (diff)