#24244 closed enhancement (fixed)
Fast check for C long
Reported by: | jdemeyer | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.1 |
Component: | cython | Keywords: | |
Cc: | Merged in: | ||
Authors: | Jeroen Demeyer | Reviewers: | Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | 70592a8 (Commits, GitHub, GitLab) | Commit: | |
Dependencies: | #24252 | Stopgaps: |
Description (last modified by )
Implement a fast variant of pyobject_to_long
, which could be used to replace checks of the form isinstance(x, int)
.
The goal of these functions is to provide a single place to check for integers and convert them to a C long
(at least in Cython code, where performance matters most). Various places in Sage check for integers in different ways. First of all, this is an inconsistency which should be fixed.
In Python 3, the types int
/long
are changed, so many of the existing checks will need to be changed anyway. So we might as well provide one function which should be able to replace most of these checks. And ideally, we want to do this without losing too much performance. For that reason, I avoid PyLong_FromLong
in favour of a hand-written implementation.
The function pyobject_to_long
has signature
cdef bint integer_check_long(x, long* value, int* err)
It does several things: first of all, it checks whether x
is "integer-like". This means that it is either a Python int
or long
, a Sage Integer
or it supports __index__
. The return value of the function is whether or not it is an integer. If it is an integer, it tries to convert the value to a C long
and stores it in *value
. Errors are passed through the *err
variable.
This interface looks overcomplicated on first sight, but the complexity is needed to support the various use cases of this function. Also note that it is an inline
function, so the compiler should be able to optimize away the unused variables.
Use case 1: a function can take different kinds of input and want to check whether the input is an integer. Typically, this would be done with a long if
chain like
if isinstance(x, ...): ... elif isinstance(x, ...): ... elif integer_check_long(x, &value, &err): # Process value and err as needed elif isinstance(x, ...): ...
Use case 2: convert an integer to a C long
purely for performance. In this case, you want to treat overflow the same as the input not being an integer:
cdef int err = -1 integer_check_long(x, &value, &err) if err == 0: # Conversion successful, use value
Change History (32)
comment:1 Changed 5 years ago by
- Branch set to u/jdemeyer/fast_check_for_c_long
comment:2 Changed 5 years ago by
- Commit set to dd27cd5b59823f5f5ff7c1a80d0afc962c29dc7d
comment:3 Changed 5 years ago by
- Dependencies set to #24245
comment:4 Changed 5 years ago by
- Commit changed from dd27cd5b59823f5f5ff7c1a80d0afc962c29dc7d to b1ea838fc07baf8d6247a15f8db067ade9596e4a
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
b1ea838 | New functions integer_check_long() and integer_check_long_py()
|
comment:5 Changed 5 years ago by
- Status changed from new to needs_review
comment:6 Changed 5 years ago by
- Dependencies changed from #24245 to #24252
- Status changed from needs_review to needs_work
comment:7 Changed 5 years ago by
- Commit changed from b1ea838fc07baf8d6247a15f8db067ade9596e4a to 4d76bc133b7b93cf3a3baaad782caedee692ae0a
comment:8 Changed 5 years ago by
- Status changed from needs_work to needs_review
comment:9 follow-up: ↓ 11 Changed 5 years ago by
I think it would be better if integer_check_long
and integer_check_long_py
returned the error and 0
if it was completely successful. Then pyobject_to_long
would then interpret the error using the enum
and do the correct Python error raising. I feel like you are basically doing this, but in a roundabout way by passing err
.
comment:10 Changed 5 years ago by
Also, in #24248, you do not even use the return bint
of integer_check_long_py
.
comment:11 in reply to: ↑ 9 Changed 5 years ago by
- Description modified (diff)
comment:12 Changed 5 years ago by
- Commit changed from 4d76bc133b7b93cf3a3baaad782caedee692ae0a to 70592a850d7677928592c022760d6183bd81364f
Branch pushed to git repo; I updated commit sha1. New commits:
70592a8 | Fix compiler warning
|
comment:13 follow-ups: ↓ 14 ↓ 15 Changed 5 years ago by
Use case 2 could just become:
if integer_check_long(x, &value) == 0: # Conversion successful, use value
which feels much more natural to me.
However, I agree that use case 1 is a bit more of a hassle because you could not subsequently assign the error message (this wouldn't be a problem in, e.g., C because of the looser syntax rules). Although you can get around this by expanding out the elif
as an else-if
(but it looks fugly). Do you have a specific case where this does come up?
comment:14 in reply to: ↑ 13 Changed 5 years ago by
Replying to tscrim:
Use case 2 could just become:
if integer_check_long(x, &value) == 0: # Conversion successful, use valuewhich feels much more natural to me.
Sure, I can simplify use case 1 and use case 2 individually. But I see no way to simplify things in such a way that both use cases still work. Alternatively, I could simply add another layer for use case 2 and make a new function
if integer_check_long_no_overflow(x, &value): # Conversion successful, use value
Finally, let me stress that this complexity shouldn't affect performance since we are dealing with inline
functions and compilers are good at optimization.
comment:15 in reply to: ↑ 13 ; follow-up: ↓ 17 Changed 5 years ago by
comment:16 Changed 5 years ago by
Let me know what you would suggest to do with this ticket.
comment:17 in reply to: ↑ 15 ; follow-ups: ↓ 18 ↓ 19 Changed 5 years ago by
- Reviewers set to Travis Scrimshaw
Replying to jdemeyer:
Replying to tscrim:
Do you have a specific case where this does come up?
I don't know exactly what you mean, but you can look at my work in progress at #24247. There are various calls to
integer_check_long(_py)
there.
Do you have a place on a future ticket where use case 1 applies? I see use case 2 on #24247 and this one:
if integer_check_long_py(right, &value, &err): if not err: return (<Element>left)._pow_long(value) else: return (<Element>left)._pow_int(right)
which could be simplified to
err = integer_check_long_py(right, &value): if not err: return (<Element>left)._pow_long(value) elif err == ERR_OVERFLOW: return (<Element>left)._pow_int(right)
However, if you see a clear benefit to take on the more complicated function description and return values, then you can set this to a positive review. I just am thinking a little bit about the potential future maintenance of this function (I also don't think there will be any [noticeable] performance difference) and the places where it is used.
comment:18 in reply to: ↑ 17 Changed 5 years ago by
Here is a concrete proposal: we keep the complicated signature for now (if anything, simply to prevent conflicts with current tickets depending on this one). Once we have a clearer idea on how these functions are used, we can revisit and change things if needed.
comment:19 in reply to: ↑ 17 Changed 5 years ago by
Replying to tscrim:
I also don't think there will be any [noticeable] performance difference
Fun observation: compiler warnings can be used to get data on how the compiler is able to optimize the code. For example, in #24247 I'm writing essentially
cdef long value cdef int err if integer_check_long(y, &value, &err): if not err: return (<Element>x)._pow_long(value)
In this case, the compiler does not give a warning that err
and value
might be uninitialized. This means that the compiler is able to inline and optimize all code paths of integer_check_long
to prove that err
and value
are set to a known value whenever they are accessed in the code snippet above.
comment:20 Changed 5 years ago by
- Status changed from needs_review to positive_review
comment:21 follow-up: ↓ 22 Changed 5 years ago by
Oh, I just saw this from #24293 (I don't look at every ticket opened...)
As for "integer_fake" is there maybe a name for it that better reflects its purpose"? I was thinking something like "integer_proto" (for prototype) since providing early access to the Integer prototype seems to be its primary purpose for existing. "integer_fake" to me sounds like something for testing, or some other odd purpose.
comment:22 in reply to: ↑ 21 Changed 5 years ago by
Replying to embray:
As for "integer_fake" is there maybe a name for it that better reflects its purpose"? I was thinking something like "integer_proto" (for prototype) since providing early access to the Integer prototype seems to be its primary purpose for existing.
I think that integer_fake
or integer_proto
are equally meaningless. What would "prototype" mean in this context? I don't mind changing the name, but I'm not convinced with integer_proto
either.
"integer_fake" to me sounds like something for testing, or some other odd purpose.
"odd purpose" totally applies here :-)
comment:23 Changed 5 years ago by
Thinking more about it, I actually like how the name sounds like it should not be used. integer_proto
sounds too official somehow.
comment:24 Changed 5 years ago by
When I read proto
(type
), my first thought was the prototype design pattern, but that's right, in C it is called prototyping. However, since this is only mimicking that, I think fake
is a little more accurate as to the class's purpose.
comment:25 Changed 5 years ago by
I mean, it should be used, just for a specific narrow case. Alright I won't push on it; just something called "fake" is a little jarring to me to see outside of, say, test code.
comment:26 follow-up: ↓ 27 Changed 5 years ago by
BTW, I should add, this is really cool. Definitely something we should have done in the first place, in retrospect.
comment:27 in reply to: ↑ 26 Changed 5 years ago by
Replying to embray:
BTW, I should add, this is really cool. Definitely something we should have done in the first place, in retrospect.
Thanks!
comment:28 follow-up: ↓ 29 Changed 5 years ago by
Okay okay, what about "integer_decl", since it's basically a forward declaration IIUC.
comment:29 in reply to: ↑ 28 Changed 5 years ago by
Replying to embray:
Okay okay, what about "integer_decl", since it's basically a forward declaration IIUC.
It's a fake forward declaration which shouldn't be used.
comment:30 Changed 5 years ago by
- Branch changed from u/jdemeyer/fast_check_for_c_long to 70592a850d7677928592c022760d6183bd81364f
- Resolution set to fixed
- Status changed from positive_review to closed
comment:31 Changed 4 years ago by
- Branch changed from 70592a850d7677928592c022760d6183bd81364f to u/jmantysalo/70592a850d7677928592c022760d6183bd81364f
comment:32 Changed 4 years ago by
- Branch changed from u/jmantysalo/70592a850d7677928592c022760d6183bd81364f to 70592a850d7677928592c022760d6183bd81364f
- Commit 70592a850d7677928592c022760d6183bd81364f deleted
Arghs, my typo.
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
Move long.pxd to arith
New function integer_check_long()