Opened 5 years ago
Last modified 9 days ago
#12121 needs_review defect
floor/ceil can be very slow at integral values
Reported by:  dsm  Owned by:  AlexGhitza 

Priority:  major  Milestone:  sage7.3 
Component:  basic arithmetic  Keywords:  
Cc:  kcrisman, jdemeyer  Merged in:  
Authors:  Vincent Delecroix  Reviewers:  Marc Mezzarobba, Ralf Stephan 
Report Upstream:  N/A  Work issues:  
Branch:  u/vdelecroix/12121 (Commits)  Commit:  f66febd75b8702831db61585d3591575d024c23b 
Dependencies:  Stopgaps: 
Description (last modified by )
Reported (in slightly different form) on ask.sagemath.org:
sage: %timeit floor(log(3)/log(2)) 625 loops, best of 3: 586 µs per loop sage: %timeit floor(log(4)/log(2)) 5 loops, best of 3: 3.79 s per loop
This happens because ceil and floor first try to increase the precision of a coercion of the input argument to a RealInterval
by 100 bits from 53 to 20000 before finally trying a full_simplify, which succeeds. The RealInterval
rounds all fail because the interval is always of the form (2  epsilon, 2 + epsilon) and endpoints have different ceilings.
With the branch applied math.floor
and numpy.floor
are used directly
sage: floor(1.2r) 1.0 sage: type(_) <type 'float'>
which is distinct from the current Sage behavior
sage: floor(1.2r) 1 sage: type(_) <type 'sage.rings.integer.Integer'>
Change History (61)
comment:1 Changed 5 years ago by
 Description modified (diff)
comment:2 Changed 5 years ago by
 Description modified (diff)
comment:3 Changed 4 years ago by
comment:4 Changed 4 years ago by
Apparently is_int is deprecated.
comment:5 Changed 3 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:6 Changed 3 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:7 Changed 2 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:8 Changed 2 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:9 followup: ↓ 13 Changed 14 months ago by
I'm not sure if this should go to this ticket, but the following never returns:
sage: z (11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3)  1 sage: floor(z)
Even floor(z, maximum_bits=53)
loops infinitely.
Should I open a separate ticket?
comment:10 Changed 7 months ago by
 Branch set to u/ajagekar.akshay/Trac12121
comment:11 Changed 7 months ago by
 Commit set to 6b75d32ab5cc279a8566f0bc67acc2501bbb833e
 Status changed from new to needs_review
New commits:
6b75d32  Trac #12121: floor/ceil can be very slow at integral values

comment:12 Changed 6 months ago by
 Branch u/ajagekar.akshay/Trac12121 deleted
 Commit 6b75d32ab5cc279a8566f0bc67acc2501bbb833e deleted
comment:13 in reply to: ↑ 9 Changed 6 months ago by
Replying to zimmerma:
I'm not sure if this should go to this ticket, but the following never returns:
sage: z (11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3)  1 sage: floor(z)Even
floor(z, maximum_bits=53)
loops infinitely.
Whereas the following actually works
sage: bool(z == 1) True
Should I open a separate ticket?
I think that "very slow" includes "infinite amount of time". For me it is worth it to also fix this kind of endless loops in this ticket.
comment:14 followup: ↓ 15 Changed 6 months ago by
 Milestone changed from sage6.4 to sage7.1
 Status changed from needs_review to needs_work
@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).
comment:15 in reply to: ↑ 14 ; followup: ↓ 16 Changed 6 months ago by
Replying to vdelecroix:
@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).
Sorry but I pushed that branch without testing, some tests failed.
comment:16 in reply to: ↑ 15 Changed 6 months ago by
Replying to ajagekar.akshay:
Replying to vdelecroix:
@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).
Sorry but I pushed that branch without testing, some tests failed.
That is not a problem. Tickets can have different status: closed
, positive_review
, needs_review
, needs_work
, new
. If there is no branch associated to a ticket it makes no sense to keep it into "needs review" state. (I already changed it to needs_work)
comment:17 Changed 6 months ago by
Indeed, your code was good... and there is a bug somewhere else in the conversion from SR
to the real interval fields:
sage: RealIntervalField(256)(SR(10^50 + 10^(50))).is_int() (True, 100000000000000000000000000000000000000000000000000)
Sorry, I was wrong. The method is_int
is not intended to test if the interval is actually restricted to one integer! It only tests if the interval contains only one integer.
comment:18 Changed 6 months ago by
 Branch set to u/vdelecroix/12121
 Commit set to 9821cadaa866f28a0178ce4c75b03dbc82aabd30
 Status changed from needs_work to needs_review
New commits:
9821cad  Trac 12121: fix floor/ceil functions

comment:19 Changed 6 months ago by
 Commit changed from 9821cadaa866f28a0178ce4c75b03dbc82aabd30 to 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
800d0ee  Trac 12121: fix floor/ceil functions

comment:20 Changed 6 months ago by
 Commit changed from 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966 to 50bdaf3b31aecc326045e48524d2aa4d98186549
Branch pushed to git repo; I updated commit sha1. New commits:
50bdaf3  Trac 12121: work around a bug with symbolics

comment:21 Changed 6 months ago by
I pushed a tiny fix because we currently have
sage: log(8) / log(2) == 3 False
See the discussion at this sagedevel thread.
comment:22 Changed 5 months ago by
#15786 is a partial duplicate. I find the fix posted there a bit cleaner, though it needs to be rebased.
comment:23 Changed 5 months ago by
Rebased on what!?
There is one thing I am not happy with. I think we should try to make the interval much smaller than 1 in
while x_interval.absolute_diameter() >= 1: bits *= 2 x_interval = RealIntervalField(bits)(x)
before trying to simplify the expression. I guess that 2^{30} would be much more reasonable and avoid many attempts of (costly) simplification. What do you think?
comment:24 Changed 5 months ago by
Sorry if I wasn't clear. I'm saying that the other branch (the one at #15786) should be rebased.
comment:25 Changed 5 months ago by
Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of Expression.__nonzero__
or review #16397 and use cmp
.
comment:26 Changed 5 months ago by
 Commit changed from 50bdaf3b31aecc326045e48524d2aa4d98186549 to 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1
Branch pushed to git repo; I updated commit sha1. New commits:
688ebfa  Trac 12121: code improvements

comment:27 Changed 5 months ago by
Replying to rws:
Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of
Expression.__nonzero__
or review #16397 and usecmp
.
You mean that there is no way to check that expr1 == expr2
? I do not want to copy/paste anything from other place. If testing equality is not available, there is a big problem.
Using cmp
for that purpose is a bad idea. The purpose of cmp
in Python 2 is to sort things out. Not to compare.
comment:28 followup: ↓ 29 Changed 5 months ago by
Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in N()
, compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.
The idea to abuse cmp
is not mine. Somewhere here RLF(1) < RLF(sqrt(2))
for example, symbolic cmp
is called. Should I file a bug report for such usage?
EDIT: typos
comment:29 in reply to: ↑ 28 Changed 5 months ago by
Replying to rws:
Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy
It is undecidable even without variables.
The idea to abuse
cmp
is not mine. Somewhere hereRLF(1) < RLF(sqrt(2))
for example, symboliccmp
is called. Should I file a bug report for such usage?
I'd say it probably is a bug. As to whether you should file a bug report, I don't know—I doubt anyone really reads them, and the issue is probably already covered somewhere in the myriad of known bugs with comparisons...
comment:30 followup: ↓ 31 Changed 5 months ago by
Replying to rws:
Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in
N()
, compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.
What I need is a method that either returns a reliable answer True
or False
or raise an error which can either be This comparison is meaningless
or I don't know how to compare this
.
comment:31 in reply to: ↑ 30 ; followup: ↓ 33 Changed 5 months ago by
Replying to vdelecroix:
What I need is a method that either returns a reliable answer
True
orFalse
or raise an error which can either beThis comparison is meaningless
orI don't know how to compare this
.
Then bool( == )
is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.
comment:32 Changed 5 months ago by
 Commit changed from 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1 to a5ef94a5b27b302d813ac83538558b39a39c7267
Branch pushed to git repo; I updated commit sha1. New commits:
a5ef94a  Trac 12121: note about #19040 + extra if

comment:33 in reply to: ↑ 31 Changed 5 months ago by
Replying to rws:
Replying to vdelecroix:
What I need is a method that either returns a reliable answer
True
orFalse
or raise an error which can either beThis comparison is meaningless
orI don't know how to compare this
.Then
bool( == )
is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.
I added a note about #19040.
comment:34 Changed 5 months ago by
 Description modified (diff)
 Milestone changed from sage7.1 to sage7.2
comment:35 Changed 5 months ago by
 Reviewers set to Marc Mezzarobba
 Status changed from needs_review to needs_work
Hi Vincent,
Sorry for my unclear comments above, I didn't look closely enough at your code before posting them.
Beside the issue with comparisons raised by Ralf, I find the code on your branch a bit complicated, and I don't like the fact that you drop the maximum_bits
parameters at the risk of looping forever if you cannot prove that the input is an integer. I pushed to u/mmezzarobba/12121ceil
a rough attempt to fix these issues (in the case of ceil
only at the moment, and not well tested yet), please tell me what you think of it.
comment:36 Changed 5 months ago by
 Commit changed from a5ef94a5b27b302d813ac83538558b39a39c7267 to 60f0c291adf28344156661ea7623c5721ed4a06b
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
60f0c29  Trac 12121: Fix floor/ceil

comment:37 Changed 5 months ago by
 Status changed from needs_work to needs_review
All right. I factorized the two implementations in a new function incremental_rounding
. It is cleaner and the parameter maximum_bits
is reintroduced.
comment:38 Changed 5 months ago by
 Commit changed from 60f0c291adf28344156661ea7623c5721ed4a06b to d53c406171d73a0138d25336732728b41409c8b9
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
d53c406  Trac 12121: Fix floor/ceil

comment:39 Changed 5 months ago by
I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:
 why do you make
maximum_bits
anInteger
?  what don't you like about
unique_integer()
?  is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?
 why do you insist on using
==
on symbolic expressions instead ofis_trivial_zero()
?  are you sure you want to raise an error when
maximum_bits
does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...  do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?
comment:40 Changed 5 months ago by
 Commit changed from d53c406171d73a0138d25336732728b41409c8b9 to 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450
Branch pushed to git repo; I updated commit sha1. New commits:
0e9b2f2  Trac 12121: remove __call__ and fix round

comment:41 followups: ↓ 42 ↓ 49 Changed 5 months ago by
I removed the __call__
in both Function_floor
and Function_ceil
. The code is now much simpler. Though there was some adaptation needed in symbolic/expression.pyx
.
Replying to mmezzarobba:
I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:
 why do you make
maximum_bits
anInteger
?
all right. int
is fine as well.
 what don't you like about
unique_integer()
?
an assert
does not cost anything. And unique_integer
silently fails if the interval does not enclose a unique integer.
 is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?
the precision of what? there is the field used for the evaluation which is different from the diameter of the interval. If you have more than one integer in your interval which one are you using to test equality?
 why do you insist on using
==
on symbolic expressions instead ofis_trivial_zero()
?
Because I want to check equality with an integer. Not if it is a trivial equality.
 are you sure you want to raise an error when
maximum_bits
does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...
Done with an example.
 do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?
The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for round
you need to test equality with elements of ZZ + 1/2
.
comment:42 in reply to: ↑ 41 Changed 5 months ago by
Replying to vdelecroix:
Replying to mmezzarobba:
 do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?
The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for
round
you need to test equality with elements ofZZ + 1/2
.
And I also would like to use the very same function incremental_rounding
for elements of QQbar
. For the very same reason, you only want very lately the equality test.
comment:43 Changed 5 months ago by
 Commit changed from 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450 to 313c497daaec6324dfe4ffdeb684844e301a3f61
Branch pushed to git repo; I updated commit sha1. New commits:
313c497  Trac 12121: fix doctests

comment:44 Changed 5 months ago by
 Description modified (diff)
comment:45 Changed 4 months ago by
ping?!
comment:46 Changed 4 months ago by
Sorry, I have about zero time for Sage development before at least 12 weeks. All I can say is that I wasn't completely convinced by your answers and would need to think things over more carefully. If someone wants to review the ticket in the meantime, please do.
comment:47 Changed 4 months ago by
 Commit changed from 313c497daaec6324dfe4ffdeb684844e301a3f61 to 602e5158c5f37c6900a1dee8e27a228114af1319
comment:48 Changed 4 months ago by
rebased on 72.rc1
comment:49 in reply to: ↑ 41 ; followups: ↓ 54 ↓ 55 Changed 3 months ago by
 Status changed from needs_review to needs_work
Okay, I'm back. Sorry that it took so long.
Replying to vdelecroix:
 what don't you like about
unique_integer()
?an
assert
does not cost anything. Andunique_integer
silently fails if the interval does not enclose a unique integer.
Uh? No, it doesn't.
 is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?
the precision of what? there is the field used for the evaluation which is different from the diameter of the interval. If you have more than one integer in your interval which one are you using to test equality?
The precision of the interval computation, i.e. bits
. The idea being that if we found an interval of width (say) 2⁻²⁰ containing an integer by computing with 1000 bits of precision, we may want to see if the width of the interval keeps decreasing when the precision increases and only run the symbolic part of the algorithm if that's the case. But I agree that this is a very minor issue at best.
 why do you insist on using
==
on symbolic expressions instead ofis_trivial_zero()
?Because I want to check equality with an integer. Not if it is a trivial equality.
I mean using is_trivial_zero()
after calling full_simplify()
& co, as in my version: this is safer than relying on ==
and essentially as powerful.
 do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?
The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version.
Yes—as I said, my version was just a rough sketch of the changes I'd b tempted to make, nothing finished. As the code I was starting with used to loop forever in the typical situation where mine would do the symbolic test repeatedly (that is, x
∈ ℤ integer but we don't manage to prove it), I thought the additional cost would be acceptable. ;)
But anyway, it is not hard to do the test only once while avoiding the code duplication.
On a related note, I noticed that for
round
you need to test equality with elements ofZZ + 1/2
.
Couldn't you just compute ceil(x1/2)?
Now for some comments on the current code:
 To summarize the above, I still think the main logic in
incremental_rounding()
could be shortened to something like (not tested):unique_rounding = getattr(RealIntervalFieldElement, 'unique_' + mode) r = RR.one() >> 20 bits = 64 candidate = None while bits < maximum_bits: interval = RealIntervalField(bits)(x) # may raise a TypeError try: return unique_rounding(interval) except ValueError: pass if candidate is None and interval.absolute_diameter() > r: candidate = interval.unique_integer() try: delta = x  candidate if (delta.is_zero() or SR(delta).full_simplify().canonicalize_radical() .is_trivial_zero() or QQbar(delta).is_zero()): return candidate except (TypeError, ValueError): pass bits *= 2
 I'd also make
incremental_rounding()
private and dispense with the argument checking (and perhaps move it toreal_mpfi
if your plan is to use it from elsewhere)—but I'm okay with keeping it as it is. An advantage of making it private is that you could take the “unique rounding” function on intervals as input instead of accessing it viagetattr()
. Another option would be to introduce a common base class forFunction_floor
,Function_ceil
andFunction_round
.
 I'm a little uneasy about the changes you made to
BuiltinFunction.__call__()
. The fact that it used to convert nonElement inputs to Elements looks intentional and pretty reasonable to me. Is it really necessary to change that behavior? That being said, I'm a bit lost in the maze ofFunction.__call__
,BuiltinFunction.__call__
,_eval_
,_evalf_
,_evalf_try_
and friends, so if you tell me you are confident that the change is correct I'll trust you!
 If these changes stay, then I guess this
if module is not None: func = getattr(module, self._name, None) if func is None and self._alt_name is not None: func = getattr(module, self._name, None) ^^^^^
should be_alt_name
.
 Note that these changes also make
sage: sin(numpy.int32(0)) 0.0
which is at odds withsage: sin(ZZ(0)).parent() Integer Ring
(perhaps not ideal, but predictable at least).
 In
Function_*
, what is the point of calling_evalf_()
from_eval_()
?
 And why isn't the logic for choosing
maximum_bits
in_evalf_()
the same infloor
,ceil
andround
?
 I wouldn't bother with checking that
x
is not a relation. First,_eval_()
methods of individual functions are probably not the right place for that (eitherBuiltinFunction.__call__()
or perhaps methodsceil()
,floor()
etc. in a future subclassRelationalExpression
ofExpression
would be more reasonable). Besides, various other nonsensical inputs (e.g. series, booleans) are accepted without error, so it is a bit strange to have an ad hoc check dealing with this one.
comment:50 followups: ↓ 51 ↓ 53 Changed 3 months ago by
You may be interested in #20624. It looks like implementing _evalf_
by calling _eval_
is VERY bad: currently evaluation of ceil
may lead to running out the python call stack before doing anything useful. Obviously, this takes some time. Inheriting from BuiltinFunction
is a real bugtrap: its init reassigns a whole bunch of methods.
comment:51 in reply to: ↑ 50 Changed 3 months ago by
 Cc kcrisman. jdemeyer added; kcrisman removed
Replying to nbruin:
Inheriting from
BuiltinFunction
is a real bugtrap: its init reassigns a whole bunch of methods.
That must be the reason why Sage crashes all the time. Seriously, I agree the construction of functions is messy and it limits the developer somewhat, see "other symbolic function tickets" in http://trac.sagemath.org/wiki/symbolics/functions. Note also there are a bunch of tickets needing review there. However, I think the design is sound. You just have to read how other functions (the more recently implemented) are using it. In the end, I or someone will be transferring the Python you write to Pynac, anyway.
Cc: the author of the _evalf_try_
mechanism.
comment:52 Changed 3 months ago by
 Branch changed from u/vdelecroix/12121 to public/12121
comment:53 in reply to: ↑ 50 Changed 3 months ago by
 Commit changed from 602e5158c5f37c6900a1dee8e27a228114af1319 to 9bd70406da71c6e1083fc25cd57435788689018e
Replying to nbruin:
You may be interested in #20624. It looks like implementing
_evalf_
by calling_eval_
is VERY bad: currently evaluation ofceil
may lead to running out the python call stack before doing anything useful.
I may be mistaken but actually _eval_
calls _evalf_
here which is a completely different matter.
New commits:
9bd7040  Merge branch 'develop' into t/12121/12121

comment:54 in reply to: ↑ 49 Changed 6 weeks ago by
Replying to mmezzarobba:
On a related note, I noticed that for
round
you need to test equality with elements ofZZ + 1/2
.Couldn't you just compute ceil(x1/2)?
sage: x = 0.5 sage: print x.round() == (x0.5).ceil() False
comment:55 in reply to: ↑ 49 Changed 6 weeks ago by
Replying to mmezzarobba:
 In
Function_*
, what is the point of calling_evalf_()
from_eval_()
?
Because I want the answer of floor(pi)
to be 3
.
comment:56 Changed 6 weeks ago by
 Cc kcrisman added; kcrisman. removed
 Milestone changed from sage7.2 to sage7.3
 Status changed from needs_work to needs_review
comment:57 Changed 6 weeks ago by
 Branch changed from public/12121 to u/vdelecroix/12121
 Commit changed from 9bd70406da71c6e1083fc25cd57435788689018e to 6f392b8e7b2d562debabca7f99fa1b54180f83cb
 Description modified (diff)
comment:58 Changed 6 weeks ago by
 Reviewers changed from Marc Mezzarobba to Marc Mezzarobba, Ralf Stephan
Function
mechanics looking very good. Can't comment on the incremental_rounding()
function part.
comment:59 Changed 2 weeks ago by
 Dependencies set to #21216
 Status changed from needs_review to needs_work
comment:60 Changed 9 days ago by
 Commit changed from 6f392b8e7b2d562debabca7f99fa1b54180f83cb to f66febd75b8702831db61585d3591575d024c23b
comment:61 Changed 9 days ago by
 Dependencies #21216 deleted
 Status changed from needs_work to needs_review
Rebased on the last beta (which includes #21216).
I slightly modified incremental_rounding
. The simplification part is implemented outside. As soon as there is a reliable is_zero
available for elements of SR (as it is the case for QQbar
with is_zero
) we could make it cleaner.
Note to self (and others): we can use is_int to decide when we should test for equality. Test (once) the first time there's only one integer in the interval. If you have equality there, you're done. If not, you never will (or won't be able to prove it, anyway) and can carry on.