Opened 4 years ago

# floor/ceil can be very slow at integral values

Reported by: Owned by: dsm AlexGhitza major sage-7.2 basic arithmetic kcrisman., jdemeyer Vincent Delecroix Marc Mezzarobba N/A u/vdelecroix/12121 (Commits) 602e5158c5f37c6900a1dee8e27a228114af1319

Reported (in slightly different form) on ask.sagemath.org:

```sage: %timeit floor(log(3)/log(2))
625 loops, best of 3: 586 µs per loop
sage: %timeit floor(log(4)/log(2))
5 loops, best of 3: 3.79 s per loop
```

This happens because ceil and floor first try to increase the precision of a coercion of the input argument to a RealInterval by 100 bits from 53 to 20000 before finally trying a full_simplify, which succeeds. The RealInterval rounds all fail because the interval is always of the form (2 - epsilon, 2 + epsilon) and endpoints have different ceilings.

The proposal is to:

• first found an interval of reasonably small diameter (arbitrarily set to 2-30) and see whether this is enough to decide the ceiling
• then check equality with the only available candidate (possibly doing some simplification)
• start further refinement of the interval

With the branch applied math.floor and numpy.floor are used directly

```sage: floor(1.2r)
1.0
sage: type(_)
<type 'float'>
```

which is distinct from the current Sage behavior

```sage: floor(1.2r)
1
sage: type(_)
<type 'sage.rings.integer.Integer'>
```

### comment:1 Changed 4 years ago by dsm

• Description modified (diff)

### comment:2 Changed 4 years ago by dsm

• Description modified (diff)

### comment:3 Changed 4 years ago by dsm

Note to self (and others): we can use is_int to decide when we should test for equality. Test (once) the first time there's only one integer in the interval. If you have equality there, you're done. If not, you never will (or won't be able to prove it, anyway) and can carry on.

### comment:5 Changed 3 years ago by jdemeyer

• Milestone changed from sage-5.11 to sage-5.12

### comment:6 Changed 2 years ago by vbraun_spam

• Milestone changed from sage-6.1 to sage-6.2

### comment:7 Changed 2 years ago by vbraun_spam

• Milestone changed from sage-6.2 to sage-6.3

### comment:8 Changed 22 months ago by vbraun_spam

• Milestone changed from sage-6.3 to sage-6.4

### comment:9 follow-up: ↓ 13 Changed 11 months ago by zimmerma

I'm not sure if this should go to this ticket, but the following never returns:

```sage: z
(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) - 1
sage: floor(z)
```

Even floor(z, maximum_bits=53) loops infinitely.
Should I open a separate ticket?

### comment:10 Changed 3 months ago by ajagekar.akshay

• Branch set to u/ajagekar.akshay/Trac12121

### comment:11 Changed 3 months ago by ajagekar.akshay

• Authors set to Akshay Ajagekar
• Commit set to 6b75d32ab5cc279a8566f0bc67acc2501bbb833e
• Status changed from new to needs_review

New commits:

 ​6b75d32 Trac #12121: floor/ceil can be very slow at integral values

### comment:12 Changed 3 months ago by ajagekar.akshay

• Authors Akshay Ajagekar deleted
• Branch u/ajagekar.akshay/Trac12121 deleted
• Commit 6b75d32ab5cc279a8566f0bc67acc2501bbb833e deleted

### comment:13 in reply to: ↑ 9 Changed 3 months ago by vdelecroix

I'm not sure if this should go to this ticket, but the following never returns:

```sage: z
(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) - 1
sage: floor(z)
```

Even floor(z, maximum_bits=53) loops infinitely.

Whereas the following actually works

```sage: bool(z == 1)
True
```

Should I open a separate ticket?

I think that "very slow" includes "infinite amount of time". For me it is worth it to also fix this kind of endless loops in this ticket.

### comment:14 follow-up: ↓ 15 Changed 3 months ago by vdelecroix

• Milestone changed from sage-6.4 to sage-7.1
• Status changed from needs_review to needs_work

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

### comment:15 in reply to: ↑ 14 ; follow-up: ↓ 16 Changed 3 months ago by ajagekar.akshay

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

Sorry but I pushed that branch without testing, some tests failed.

### comment:16 in reply to: ↑ 15 Changed 3 months ago by vdelecroix

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

Sorry but I pushed that branch without testing, some tests failed.

That is not a problem. Tickets can have different status: closed, positive_review, needs_review, needs_work, new. If there is no branch associated to a ticket it makes no sense to keep it into "needs review" state. (I already changed it to needs_work)

### comment:17 Changed 3 months ago by vdelecroix

Indeed, your code was good... and there is a bug somewhere else in the conversion from SR to the real interval fields:

```sage: RealIntervalField(256)(SR(10^50 + 10^(-50))).is_int()
(True, 100000000000000000000000000000000000000000000000000)
```

Sorry, I was wrong. The method is_int is not intended to test if the interval is actually restricted to one integer! It only tests if the interval contains only one integer.

Last edited 3 months ago by vdelecroix (previous) (diff)

### comment:18 Changed 3 months ago by vdelecroix

• Authors set to Vincent Delecroix
• Branch set to u/vdelecroix/12121
• Status changed from needs_work to needs_review

New commits:

 ​9821cad Trac 12121: fix floor/ceil functions

### comment:19 Changed 3 months ago by git

• Commit changed from 9821cadaa866f28a0178ce4c75b03dbc82aabd30 to 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​800d0ee Trac 12121: fix floor/ceil functions

### comment:20 Changed 3 months ago by git

• Commit changed from 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966 to 50bdaf3b31aecc326045e48524d2aa4d98186549

Branch pushed to git repo; I updated commit sha1. New commits:

 ​50bdaf3 Trac 12121: work around a bug with symbolics

### comment:21 Changed 3 months ago by vdelecroix

I pushed a tiny fix because we currently have

```sage: log(8) / log(2) == 3
False
```

See the discussion at this sage-devel thread.

### comment:22 Changed 2 months ago by mmezzarobba

#15786 is a partial duplicate. I find the fix posted there a bit cleaner, though it needs to be rebased.

Last edited 2 months ago by mmezzarobba (previous) (diff)

### comment:23 Changed 2 months ago by vdelecroix

Rebased on what!?

There is one thing I am not happy with. I think we should try to make the interval much smaller than 1 in

```while x_interval.absolute_diameter() >= 1:
bits *= 2
x_interval = RealIntervalField(bits)(x)
```

before trying to simplify the expression. I guess that 2-30 would be much more reasonable and avoid many attempts of (costly) simplification. What do you think?

### comment:24 Changed 2 months ago by mmezzarobba

Sorry if I wasn't clear. I'm saying that the other branch (the one at #15786) should be rebased.

### comment:25 Changed 2 months ago by rws

Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of Expression.__nonzero__ or review #16397 and use cmp.

### comment:26 Changed 2 months ago by git

• Commit changed from 50bdaf3b31aecc326045e48524d2aa4d98186549 to 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1

Branch pushed to git repo; I updated commit sha1. New commits:

 ​688ebfa Trac 12121: code improvements

### comment:27 Changed 2 months ago by vdelecroix

Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of Expression.__nonzero__ or review #16397 and use cmp.

You mean that there is no way to check that expr1 == expr2? I do not want to copy/paste anything from other place. If testing equality is not available, there is a big problem.

Using cmp for that purpose is a bad idea. The purpose of cmp in Python 2 is to sort things out. Not to compare.

### comment:28 follow-up: ↓ 29 Changed 2 months ago by rws

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in N(), compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.

The idea to abuse cmp is not mine. Somewhere here RLF(1) < RLF(sqrt(2)) for example, symbolic cmp is called. Should I file a bug report for such usage?

EDIT: typos

Last edited 2 months ago by rws (previous) (diff)

### comment:29 in reply to: ↑ 28 Changed 2 months ago by mmezzarobba

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy

It is undecidable even without variables.

The idea to abuse cmp is not mine. Somewhere here RLF(1) < RLF(sqrt(2)) for example, symbolic cmp is called. Should I file a bug report for such usage?

I'd say it probably is a bug. As to whether you should file a bug report, I don't know—I doubt anyone really reads them, and the issue is probably already covered somewhere in the myriad of known bugs with comparisons...

### comment:30 follow-up: ↓ 31 Changed 2 months ago by vdelecroix

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in N(), compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

### comment:31 in reply to: ↑ 30 ; follow-up: ↓ 33 Changed 2 months ago by rws

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

Then bool( == ) is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.

### comment:32 Changed 2 months ago by git

• Commit changed from 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1 to a5ef94a5b27b302d813ac83538558b39a39c7267

Branch pushed to git repo; I updated commit sha1. New commits:

 ​a5ef94a Trac 12121: note about #19040 + extra if

### comment:33 in reply to: ↑ 31 Changed 2 months ago by vdelecroix

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

Then bool( == ) is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.

### comment:34 Changed 2 months ago by vdelecroix

• Description modified (diff)
• Milestone changed from sage-7.1 to sage-7.2

### comment:35 Changed 2 months ago by mmezzarobba

• Reviewers set to Marc Mezzarobba
• Status changed from needs_review to needs_work

Hi Vincent,

Sorry for my unclear comments above, I didn't look closely enough at your code before posting them.

Beside the issue with comparisons raised by Ralf, I find the code on your branch a bit complicated, and I don't like the fact that you drop the maximum_bits parameters at the risk of looping forever if you cannot prove that the input is an integer. I pushed to u/mmezzarobba/12121-ceil a rough attempt to fix these issues (in the case of ceil only at the moment, and not well tested yet), please tell me what you think of it.

### comment:36 Changed 7 weeks ago by git

• Commit changed from a5ef94a5b27b302d813ac83538558b39a39c7267 to 60f0c291adf28344156661ea7623c5721ed4a06b

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​60f0c29 Trac 12121: Fix floor/ceil

### comment:37 Changed 7 weeks ago by vdelecroix

• Status changed from needs_work to needs_review

All right. I factorized the two implementations in a new function incremental_rounding. It is cleaner and the parameter maximum_bits is reintroduced.

### comment:38 Changed 7 weeks ago by git

• Commit changed from 60f0c291adf28344156661ea7623c5721ed4a06b to d53c406171d73a0138d25336732728b41409c8b9

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​d53c406 Trac 12121: Fix floor/ceil

### comment:39 Changed 7 weeks ago by mmezzarobba

I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:

• why do you make maximum_bits an Integer?
• what don't you like about unique_integer()?
• is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?
• why do you insist on using == on symbolic expressions instead of is_trivial_zero()?
• are you sure you want to raise an error when maximum_bits does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...
• do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

### comment:40 Changed 7 weeks ago by git

• Commit changed from d53c406171d73a0138d25336732728b41409c8b9 to 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450

Branch pushed to git repo; I updated commit sha1. New commits:

 ​0e9b2f2 Trac 12121: remove __call__ and fix round

### comment:41 follow-ups: ↓ 42 ↓ 49 Changed 7 weeks ago by vdelecroix

I removed the __call__ in both Function_floor and Function_ceil. The code is now much simpler. Though there was some adaptation needed in symbolic/expression.pyx.

I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:

• why do you make maximum_bits an Integer?

all right. int is fine as well.

• what don't you like about unique_integer()?

an assert does not cost anything. And unique_integer silently fails if the interval does not enclose a unique integer.

• is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?

the precision of what? there is the field used for the evaluation which is different from the diameter of the interval. If you have more than one integer in your interval which one are you using to test equality?

• why do you insist on using == on symbolic expressions instead of is_trivial_zero()?

Because I want to check equality with an integer. Not if it is a trivial equality.

• are you sure you want to raise an error when maximum_bits does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...

Done with an example.

• do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for round you need to test equality with elements of ZZ + 1/2.

### comment:42 in reply to: ↑ 41 Changed 7 weeks ago by vdelecroix

• do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for round you need to test equality with elements of ZZ + 1/2.

And I also would like to use the very same function incremental_rounding for elements of QQbar. For the very same reason, you only want very lately the equality test.

### comment:43 Changed 7 weeks ago by git

• Commit changed from 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450 to 313c497daaec6324dfe4ffdeb684844e301a3f61

Branch pushed to git repo; I updated commit sha1. New commits:

 ​313c497 Trac 12121: fix doctests

### comment:44 Changed 7 weeks ago by vdelecroix

• Description modified (diff)

ping?!

### comment:46 Changed 4 weeks ago by mmezzarobba

Sorry, I have about zero time for Sage development before at least 1-2 weeks. All I can say is that I wasn't completely convinced by your answers and would need to think things over more carefully. If someone wants to review the ticket in the meantime, please do.

### comment:47 Changed 2 weeks ago by git

• Commit changed from 313c497daaec6324dfe4ffdeb684844e301a3f61 to 602e5158c5f37c6900a1dee8e27a228114af1319

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

 ​9bfef80 Trac 12121: Fix floor/ceil ​c8abe9c Trac 12121: remove __call__ and fix round ​602e515 Trac 12121: fix doctests

### comment:48 Changed 2 weeks ago by vdelecroix

rebased on 7-2.rc1

### comment:49 in reply to: ↑ 41 Changed 7 days ago by mmezzarobba

• Status changed from needs_review to needs_work

Okay, I'm back. Sorry that it took so long.

• what don't you like about unique_integer()?

an assert does not cost anything. And unique_integer silently fails if the interval does not enclose a unique integer.

Uh? No, it doesn't.

• is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?

the precision of what? there is the field used for the evaluation which is different from the diameter of the interval. If you have more than one integer in your interval which one are you using to test equality?

The precision of the interval computation, i.e. bits. The idea being that if we found an interval of width (say) 2⁻²⁰ containing an integer by computing with 1000 bits of precision, we may want to see if the width of the interval keeps decreasing when the precision increases and only run the symbolic part of the algorithm if that's the case. But I agree that this is a very minor issue at best.

• why do you insist on using == on symbolic expressions instead of is_trivial_zero()?

Because I want to check equality with an integer. Not if it is a trivial equality.

I mean using is_trivial_zero() after calling full_simplify() & co, as in my version: this is safer than relying on == and essentially as powerful.

• do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version.

Yes—as I said, my version was just a rough sketch of the changes I'd b tempted to make, nothing finished. As the code I was starting with used to loop forever in the typical situation where mine would do the symbolic test repeatedly (that is, x ∈ ℤ integer but we don't manage to prove it), I thought the additional cost would be acceptable. ;-) But anyway, it is not hard to do the test only once while avoiding the code duplication.

On a related note, I noticed that for round you need to test equality with elements of ZZ + 1/2.

Couldn't you just compute ceil(x-1/2)?

Now for some comments on the current code:

• To summarize the above, I still think the main logic in incremental_rounding() could be shortened to something like (not tested):
```    unique_rounding = getattr(RealIntervalFieldElement, 'unique_' + mode)
r = RR.one() >> 20

bits = 64
candidate = None
while bits < maximum_bits:
interval = RealIntervalField(bits)(x) # may raise a TypeError
try:
return unique_rounding(interval)
except ValueError:
pass
if candidate is None and interval.absolute_diameter() > r:
candidate = interval.unique_integer()
try:
delta = x - candidate
if (delta.is_zero()
.is_trivial_zero()
or QQbar(delta).is_zero()):
return candidate
except (TypeError, ValueError):
pass
bits *= 2
```
• I'd also make incremental_rounding() private and dispense with the argument checking (and perhaps move it to real_mpfi if your plan is to use it from elsewhere)—but I'm okay with keeping it as it is. An advantage of making it private is that you could take the “unique rounding” function on intervals as input instead of accessing it via getattr(). Another option would be to introduce a common base class for Function_floor, Function_ceil and Function_round.
• I'm a little uneasy about the changes you made to BuiltinFunction.__call__(). The fact that it used to convert non-Element inputs to Elements looks intentional and pretty reasonable to me. Is it really necessary to change that behavior? That being said, I'm a bit lost in the maze of Function.__call__, BuiltinFunction.__call__, _eval_, _evalf_, _evalf_try_ and friends, so if you tell me you are confident that the change is correct I'll trust you!
• If these changes stay, then I guess this
```if module is not None:
func = getattr(module, self._name, None)
if func is None and self._alt_name is not None:
func = getattr(module, self._name, None)
^^^^^
```
should be _alt_name.
• Note that these changes also make
```sage: sin(numpy.int32(0))
0.0
```
which is at odds with
```sage: sin(ZZ(0)).parent()
Integer Ring
```
(perhaps not ideal, but predictable at least).
• In Function_*, what is the point of calling _evalf_() from _eval_()?
• And why isn't the logic for choosing maximum_bits in _evalf_() the same in floor, ceil and round?
• I wouldn't bother with checking that x is not a relation. First, _eval_() methods of individual functions are probably not the right place for that (either BuiltinFunction.__call__() or perhaps methods ceil(), floor() etc. in a future subclass RelationalExpression of Expression would be more reasonable). Besides, various other nonsensical inputs (e.g. series, booleans) are accepted without error, so it is a bit strange to have an ad hoc check dealing with this one.

### comment:50 follow-up: ↓ 51 Changed 5 days ago by nbruin

You may be interested in #20624. It looks like implementing _evalf_ by calling _eval_ is VERY bad: currently evaluation of ceil may lead to running out the python call stack before doing anything useful. Obviously, this takes some time. Inheriting from BuiltinFunction is a real bugtrap: its init reassigns a whole bunch of methods.

### comment:51 in reply to: ↑ 50 Changed 5 days ago by rws

• Cc kcrisman. jdemeyer added; kcrisman removed