Opened 4 years ago

Last modified 7 days ago

#12121 needs_review defect

floor/ceil can be very slow at integral values

Reported by: dsm Owned by: AlexGhitza
Priority: major Milestone: sage-7.2
Component: basic arithmetic Keywords:
Cc: kcrisman Merged in:
Authors: Vincent Delecroix Reviewers: Marc Mezzarobba
Report Upstream: N/A Work issues:
Branch: u/vdelecroix/12121 (Commits) Commit: 313c497daaec6324dfe4ffdeb684844e301a3f61
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

Reported (in slightly different form) on ask.sagemath.org:

sage: %timeit floor(log(3)/log(2))
625 loops, best of 3: 586 µs per loop
sage: %timeit floor(log(4)/log(2))
5 loops, best of 3: 3.79 s per loop

This happens because ceil and floor first try to increase the precision of a coercion of the input argument to a RealInterval by 100 bits from 53 to 20000 before finally trying a full_simplify, which succeeds. The RealInterval rounds all fail because the interval is always of the form (2 - epsilon, 2 + epsilon) and endpoints have different ceilings.

The proposal is to:

  • first found an interval of reasonably small diameter (arbitrarily set to 2-30) and see whether this is enough to decide the ceiling
  • then check equality with the only available candidate (possibly doing some simplification)
  • start further refinement of the interval

With the branch applied math.floor and numpy.floor are used directly

sage: floor(1.2r)
1.0
sage: type(_)
<type 'float'>

which is distinct from the current Sage behavior

sage: floor(1.2r)
1
sage: type(_)
<type 'sage.rings.integer.Integer'>

Change History (46)

comment:1 Changed 4 years ago by dsm

  • Description modified (diff)

comment:2 Changed 4 years ago by dsm

  • Description modified (diff)

comment:3 Changed 4 years ago by dsm

Note to self (and others): we can use is_int to decide when we should test for equality. Test (once) the first time there's only one integer in the interval. If you have equality there, you're done. If not, you never will (or won't be able to prove it, anyway) and can carry on.

comment:4 Changed 4 years ago by kini

comment:5 Changed 3 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:6 Changed 2 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:7 Changed 2 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:8 Changed 21 months ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:9 follow-up: Changed 10 months ago by zimmerma

I'm not sure if this should go to this ticket, but the following never returns:

sage: z
(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) - 1
sage: floor(z)

Even floor(z, maximum_bits=53) loops infinitely.
Should I open a separate ticket?

comment:10 Changed 3 months ago by ajagekar.akshay

  • Branch set to u/ajagekar.akshay/Trac12121

comment:11 Changed 3 months ago by ajagekar.akshay

  • Authors set to Akshay Ajagekar
  • Commit set to 6b75d32ab5cc279a8566f0bc67acc2501bbb833e
  • Status changed from new to needs_review

New commits:

6b75d32Trac #12121: floor/ceil can be very slow at integral values

comment:12 Changed 2 months ago by ajagekar.akshay

  • Authors Akshay Ajagekar deleted
  • Branch u/ajagekar.akshay/Trac12121 deleted
  • Commit 6b75d32ab5cc279a8566f0bc67acc2501bbb833e deleted

comment:13 in reply to: ↑ 9 Changed 2 months ago by vdelecroix

Replying to zimmerma:

I'm not sure if this should go to this ticket, but the following never returns:

sage: z
(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) + 1/3/(11/9*sqrt(3)*sqrt(2) + 3)^(1/3) - 1
sage: floor(z)

Even floor(z, maximum_bits=53) loops infinitely.

Whereas the following actually works

sage: bool(z == 1)
True

Should I open a separate ticket?

I think that "very slow" includes "infinite amount of time". For me it is worth it to also fix this kind of endless loops in this ticket.

comment:14 follow-up: Changed 2 months ago by vdelecroix

  • Milestone changed from sage-6.4 to sage-7.1
  • Status changed from needs_review to needs_work

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

comment:15 in reply to: ↑ 14 ; follow-up: Changed 2 months ago by ajagekar.akshay

Replying to vdelecroix:

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

Sorry but I pushed that branch without testing, some tests failed.

comment:16 in reply to: ↑ 15 Changed 2 months ago by vdelecroix

Replying to ajagekar.akshay:

Replying to vdelecroix:

@ajagekar.akshay: why did you remove your branch? The commit lacks some examples (as the one of comment:9).

Sorry but I pushed that branch without testing, some tests failed.

That is not a problem. Tickets can have different status: closed, positive_review, needs_review, needs_work, new. If there is no branch associated to a ticket it makes no sense to keep it into "needs review" state. (I already changed it to needs_work)

comment:17 Changed 2 months ago by vdelecroix

Indeed, your code was good... and there is a bug somewhere else in the conversion from SR to the real interval fields:

sage: RealIntervalField(256)(SR(10^50 + 10^(-50))).is_int()
(True, 100000000000000000000000000000000000000000000000000)

Sorry, I was wrong. The method is_int is not intended to test if the interval is actually restricted to one integer! It only tests if the interval contains only one integer.

Last edited 2 months ago by vdelecroix (previous) (diff)

comment:18 Changed 2 months ago by vdelecroix

  • Authors set to Vincent Delecroix
  • Branch set to u/vdelecroix/12121
  • Commit set to 9821cadaa866f28a0178ce4c75b03dbc82aabd30
  • Status changed from needs_work to needs_review

New commits:

9821cadTrac 12121: fix floor/ceil functions

comment:19 Changed 2 months ago by git

  • Commit changed from 9821cadaa866f28a0178ce4c75b03dbc82aabd30 to 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

800d0eeTrac 12121: fix floor/ceil functions

comment:20 Changed 8 weeks ago by git

  • Commit changed from 800d0ee77a1628f92faaf1b1e159d4aa3f1ed966 to 50bdaf3b31aecc326045e48524d2aa4d98186549

Branch pushed to git repo; I updated commit sha1. New commits:

50bdaf3Trac 12121: work around a bug with symbolics

comment:21 Changed 8 weeks ago by vdelecroix

I pushed a tiny fix because we currently have

sage: log(8) / log(2) == 3
False

See the discussion at this sage-devel thread.

comment:22 Changed 6 weeks ago by mmezzarobba

#15786 is a partial duplicate. I find the fix posted there a bit cleaner, though it needs to be rebased.

Last edited 6 weeks ago by mmezzarobba (previous) (diff)

comment:23 Changed 6 weeks ago by vdelecroix

Rebased on what!?

There is one thing I am not happy with. I think we should try to make the interval much smaller than 1 in

while x_interval.absolute_diameter() >= 1:
    bits *= 2
    x_interval = RealIntervalField(bits)(x)

before trying to simplify the expression. I guess that 2-30 would be much more reasonable and avoid many attempts of (costly) simplification. What do you think?

comment:24 Changed 6 weeks ago by mmezzarobba

Sorry if I wasn't clear. I'm saying that the other branch (the one at #15786) should be rebased.

comment:25 Changed 6 weeks ago by rws

Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of Expression.__nonzero__ or review #16397 and use cmp.

comment:26 Changed 6 weeks ago by git

  • Commit changed from 50bdaf3b31aecc326045e48524d2aa4d98186549 to 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1

Branch pushed to git repo; I updated commit sha1. New commits:

688ebfaTrac 12121: code improvements

comment:27 Changed 6 weeks ago by vdelecroix

Replying to rws:

Be aware you might be using symbolics when doing bool( == ). To not end up with surprises you might want to break out the actual functionality you need out of Expression.__nonzero__ or review #16397 and use cmp.

You mean that there is no way to check that expr1 == expr2? I do not want to copy/paste anything from other place. If testing equality is not available, there is a big problem.

Using cmp for that purpose is a bad idea. The purpose of cmp in Python 2 is to sort things out. Not to compare.

comment:28 follow-up: Changed 6 weeks ago by rws

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in N(), compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.

The idea to abuse cmp is not mine. Somewhere here RLF(1) < RLF(sqrt(2)) for example, symbolic cmp is called. Should I file a bug report for such usage?

EDIT: typos

Last edited 6 weeks ago by rws (previous) (diff)

comment:29 in reply to: ↑ 28 Changed 6 weeks ago by mmezzarobba

Replying to rws:

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy

It is undecidable even without variables.

The idea to abuse cmp is not mine. Somewhere here RLF(1) < RLF(sqrt(2)) for example, symbolic cmp is called. Should I file a bug report for such usage?

I'd say it probably is a bug. As to whether you should file a bug report, I don't know—I doubt anyone really reads them, and the issue is probably already covered somewhere in the myriad of known bugs with comparisons...

comment:30 follow-up: Changed 6 weeks ago by vdelecroix

Replying to rws:

Testing equality of symbolics in general is undecidable. If you remove all expressions with variables however, it is easy: convert symbolic constants and function expressions to float as in N(), compare. Of course there are precision problems but that's nothing in comparison to what you get with variables.

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

comment:31 in reply to: ↑ 30 ; follow-up: Changed 6 weeks ago by rws

Replying to vdelecroix:

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

Then bool( == ) is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.

comment:32 Changed 6 weeks ago by git

  • Commit changed from 688ebfa077f09ee9a2bc988833ddd4f0fc65a4a1 to a5ef94a5b27b302d813ac83538558b39a39c7267

Branch pushed to git repo; I updated commit sha1. New commits:

a5ef94aTrac 12121: note about #19040 + extra if

comment:33 in reply to: ↑ 31 Changed 6 weeks ago by vdelecroix

Replying to rws:

Replying to vdelecroix:

What I need is a method that either returns a reliable answer True or False or raise an error which can either be This comparison is meaningless or I don't know how to compare this.

Then bool( == ) is right for the moment and may be replaced with #19040. As said don't be surprised if it takes a long time.

I added a note about #19040.

comment:34 Changed 6 weeks ago by vdelecroix

  • Description modified (diff)
  • Milestone changed from sage-7.1 to sage-7.2

comment:35 Changed 6 weeks ago by mmezzarobba

  • Reviewers set to Marc Mezzarobba
  • Status changed from needs_review to needs_work

Hi Vincent,

Sorry for my unclear comments above, I didn't look closely enough at your code before posting them.

Beside the issue with comparisons raised by Ralf, I find the code on your branch a bit complicated, and I don't like the fact that you drop the maximum_bits parameters at the risk of looping forever if you cannot prove that the input is an integer. I pushed to u/mmezzarobba/12121-ceil a rough attempt to fix these issues (in the case of ceil only at the moment, and not well tested yet), please tell me what you think of it.

comment:36 Changed 4 weeks ago by git

  • Commit changed from a5ef94a5b27b302d813ac83538558b39a39c7267 to 60f0c291adf28344156661ea7623c5721ed4a06b

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

60f0c29Trac 12121: Fix floor/ceil

comment:37 Changed 4 weeks ago by vdelecroix

  • Status changed from needs_work to needs_review

All right. I factorized the two implementations in a new function incremental_rounding. It is cleaner and the parameter maximum_bits is reintroduced.

comment:38 Changed 4 weeks ago by git

  • Commit changed from 60f0c291adf28344156661ea7623c5721ed4a06b to d53c406171d73a0138d25336732728b41409c8b9

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d53c406Trac 12121: Fix floor/ceil

comment:39 Changed 4 weeks ago by mmezzarobba

I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:

  • why do you make maximum_bits an Integer?
  • what don't you like about unique_integer()?
  • is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?
  • why do you insist on using == on symbolic expressions instead of is_trivial_zero()?
  • are you sure you want to raise an error when maximum_bits does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...
  • do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

comment:40 Changed 4 weeks ago by git

  • Commit changed from d53c406171d73a0138d25336732728b41409c8b9 to 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450

Branch pushed to git repo; I updated commit sha1. New commits:

0e9b2f2Trac 12121: remove __call__ and fix round

comment:41 follow-up: Changed 4 weeks ago by vdelecroix

I removed the __call__ in both Function_floor and Function_ceil. The code is now much simpler. Though there was some adaptation needed in symbolic/expression.pyx.

Replying to mmezzarobba:

I fear I won't have time to review your new version in the next few days at least, but from a quick look at it there are a lot of things I don't understand. In no particular order:

  • why do you make maximum_bits an Integer?

all right. int is fine as well.

  • what don't you like about unique_integer()?

an assert does not cost anything. And unique_integer silently fails if the interval does not enclose a unique integer.

  • is it really better to have an absolute bound for the diameter that makes us suspect we found an exact integer, rather than something that depends on the precision?

the precision of what? there is the field used for the evaluation which is different from the diameter of the interval. If you have more than one integer in your interval which one are you using to test equality?

  • why do you insist on using == on symbolic expressions instead of is_trivial_zero()?

Because I want to check equality with an integer. Not if it is a trivial equality.

  • are you sure you want to raise an error when maximum_bits does not suffice to conclude? this is a symbolic function that may be buried deep in the middle of a symbolic expression; returning unevaluated seems more reasonable to me...

Done with an example.

  • do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for round you need to test equality with elements of ZZ + 1/2.

comment:42 in reply to: ↑ 41 Changed 4 weeks ago by vdelecroix

Replying to vdelecroix:

Replying to mmezzarobba:

  • do we really need two loops that do essentially the same thing (including raising errors with the exact same message)?

The equality test is potentially costly. And we want to avoid it as much as possible. In particular, it makes no sense to test this equality within each step of the loop as it is in your version. On a related note, I noticed that for round you need to test equality with elements of ZZ + 1/2.

And I also would like to use the very same function incremental_rounding for elements of QQbar. For the very same reason, you only want very lately the equality test.

comment:43 Changed 4 weeks ago by git

  • Commit changed from 0e9b2f21d6a6bc42e35302316ac0f6b3b7451450 to 313c497daaec6324dfe4ffdeb684844e301a3f61

Branch pushed to git repo; I updated commit sha1. New commits:

313c497Trac 12121: fix doctests

comment:44 Changed 4 weeks ago by vdelecroix

  • Description modified (diff)

comment:45 Changed 7 days ago by vdelecroix

ping?!

comment:46 Changed 7 days ago by mmezzarobba

Sorry, I have about zero time for Sage development before at least 1-2 weeks. All I can say is that I wasn't completely convinced by your answers and would need to think things over more carefully. If someone wants to review the ticket in the meantime, please do.

Note: See TracTickets for help on using tickets.