Opened 4 years ago

Closed 3 years ago

#17808 closed enhancement (fixed)

Preparse old-style octals as strings

Reported by: jdemeyer Owned by:
Priority: major Milestone: sage-7.3
Component: python3 Keywords:
Cc: Merged in:
Authors: Jeroen Demeyer Reviewers: Frédéric Chapoton
Report Upstream: N/A Work issues:
Branch: 52fd0ce (Commits) Commit: 52fd0ce06e69b95582edf1a1401159cb44c96430
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

To solve #17807, we preparse 0100 as Integer('0100') instead of Integer(0100): thanks to #17413, this will give a deprecation warning so that users should know something funny is going on:

sage: 0100
/usr/local/src/sage-git/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py:2883: DeprecationWarning: use 0o as octal prefix instead of 0
If you do not want this number to be interpreted as octal, remove the leading zeros.
See http://trac.sagemath.org/17413 for details.
  exec(code_obj, self.user_global_ns, self.user_ns)
64

We also preparse new-style binary/octal/hex strings as integers instead of strings, which should be faster, see 5.

Change History (22)

comment:1 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:2 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:3 Changed 4 years ago by jdemeyer

  • Branch set to u/jdemeyer/ticket/17808
  • Created changed from 02/19/15 08:31:31 to 02/19/15 08:31:31
  • Modified changed from 02/19/15 08:36:04 to 02/19/15 08:36:04

comment:4 Changed 4 years ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Commit set to 533af4ce9ae65f71d38177052fccb28cc69234e7
  • Status changed from new to needs_review

New commits:

533af4cPreparse integers as strings

comment:5 follow-up: Changed 4 years ago by nbruin

I would think the slow-down for small integers makes the proposed changes very unattractive. The whole idea of preparsing 100 as Integer(100) is that in the AST and the generated byte code, the constant 100 is already stored as a numerical one, and hence conversion is much faster.

For instance, in code like:

sage: %timeit L=[Integer('1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000') for i in range(10000r)]
100 loops, best of 3: 11.4 ms per loop
sage: %timeit L=[Integer(1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000) for i in range(10000r)]
100 loops, best of 3: 4.66 ms per loop

you really start to notice this.

preparse_file mitigates issues like this a little bit by factoring out numeric constants, so that their conversion doesn't happen in inner loops.

If you want to store your numerical constants as strings in the byte code and have fast conversion, you should probably store them in hex. It's a little more work in the preparser, but the result is more compact and conversion to a numerical constant is truly linear in number of bits.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 4 years ago by jdemeyer

Replying to nbruin:

The whole idea of preparsing 100 as Integer(100) is that in the AST and the generated byte code, the constant 100 is already stored as a numerical one

Is that really important? The "timeit" example is of course a bit artificial.

comment:7 in reply to: ↑ 6 Changed 4 years ago by nbruin

Replying to jdemeyer:

Replying to nbruin:

The whole idea of preparsing 100 as Integer(100) is that in the AST and the generated byte code, the constant 100 is already stored as a numerical one

Is that really important? The "timeit" example is of course a bit artificial.

I think so. Storing the numerical constant as a string forces the decimal-to-binary conversion to happen at runtime. If you just write it as a python integer, the conversion happens at parsing, so by the time it's an ast and the compiler looks at it, you're already dealing with an integer.

Unfortunately, python doesn't permit compile-time macros (perhaps if we're ever going to rewrite our preparser to properly parse, we can combine it with MacroPy and then we can have Integer object creation at compile time)

I am fairly certain that nearly all code that gets run with sage has TONS of small integer literals in it. Probably for many applications, people would get better performance writing 100r, but of course nobody will. This ticket would deteriorate the much more common situation in favour of a small gain in the extremely rare situation that someone wants to include an incredibly large integer literal in their code (talk about artificial--I don't think having a (small) numerical literal in an inner loop is artificial at all!). And the extremely rare situation can be solved by writing quotes already (and by the time you're writing such a long string you might as well convert it to hex anyway, in which case Python and GMP are (at least asymptotically) comparable.

If you use preparse_file instead of preparse, you'll see that numerical constants get pushed outside (which has its own set of problems, see #11542), which mitigates the string-conversion-in-inner-loop.

comment:8 Changed 4 years ago by jdemeyer

  • Status changed from needs_review to needs_work

Fine, you convinced me.

comment:9 Changed 4 years ago by jdemeyer

  • Description modified (diff)
  • Summary changed from Preparse integers as strings to Preparse old-style octals as strings

comment:10 Changed 4 years ago by jdemeyer

  • Description modified (diff)

comment:11 Changed 4 years ago by git

  • Commit changed from 533af4ce9ae65f71d38177052fccb28cc69234e7 to f72c449a23193744f4b9485a57b550fab2201b4f

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

f72c449Preparse old-style octal numbers as strings

comment:12 Changed 4 years ago by jdemeyer

  • Status changed from needs_work to needs_review

comment:13 Changed 4 years ago by chapoton

  • Branch changed from u/jdemeyer/ticket/17808 to public/ticket/17808
  • Commit changed from f72c449a23193744f4b9485a57b550fab2201b4f to b469c4c747eefc73af5ac43818d933b7c28fe889
  • Milestone changed from sage-6.6 to sage-6.7

New commits:

293c474Merge branch 'u/jdemeyer/ticket/17808' into 6.7.b1
b469c4ctrac #17808 taking care of pt tutorial

comment:14 Changed 3 years ago by jdemeyer

  • Component changed from misc to python3

comment:15 Changed 3 years ago by git

  • Commit changed from b469c4c747eefc73af5ac43818d933b7c28fe889 to 52fd0ce06e69b95582edf1a1401159cb44c96430

Branch pushed to git repo; I updated commit sha1. New commits:

b243aa3Merge branch 'public/ticket/17808' into 7.2.b6
52fd0cetrac #17808 handling es and ja tutorial

comment:16 Changed 3 years ago by chapoton

  • Milestone changed from sage-6.7 to sage-7.2

comment:17 follow-up: Changed 3 years ago by chapoton

  • Milestone changed from sage-7.2 to sage-7.3

is the title still acurately describing the content of the ticket ?

comment:18 in reply to: ↑ 17 Changed 3 years ago by jdemeyer

Replying to chapoton:

is the title still acurately describing the content of the ticket ?

Yes.

comment:19 Changed 3 years ago by jdemeyer

  • Description modified (diff)

Although the ticket does change a little bit more than that, I added a sentence to the description.

comment:20 Changed 3 years ago by jdemeyer

  • Description modified (diff)

comment:21 Changed 3 years ago by chapoton

  • Reviewers set to Frédéric Chapoton
  • Status changed from needs_review to positive_review

ok, looks good to me

comment:22 Changed 3 years ago by vbraun

  • Branch changed from public/ticket/17808 to 52fd0ce06e69b95582edf1a1401159cb44c96430
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.