Opened 5 years ago
Closed 3 years ago
#17808 closed enhancement (fixed)
Preparse oldstyle octals as strings
Reported by:  jdemeyer  Owned by:  

Priority:  major  Milestone:  sage7.3 
Component:  python3  Keywords:  
Cc:  Merged in:  
Authors:  Jeroen Demeyer  Reviewers:  Frédéric Chapoton 
Report Upstream:  N/A  Work issues:  
Branch:  52fd0ce (Commits)  Commit:  52fd0ce06e69b95582edf1a1401159cb44c96430 
Dependencies:  Stopgaps: 
Description (last modified by )
To solve #17807, we preparse 0100
as Integer('0100')
instead of Integer(0100)
: thanks to #17413, this will give a deprecation warning so that users should know something funny is going on:
sage: 0100 /usr/local/src/sagegit/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.py:2883: DeprecationWarning: use 0o as octal prefix instead of 0 If you do not want this number to be interpreted as octal, remove the leading zeros. See http://trac.sagemath.org/17413 for details. exec(code_obj, self.user_global_ns, self.user_ns) 64
We also preparse newstyle binary/octal/hex strings as integers instead of strings, which should be faster, see 5.
Change History (22)
comment:1 Changed 5 years ago by
 Description modified (diff)
comment:2 Changed 5 years ago by
 Description modified (diff)
comment:3 Changed 5 years ago by
 Branch set to u/jdemeyer/ticket/17808
 Created changed from 02/19/15 08:31:31 to 02/19/15 08:31:31
 Modified changed from 02/19/15 08:36:04 to 02/19/15 08:36:04
comment:4 Changed 5 years ago by
 Commit set to 533af4ce9ae65f71d38177052fccb28cc69234e7
 Status changed from new to needs_review
comment:5 followup: ↓ 6 Changed 5 years ago by
I would think the slowdown for small integers makes the proposed changes very unattractive. The whole idea of preparsing 100
as Integer(100)
is that in the AST and the generated byte code, the constant 100
is already stored as a numerical one, and hence conversion is much faster.
For instance, in code like:
sage: %timeit L=[Integer('1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000') for i in range(10000r)] 100 loops, best of 3: 11.4 ms per loop sage: %timeit L=[Integer(1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000) for i in range(10000r)] 100 loops, best of 3: 4.66 ms per loop
you really start to notice this.
preparse_file mitigates issues like this a little bit by factoring out numeric constants, so that their conversion doesn't happen in inner loops.
If you want to store your numerical constants as strings in the byte code and have fast conversion, you should probably store them in hex. It's a little more work in the preparser, but the result is more compact and conversion to a numerical constant is truly linear in number of bits.
comment:6 in reply to: ↑ 5 ; followup: ↓ 7 Changed 5 years ago by
Replying to nbruin:
The whole idea of preparsing
100
asInteger(100)
is that in the AST and the generated byte code, the constant100
is already stored as a numerical one
Is that really important? The "timeit" example is of course a bit artificial.
comment:7 in reply to: ↑ 6 Changed 5 years ago by
Replying to jdemeyer:
Replying to nbruin:
The whole idea of preparsing
100
asInteger(100)
is that in the AST and the generated byte code, the constant100
is already stored as a numerical oneIs that really important? The "timeit" example is of course a bit artificial.
I think so. Storing the numerical constant as a string forces the decimaltobinary conversion to happen at runtime. If you just write it as a python integer, the conversion happens at parsing, so by the time it's an ast and the compiler looks at it, you're already dealing with an integer.
Unfortunately, python doesn't permit compiletime macros (perhaps if we're ever going to rewrite our preparser to properly parse, we can combine it with MacroPy and then we can have Integer
object creation at compile time)
I am fairly certain that nearly all code that gets run with sage has TONS of small integer literals in it. Probably for many applications, people would get better performance writing 100r
, but of course nobody will. This ticket would deteriorate the much more common situation in favour of a small gain in the extremely rare situation that someone wants to include an incredibly large integer literal in their code (talk about artificialI don't think having a (small) numerical literal in an inner loop is artificial at all!). And the extremely rare situation can be solved by writing quotes already (and by the time you're writing such a long string you might as well convert it to hex anyway, in which case Python and GMP are (at least asymptotically) comparable.
If you use preparse_file
instead of preparse
, you'll see that numerical constants get pushed outside (which has its own set of problems, see #11542), which mitigates the stringconversionininnerloop.
comment:8 Changed 5 years ago by
 Status changed from needs_review to needs_work
Fine, you convinced me.
comment:9 Changed 5 years ago by
 Description modified (diff)
 Summary changed from Preparse integers as strings to Preparse oldstyle octals as strings
comment:10 Changed 5 years ago by
 Description modified (diff)
comment:11 Changed 5 years ago by
 Commit changed from 533af4ce9ae65f71d38177052fccb28cc69234e7 to f72c449a23193744f4b9485a57b550fab2201b4f
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
f72c449  Preparse oldstyle octal numbers as strings

comment:12 Changed 5 years ago by
 Status changed from needs_work to needs_review
comment:13 Changed 5 years ago by
 Branch changed from u/jdemeyer/ticket/17808 to public/ticket/17808
 Commit changed from f72c449a23193744f4b9485a57b550fab2201b4f to b469c4c747eefc73af5ac43818d933b7c28fe889
 Milestone changed from sage6.6 to sage6.7
comment:14 Changed 3 years ago by
 Component changed from misc to python3
comment:15 Changed 3 years ago by
 Commit changed from b469c4c747eefc73af5ac43818d933b7c28fe889 to 52fd0ce06e69b95582edf1a1401159cb44c96430
comment:16 Changed 3 years ago by
 Milestone changed from sage6.7 to sage7.2
comment:17 followup: ↓ 18 Changed 3 years ago by
 Milestone changed from sage7.2 to sage7.3
is the title still acurately describing the content of the ticket ?
comment:18 in reply to: ↑ 17 Changed 3 years ago by
comment:19 Changed 3 years ago by
 Description modified (diff)
Although the ticket does change a little bit more than that, I added a sentence to the description.
comment:20 Changed 3 years ago by
 Description modified (diff)
comment:21 Changed 3 years ago by
 Reviewers set to Frédéric Chapoton
 Status changed from needs_review to positive_review
ok, looks good to me
comment:22 Changed 3 years ago by
 Branch changed from public/ticket/17808 to 52fd0ce06e69b95582edf1a1401159cb44c96430
 Resolution set to fixed
 Status changed from positive_review to closed
New commits:
Preparse integers as strings