Opened 8 years ago
Closed 7 years ago
#17808 closed enhancement (fixed)
Preparse oldstyle octals as strings
Reported by:  Jeroen Demeyer  Owned by:  

Priority:  major  Milestone:  sage7.3 
Component:  python3  Keywords:  
Cc:  Merged in:  
Authors:  Jeroen Demeyer  Reviewers:  Frédéric Chapoton 
Report Upstream:  N/A  Work issues:  
Branch:  52fd0ce (Commits, GitHub, GitLab)  Commit:  52fd0ce06e69b95582edf1a1401159cb44c96430 
Dependencies:  Stopgaps: 
Description (last modified by )
To solve #17807, we preparse 0100
as Integer('0100')
instead of Integer(0100)
: thanks to #17413, this will give a deprecation warning so that users should know something funny is going on:
sage: 0100 /usr/local/src/sagegit/local/lib/python2.7/sitepackages/IPython/core/interactiveshell.py:2883: DeprecationWarning: use 0o as octal prefix instead of 0 If you do not want this number to be interpreted as octal, remove the leading zeros. See http://trac.sagemath.org/17413 for details. exec(code_obj, self.user_global_ns, self.user_ns) 64
We also preparse newstyle binary/octal/hex strings as integers instead of strings, which should be faster, see 5.
Change History (22)
comment:1 Changed 8 years ago by
Description:  modified (diff) 

comment:2 Changed 8 years ago by
Description:  modified (diff) 

comment:3 Changed 8 years ago by
Branch:  → u/jdemeyer/ticket/17808 

Created:  Feb 19, 2015, 8:31:31 AM → Feb 19, 2015, 8:31:31 AM 
Modified:  Feb 19, 2015, 8:36:04 AM → Feb 19, 2015, 8:36:04 AM 
comment:4 Changed 8 years ago by
Authors:  → Jeroen Demeyer 

Commit:  → 533af4ce9ae65f71d38177052fccb28cc69234e7 
Status:  new → needs_review 
comment:5 followup: 6 Changed 8 years ago by
I would think the slowdown for small integers makes the proposed changes very unattractive. The whole idea of preparsing 100
as Integer(100)
is that in the AST and the generated byte code, the constant 100
is already stored as a numerical one, and hence conversion is much faster.
For instance, in code like:
sage: %timeit L=[Integer('1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000') for i in range(10000r)] 100 loops, best of 3: 11.4 ms per loop sage: %timeit L=[Integer(1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000) for i in range(10000r)] 100 loops, best of 3: 4.66 ms per loop
you really start to notice this.
preparse_file mitigates issues like this a little bit by factoring out numeric constants, so that their conversion doesn't happen in inner loops.
If you want to store your numerical constants as strings in the byte code and have fast conversion, you should probably store them in hex. It's a little more work in the preparser, but the result is more compact and conversion to a numerical constant is truly linear in number of bits.
comment:6 followup: 7 Changed 8 years ago by
Replying to nbruin:
The whole idea of preparsing
100
asInteger(100)
is that in the AST and the generated byte code, the constant100
is already stored as a numerical one
Is that really important? The "timeit" example is of course a bit artificial.
comment:7 Changed 8 years ago by
Replying to jdemeyer:
Replying to nbruin:
The whole idea of preparsing
100
asInteger(100)
is that in the AST and the generated byte code, the constant100
is already stored as a numerical oneIs that really important? The "timeit" example is of course a bit artificial.
I think so. Storing the numerical constant as a string forces the decimaltobinary conversion to happen at runtime. If you just write it as a python integer, the conversion happens at parsing, so by the time it's an ast and the compiler looks at it, you're already dealing with an integer.
Unfortunately, python doesn't permit compiletime macros (perhaps if we're ever going to rewrite our preparser to properly parse, we can combine it with MacroPy and then we can have Integer
object creation at compile time)
I am fairly certain that nearly all code that gets run with sage has TONS of small integer literals in it. Probably for many applications, people would get better performance writing 100r
, but of course nobody will. This ticket would deteriorate the much more common situation in favour of a small gain in the extremely rare situation that someone wants to include an incredibly large integer literal in their code (talk about artificialI don't think having a (small) numerical literal in an inner loop is artificial at all!). And the extremely rare situation can be solved by writing quotes already (and by the time you're writing such a long string you might as well convert it to hex anyway, in which case Python and GMP are (at least asymptotically) comparable.
If you use preparse_file
instead of preparse
, you'll see that numerical constants get pushed outside (which has its own set of problems, see #11542), which mitigates the stringconversionininnerloop.
comment:9 Changed 8 years ago by
Description:  modified (diff) 

Summary:  Preparse integers as strings → Preparse oldstyle octals as strings 
comment:10 Changed 8 years ago by
Description:  modified (diff) 

comment:11 Changed 8 years ago by
Commit:  533af4ce9ae65f71d38177052fccb28cc69234e7 → f72c449a23193744f4b9485a57b550fab2201b4f 

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
f72c449  Preparse oldstyle octal numbers as strings

comment:12 Changed 8 years ago by
Status:  needs_work → needs_review 

comment:13 Changed 8 years ago by
Branch:  u/jdemeyer/ticket/17808 → public/ticket/17808 

Commit:  f72c449a23193744f4b9485a57b550fab2201b4f → b469c4c747eefc73af5ac43818d933b7c28fe889 
Milestone:  sage6.6 → sage6.7 
comment:14 Changed 7 years ago by
Component:  misc → python3 

comment:15 Changed 7 years ago by
Commit:  b469c4c747eefc73af5ac43818d933b7c28fe889 → 52fd0ce06e69b95582edf1a1401159cb44c96430 

comment:16 Changed 7 years ago by
Milestone:  sage6.7 → sage7.2 

comment:17 followup: 18 Changed 7 years ago by
Milestone:  sage7.2 → sage7.3 

is the title still acurately describing the content of the ticket ?
comment:18 Changed 7 years ago by
comment:19 Changed 7 years ago by
Description:  modified (diff) 

Although the ticket does change a little bit more than that, I added a sentence to the description.
comment:20 Changed 7 years ago by
Description:  modified (diff) 

comment:21 Changed 7 years ago by
Reviewers:  → Frédéric Chapoton 

Status:  needs_review → positive_review 
ok, looks good to me
comment:22 Changed 7 years ago by
Branch:  public/ticket/17808 → 52fd0ce06e69b95582edf1a1401159cb44c96430 

Resolution:  → fixed 
Status:  positive_review → closed 
New commits:
Preparse integers as strings