Opened 19 months ago
Closed 15 months ago
#29278 closed enhancement (fixed)
Callable symbolic expressions should be allowed to have unicode identifiers
Reported by:  rburing  Owned by:  

Priority:  major  Milestone:  sage9.2 
Component:  symbolics  Keywords:  callable symbolic expression, unicode, preparse, preparser, preparse_calculus 
Cc:  tscrim  Merged in:  
Authors:  Markus Wageringel  Reviewers:  Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  d4baa72 (Commits, GitHub, GitLab)  Commit:  d4baa72231ee7d4f1e2cfacda1628e47fdba578f 
Dependencies:  #29391  Stopgaps: 
Description
sage: μ(x) = x^2 File "<ipythoninput2093337e923470>", line 1 μ(x) = x**Integer(2) ^ SyntaxError: can't assign to function call
This is a bug in the preparser. Running
from sage.repl.preparse import preparse_calculus preparse_calculus??
shows that in order to find the identifier it uses a regex with the range [azAZ_]
, which is oldfashioned since 9.0.
Note that there is no LaTeX issue here because callable symbolic expressions don't know their own names; the name is only used for the Python identifier.
Change History (13)
comment:1 Changed 19 months ago by
 Type changed from defect to enhancement
comment:2 Changed 19 months ago by
Are you working on this? I have been looking into this a little bit and could create a branch.
IMO, this is very much a bug. The preparser should have nothing to do with other interfaces. If a particular interface does not support unicode, it needs to be caught in that interface, not in the preparser.
Note also that the following works just fine:
sage: aμ(x) = x^2
The problem only exists if the first character is not ASCII.
comment:3 Changed 19 months ago by
 Branch set to u/ghmwageringel/29278
 Commit set to e5c421ced57321338409e8b184af9089a1c69e58
 Status changed from new to needs_review
I have updated all similar regular expressions in the preparser.
This also passes the tests with Python 2.
New commits:
e5c421c  29278: make preparser unicodecompatible

comment:4 followup: ↓ 5 Changed 19 months ago by
Thanks for working on this! Some of the original regular expressions purposely do not include digits, because Python identifiers should not start with digits. This still needs to be taken into account (and perhaps tested).
comment:5 in reply to: ↑ 4 Changed 19 months ago by
Replying to rburing:
Some of the original regular expressions purposely do not include digits, because Python identifiers should not start with digits. This still needs to be taken into account (and perhaps tested).
In place of [azAZ_]
, I used [^\W\d]
to account for that, which excludes numbers and nonalphanumeric characters, so effectively only letters and underscores remain (for reference: the re docs).
comment:6 Changed 19 months ago by
 Commit changed from e5c421ced57321338409e8b184af9089a1c69e58 to 0de5617789df073b053727cb91ee6aef86b06fc2
Branch pushed to git repo; I updated commit sha1. New commits:
0de5617  29278: fix escape sequence

comment:7 Changed 19 months ago by
 Commit changed from 0de5617789df073b053727cb91ee6aef86b06fc2 to bcfe9d4262b07adea89e35d7cc037c9a5af7ce05
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
bcfe9d4  29278: fix escape sequence

comment:8 Changed 18 months ago by
 Commit changed from bcfe9d4262b07adea89e35d7cc037c9a5af7ce05 to d4baa72231ee7d4f1e2cfacda1628e47fdba578f
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
c6ce67a  29391: update keywords in preparser for Python 3

d7dade6  29391: avoid implicit multiplication with print/exec

f0d5691  29391: document level in implicit_multiplication

294ec02  29391: reviewer patch

d4baa72  29278: make preparser unicodecompatible

comment:9 Changed 18 months ago by
 Dependencies set to #29391
Rebased on top of #29391.
By the way, IPython has accepted a patch for a very similar problem: https://github.com/ipython/ipython/pull/12208/files
comment:10 Changed 18 months ago by
 Milestone changed from sage9.1 to sage9.2
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
comment:11 Changed 15 months ago by
 Cc tscrim added
comment:12 Changed 15 months ago by
 Reviewers set to Travis Scrimshaw
 Status changed from needs_review to positive_review
LGTM.
comment:13 Changed 15 months ago by
 Branch changed from u/ghmwageringel/29278 to d4baa72231ee7d4f1e2cfacda1628e47fdba578f
 Resolution set to fixed
 Status changed from positive_review to closed
It's not a really a bug. Perhaps it's not properly documented.
Identifier names that make it into SR should be very conservative, because they probably get shipped to all kinds of other interfaces  by name! As pointed out in the ticket, the name used in the definition of a callable symbolic expression ends up only being used as a python identifier, so perhaps it's fine to relax the relevant regular expression.