Opened 11 years ago
Closed 10 years ago
#7496 closed defect (fixed)
symbolic variable names should be valid identifiers, or ridiculousness follows
Reported by: | was | Owned by: | burcin |
---|---|---|---|
Priority: | major | Milestone: | sage-4.7.2 |
Component: | symbolics | Keywords: | sd31 |
Cc: | Merged in: | sage-4.7.2.alpha0 | |
Authors: | Volker Braun, Karl-Dieter Crisman | Reviewers: | Karl-Dieter Crisman, Volker Braun |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
WTF?
sage: var('1') 1 sage: var('1')+1 1 + 1 sage: expand((var('2')+var('2'))^2) 4*2^2
When this ticket is closed, #9724 can also be closed.
Apply trac_7496_check_symbolic_variable_names.patch and trac_7496-reviewer.2.patch.
Attachments (3)
Change History (20)
comment:1 Changed 11 years ago by
- Report Upstream set to N/A
comment:2 Changed 10 years ago by
As long as we don't have Python 3, I would try to find a regular expression that does what we need. It is of course easy to write a regular expression that works for ascii strings. But as soon as 'ä' or other language-specific letters are supposed to be considered a variable name, things will become difficult.
I tried
sage: import re sage: _identifiers = re.compile("(?!\d)\w*\Z", re.LOCALE|re.UNICODE)
It works for simple cases:
sage: _identifiers.match('k_1') <_sre.SRE_Match object at 0x50d1030> sage: _identifiers.match('1k') sage: _identifiers.match('_1k') <_sre.SRE_Match object at 0x50d1238>
But it ignores other letters:
sage: print _identifiers.match('ä') None
Perhaps I have misunderstood the effect of re.LOCALE
? What value should re.LOCALE
have in order to work with accented letters?
comment:3 Changed 10 years ago by
Whatever we do, let's be sure to avoid locality dependance on what a valid symbol is. Let's look into back-porting whatever Python3 does, and then falling back to that once we make the transition.
comment:4 Changed 10 years ago by
- Keywords sd31 added
- Status changed from new to needs_review
comment:5 Changed 10 years ago by
- Description modified (diff)
comment:6 Changed 10 years ago by
- Reviewers set to Karl-Dieter Crisman
This code should be correct - nice to learn something about the parser and bytecode. Is this how it's implemented in Python 3 (presumably that's written in C, though)? I couldn't find it, anyway.
This needs doctests just to document that var('3')
and var(' ')
(see #9724) fail, though of course the tests already here document this indirectly. I'll post a reviewer patch momentarily, assuming I didn't miss something and the patch doesn't actually work.
comment:7 Changed 10 years ago by
- Status changed from needs_review to needs_info
Ok, I think this is still ok, though I am a little concerned about both of the following being bad:
sage: var(' x') (, x)
not good because an empty string is a variable
sage: var(' x') --------------------------------------------------------------------------- ValueError: The name "" is not a valid Python identifier.
not good because the intent is clear to make precisely x the variable.
So is this breaking incorrect but usable behavior?
sage: var("x y z") (x, y, , z) sage:
is similar.
Anyway, I withhold judgment on this. Reviewer patch attached, but 'needs info' on this. At the least it seems reasonable to open a new ticket to allow the above behavior - one could easily remove empty strings from the list names_list
and then complain if there are none left, for instance.
comment:8 Changed 10 years ago by
- Description modified (diff)
comment:9 Changed 10 years ago by
- Reviewers changed from Karl-Dieter Crisman to Karl-Dieter Crisman, Volker Braun
- Status changed from needs_info to needs_review
Updated patch splits with any white space and not only on single space:
sage: var(' x y z ') (x, y, z) sage: var(' x , y , z ') (x, y, z)
I'm giving the reviewer patch a positive review.
comment:10 Changed 10 years ago by
- Status changed from needs_review to needs_work
Thanks. So far it looks good.
But with the original version of the patch (and almost certainly the new one) we get the following doctest error (was letting them run overnight, my apologies):
sage -t -long "devel/sage/sage/calculus/desolvers.py" ********************************************************************** File "/Users/.../sage-4.7.1.alpha2/devel/sage/sage/calculus/desolvers.py", line 1430: sage: sol=desolve_odeint(f,[0.5,2],srange(0,10,0.1),[x,y]) Exception raised: sol=desolve_odeint(f,[RealNumber('0.5'),Integer(2)],srange(Integer(0),Integer(10),RealNumber('0.1')),[x,y])###line 1430: sage: sol=desolve_odeint(f,[0.5,2],srange(0,10,0.1),[x,y]) File "/Users/.../sage-4.7.1.alpha2/local/lib/python/site-packages/sage/calculus/desolvers.py", line 1501, in desolve_odeint ivar = var(safe_name) File "ring.pyx", line 798, in sage.symbolic.ring.var (sage/symbolic/ring.cpp:7745) File "ring.pyx", line 506, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6272) File "ring.pyx", line 534, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6044) ValueError: The name "t_[x" is not a valid Python identifier. ********************************************************************** <snip two failures that result from sol not being defined> ********************************************************************** File "/Users/.../sage-4.7.1.alpha2/devel/sage/sage/calculus/desolvers.py", line 1446: sage: sol=desolve_odeint(lorenz,ics,times,[x,y,z],rtol=1e-13,atol=1e-14) Exception raised: File "<doctest __main__.example_12[15]>", line 1, in <module> sol=desolve_odeint(lorenz,ics,times,[x,y,z],rtol=RealNumber('1e-13'),atol=RealNumber('1e-14'))###line 1446: sage: sol=desolve_odeint(lorenz,ics,times,[x,y,z],rtol=1e-13,atol=1e-14) File "/Users/.../sage-4.7.1.alpha2/local/lib/python/site-packages/sage/calculus/desolvers.py", line 1501, in desolve_odeint ivar = var(safe_name) File "ring.pyx", line 798, in sage.symbolic.ring.var (sage/symbolic/ring.cpp:7745) File "ring.pyx", line 506, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6272) File "ring.pyx", line 534, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6044) ValueError: The name "t_[x" is not a valid Python identifier. ********************************************************************** File "/Users/.../sage-4.7.1.alpha2/devel/sage/sage/calculus/desolvers.py", line 1470: sage: sol=desolve_odeint(f,ci,t,v,rtol=1e-3,atol=1e-4,h0=0.1,hmax=1,hmin=1e-4,mxstep=1000,mxords=17) Exception raised: File "<doctest __main__.example_12[32]>", line 1, in <module> sol=desolve_odeint(f,ci,t,v,rtol=RealNumber('1e-3'),atol=RealNumber('1e-4'),h0=RealNumber('0.1'),hmax=Integer(1),hmin=RealNumber('1e-4'),mxstep=Integer(1000),mxords=Integer(17))###line 1470: sage: sol=desolve_odeint(f,ci,t,v,rtol=1e-3,atol=1e-4,h0=0.1,hmax=1,hmin=1e-4,mxstep=1000,mxords=17) File "/Users/.../sage-4.7.1.alpha2/local/lib/python/site-packages/sage/calculus/desolvers.py", line 1501, in desolve_odeint ivar = var(safe_name) File "ring.pyx", line 798, in sage.symbolic.ring.var (sage/symbolic/ring.cpp:7745) File "ring.pyx", line 506, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6272) File "ring.pyx", line 534, in sage.symbolic.ring.SymbolicRing.var (sage/symbolic/ring.cpp:6044) ValueError: The name "t_[y1" is not a valid Python identifier. **********************************************************************
comment:11 follow-up: ↓ 12 Changed 10 years ago by
- Status changed from needs_work to needs_review
Somehow I'm not surprised that somebody used invalid identifiers as variable names ;-)
Updated patch follows.
comment:12 in reply to: ↑ 11 Changed 10 years ago by
- Status changed from needs_review to needs_work
Replying to vbraun:
Somehow I'm not surprised that somebody used invalid identifiers as variable names ;-)
So this actually found a bug of sorts - nice. Though I have to say that trying to go between the Maxima way of all variables automatic and the Sage way of (nearly) none gave the folks who put that file together quite a challenge, so I don't blame them too much. But the code is not so elegant, in retrospect.
What's particularly bizarre about this is that in the old and the new cases, fast_float
is being passed something with three arguments, so it still behaves properly - even though the third thing is an iterable in both cases! Probably the "right" fix is to just do 't_'+str(dvar)
for the first dvar only; fast_float just needs something different from the dependent variables. But I think that would be another ticket, as this doesn't break anything (and the plots still look the same).
Anyway, it fixes the test, doesn't seem to introduce any new bugs.
However, the update to allowing whitespace probably still needs work. Maybe we need to special-case the situation where there is only whitespace in the string passed to var
:
sage: var(' ') sage: a = var(' ') sage: a sage: type(a) <type 'sage.symbolic.expression.Expression'>
Sorry :(
comment:13 Changed 10 years ago by
- Status changed from needs_work to needs_review
Oh yes good catch, the elements of the empty set (no variable after stripping whitespace) satisfy every property!
Fixed in the updated patch.
comment:14 Changed 10 years ago by
- Description modified (diff)
The error message for the "empty variable" case changed in my patch, I just updated the reviewer patch to take that into account.
comment:15 Changed 10 years ago by
- Status changed from needs_review to positive_review
Okay, I think that this is finally in shape. Thanks for the back and forth on this.
comment:16 Changed 10 years ago by
- Milestone changed from sage-4.7.1 to sage-4.7.2
comment:17 Changed 10 years ago by
- Merged in set to sage-4.7.2.alpha0
- Resolution set to fixed
- Status changed from positive_review to closed
This would be really easy to implement once we have python 3. Apparently, valid identifiers got extended:
http://www.python.org/dev/peps/pep-3131/
String literals also got an
isidentifier()
method.I don't know how to do this efficiently, while still allowing people to just use α (alpha) as a variable name.