#30378 closed defect (fixed)
(x^2).subs({x: sqrt(x)}) returns sqrt(x) instead of x
Reported by:  egourgoulhon  Owned by:  

Priority:  blocker  Milestone:  sage9.3 
Component:  symbolics  Keywords:  sqrt, substitution, pynac 
Cc:  nbruin, rws, kcrisman, charpent  Merged in:  
Authors:  Dave Morris  Reviewers:  Matthias Koeppe 
Report Upstream:  N/A  Work issues:  
Branch:  8d4db46 (Commits, GitHub, GitLab)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
As discussed in https://groups.google.com/d/msg/sagedevel/kYZ4N73oiCM/iDyQLWyBAAJ, we have currently
sage: (x^2).subs({x: sqrt(x)}) sqrt(x)
A consequence of this bug is
sage: f(x) = x^2 sage: f(sqrt(x)) sqrt(x)
Note that the following outputs are correct:
sage: (x^2).subs({x: sqrt(pi)}) pi sage: (x^2).subs({x: x^(1/3)}) x^(2/3) sage: y = var('y') sage: (x^2).subs({x: sqrt(y)}) y
Moreover, distinct powers of x
seem OK:
sage: (x^3).subs({x: sqrt(x)}) x^(3/2) sage: (x^(2)).subs({x: sqrt(x)}) 1/x
But a match is problematic:
sage: (x^3).subs({x:x^(1/3)}) x^(1/3)
Change History (50)
comment:1 Changed 8 months ago by
 Cc nbruin rws added
comment:2 Changed 8 months ago by
 Priority changed from major to critical
comment:3 Changed 8 months ago by
 Description modified (diff)
comment:4 Changed 8 months ago by
 Description modified (diff)
comment:5 Changed 8 months ago by
 Cc kcrisman added
comment:6 Changed 8 months ago by
comment:7 followup: ↓ 8 Changed 8 months ago by
I am not sure how to fix this either  presumably it was precisely for some sort of wildcard situation that this was in there  but I do note the following in expression.pyx
:
sage: t = a^2 + b^2 + (x+y)^3 ... sage: t.subs({w0^2: w0^3}) a^3 + b^3 + (x + y)^3 Substitute with one or more relational expressions:: sage: t.subs(w0^2 == w0^3) a^3 + b^3 + (x + y)^3
so far, so good
sage: t.subs(w0 == w0^2) a^8 + b^8 + (x^2 + y^2)^6
Wait, what? Shouldn't these be fourth powers? That x
and y
get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the {w0^2: w0^3}
case, where the cube was not replaced by 1.5 squares  which is presumably good.)
Note that the corresponding Ginac code always returns the subs_one_level
business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac  it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.
comment:8 in reply to: ↑ 7 Changed 8 months ago by
Replying to kcrisman:
so far, so good
sage: t.subs(w0 == w0^2) a^8 + b^8 + (x^2 + y^2)^6Wait, what? Shouldn't these be fourth powers? That
x
andy
get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the{w0^2: w0^3}
case, where the cube was not replaced by 1.5 squares  which is presumably good.)
I can only make sense of the eighth powers in a setting where the substitution acts on a power like a^2
in such a way that both the base and the exponent get squared. Whether or not this is the intended behavior for a general wildcard substitution like this seems debateable; one could argue that it is actually inconsistent as the exponent of the parenthesis is not treated in the same way.
Unfortunately, make ptestlong
is currently broken for my system, but as far as I can tell from manual testing, there are no other obvious doctest errors other than this wildcard substitution one.
Note that the corresponding Ginac code always returns the
subs_one_level
business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac  it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.
Interesting! I only saw that something concerning subs
was mentioned in #24262, and that the particular segment around L747 has been introduced in the upgrade on that ticket.
comment:9 Changed 8 months ago by
In the w0 == w0^2
example, I suspect that w0 matches a, so a gets squared, but w0 also matches the entire term a^2
, so it gets squared again: ((a^2)^2)^2 = a^8
. See, for example:
sage: t.subs(w0 == e^w0) e^((e^x + e^y)^3) + e^(e^(2*a)) + e^(e^(2*b)) sage: (cos(x)).subs(w0 == e^w0) e^(cos(e^x))
comment:10 Changed 8 months ago by
In the case of substitution where all keys are variables rather than general expressions, shouldn't we be able to write robust code that does not rely on pattern matching?
comment:11 Changed 8 months ago by
Presumably, but just keep in mind the "not reinventing the wheel" guideline. w0
in this example is an intentional wildcard. I'm still not 100% sure what is going on with the simpler case that originated the ticket  again, note the default Ginac behavior.
comment:12 Changed 8 months ago by
Several things to note here:
 pynac has flags that are supposed to disable pattern matching (subs_options::no_pattern, https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/flags.h#L61)
 but
power::subs
does not look at the pattern matching flag  also Sage's
Expression.substitute
never passes this or any other flags to Pynac'ssubs
method  I think any form of wildcarding (and iterated substitution) should be disabled in
CallableSymbolicExpressionRing_class._call_element_
(I think it's quite shocking that Sage's callable expressions are built upon general subs)
comment:13 Changed 8 months ago by
A fix for CallableSymbolicExpressionRing_class._call_element_
(or for a more disciplined version of substitute
restricted to the case of { VARIABLE_i: EXPRESSION_i }
substitution without pattern matching) would be the following:
 check if any
VARIABLE_i
appears in anyEXPRESSION_j
;  in that case, create unique fresh temporary variables and substitute them first
Higher level library code (such as in sage.manifold
) will probably not want to use the general form of subs
at all.
comment:14 Changed 6 months ago by
#30688 may be germane
Shouldn't this one be a blocker
(silently returns mathematically false results in (semi)trivial cases) ?
comment:15 Changed 6 months ago by
 Cc charpent added
comment:16 Changed 6 months ago by
 Cc charpent removed
I would agree that this should ideally be a blocker, but unfortunately we don't seem to have many active developers who have time to work on the symbolics code. Comment 13 sketches a possible implementation strategy that can be done purely within sagelib and does not require changes in pynac.
comment:17 Changed 6 months ago by
 Cc charpent added
comment:18 Changed 6 months ago by
this remark might or might not be relevant to the current problem.
comment:19 Changed 6 months ago by
 Milestone changed from sage9.2 to sage9.3
comment:20 Changed 6 months ago by
What can I do to help? Unfortunately, my knowledge about Ginac and its interface via SageMath is very limited. If you can give me a doc or something, I can try.
comment:21 Changed 6 months ago by
Comment 13 outlines an implementation strategy that does not require changes to pynac
comment:22 Changed 6 months ago by
I just looked into the code and I tried to make sense of e.g. _gobj
and smap
. But after your comment, I suspect the lines you propose to modify are above, right?
comment:23 Changed 6 months ago by
If you follow comment 13, you would not need to modify Expression.substitute
at all but instead replace the implementation of CallableSymbolicExpressionRing_class._call_element_
.
comment:24 Changed 4 months ago by
 Branch set to public/30378
comment:25 Changed 4 months ago by
 Commit set to 2d9bb2bcbe2f64906c703222f72784a85557000a
 Status changed from new to needs_review
A similar problem for (exp(x)).subs(x=ln(x))
was solved in #9891 by patching pynac.
I implemented a fix along the lines suggested by mkoeppe in comment:13. The PR solves the problem in the ticket description, but there is a bug that I don't understand:
sage: f(x) = x^2 sage: f(sqrt(x)) # #30378 is fixed x sage: t,y = var('t y') sage: g = function('g') sage: G(x) = integrate(g(t), t, 0, x) sage: G(y) integrate(g(t), t, 0, y) sage: G(x) # why doesn't the temporary variable get replaced by x? integrate(g(t), t, 0, symbol180) sage: G.subs(x=x) # why doesn't the temporary variable get replaced by x? x > integrate(g(t), t, 0, symbol192)
All answers are correct except the last two. I don't know much about cython or pynac, so suggestions are welcome. (For example, I don't know how to clear a GExMap, so I declared new ones, instead of reusing the variable smap
.)
New commits:
2d9bb2b  fix trac #30378

comment:26 Changed 4 months ago by
It's possible that the problem is in Ginac itself too, somehow. As for the new (and definitely bad) bug, sounds like some stuff isn't even being translated back into Sage. If you keep doing G(x)
does it repeat the same symbol number?
comment:27 Changed 4 months ago by
If I type G(x)
over and over, then the symbol number increases each time (e.g., symbol172
, symbol184
, symbol196
, symbol208
,...).
The first part of my patch seems to be working correctly: a new variable is created, and gets substituted in for x
. But it seems that (for reasons I do not understand) this new variable does not match when we make the second substitution, so it does not get replaced, and stays in the expression. (As partial confirmation, G(x^2)
yields integrate(g(t), t, 0, symbol254)
.) This problem does not arise for ordinary symbolic expressions (such as sin(x).subs(x=x^2)
), but it seems that integrate
messes things up somehow.
If I type G(x).variables()
, I get (symbol266, t)
(but perhaps with a different symbol number), so everything seems fine except that I would like to be getting x
, instead of symbol242
.
comment:28 Changed 4 months ago by
comment:29 Changed 4 months ago by
 Status changed from needs_review to needs_work
comment:30 Changed 3 months ago by
Possibly related : #31137
comment:31 Changed 3 months ago by
 Branch changed from public/30378 to public/30378v2
comment:32 Changed 3 months ago by
 Commit changed from 2d9bb2bcbe2f64906c703222f72784a85557000a to e5359843799eda063abb3090f12a3dcb1aa2b240
The bug in my previous PR was that the first substitution could trigger a simplification/evaluation, and this could result in a call to maxima. But maxima does not understand ginac's anonymous temporary variables:
sage: a = SR.symbol() sage: a # we have a new variable symbol195 sage: maxima(a) # it can be sent to maxima _SAGE_VAR_symbol195 sage: bool(maxima(a).sage() == a) # but it comes back to sage as a different variable False
The expression came back with a different variable than we expected, so the substitution could not match this unexpected new variable.
A follow up ticket could fix the bug properly, by adding a function to pynac that performs a straightforward substitution, without pattern matching or other complications.
PS This bug fix does not seem to be related to #31137 (which was fixed by the patch to pynac in #30446), and, unfortunately, it also does not seem to help with #30688 (which is still open).
New commits:
ee8bfce  fix trac #30378

e535984  update doctest error message

comment:33 Changed 3 months ago by
 Status changed from needs_work to needs_review
comment:34 Changed 2 months ago by
I noticed that the current solution does not work in instances such as:
sage: ((x + 2)^2).subs({x + 2:sqrt(x + 2)}) sqrt(x + 2)
I don't know of any direct consequences of this (maybe usubstitution?)
comment:35 Changed 2 months ago by
I think this example just illustrates that the comment # We are not in the problematic case, so we can trust Ginac to do the substitution.
is a little bit too optimistic.
I think the documentation of subs
should make a clear distinction between the case variable: substitution
(which is well defined mathematically) and the case expression: substitution
, warn that the latter case is depends on pattern matching and the subtle semantics of the simplification engine  so it should only be used for interactive manipulations but not in production code.
comment:36 Changed 2 months ago by
I'm sorry, I was sure I added exactly that type of warning to the docstring, but it's not there, and it doesn't seem to be in my local branches, so I must have deleted it somehow. That's too bad, because I thought it was pretty good, and had some instructive examples of what could go wrong. I'll write a new warning (if I can't find the old one on my hard drive).
I did know that there were still problems (which is why I tried to add the comment to the docstring) so I certainly agree that the comment is misleading. My PR was intended just to fix the special case of substituting values into variables. I think this (plus a warning in the docstring) is the minimum we need for 9.3.
In the long run, I think there should be two different methods. Part of the problem is that subs
is the wrong name for the current functionality. The pynac method is not just doing a substitution, but applying an identity. For example, if you know that x^2 y^2 = x y + 1
, then instead of only apply the substitution once, you might want to apply the identity again if more terms have appeared that can be simplified. For example (even with my patch):
sage: cos(cos(cos(cos(cos(x))))).subs({cos(x) : x}) x
It should be easy to upgrade my patch to solve the case in comment:34. If we do that, then I think we should have a keyword argument to enable the original pynac interpretation of subs
. However, I should point out that I don't think the simple method used in the patch would be able to handle wildcard variables correctly. (Anyone using wildcard variables can be expected to know what they are doing, so I think that is probably ok.) Is the original PR sufficient, or should we fix comment:34 for 9.3?
comment:37 Changed 2 months ago by
What is implemented on this ticket already is good enough to fix the critical issue with function application,
sage: f(x) = x^2 sage: f(sqrt(x))
I think more general tweaks to the subs algorithm should be addressed in pynac.
comment:38 Changed 2 months ago by
 Priority changed from critical to blocker
comment:39 Changed 2 months ago by
 Commit changed from e5359843799eda063abb3090f12a3dcb1aa2b240 to 91bebd0b9de8fb0724216c79a3dde611fdffff67
Branch pushed to git repo; I updated commit sha1. New commits:
91bebd0  add warning

comment:40 Changed 2 months ago by
I could not find the warning that I wrote before, so I drafted a new one. Improvements are welcome.
comment:41 Changed 2 months ago by
 Reviewers set to Matthias Koeppe
 Status changed from needs_review to positive_review
Looking great, thanks very much for this work.
comment:42 Changed 2 months ago by
Thanks to you, too. My PR just implemented your suggestion in comment:13.
comment:43 Changed 2 months ago by
 Status changed from positive_review to needs_work
Docbuild fails, see patchbot
comment:44 Changed 2 months ago by
 Branch changed from public/30378v2 to public/30378v3
comment:45 Changed 2 months ago by
 Commit changed from 91bebd0b9de8fb0724216c79a3dde611fdffff67 to 8d4db4686b31902f7c411637431ff349d4de5191
 Status changed from needs_work to needs_review
comment:46 Changed 2 months ago by
 Status changed from needs_review to positive_review
comment:47 Changed 2 months ago by
Thanks!
comment:48 Changed 6 weeks ago by
 Branch changed from public/30378v3 to 8d4db4686b31902f7c411637431ff349d4de5191
 Resolution set to fixed
 Status changed from positive_review to closed
comment:49 Changed 3 weeks ago by
 Commit 8d4db4686b31902f7c411637431ff349d4de5191 deleted
Unfortunately, this fix causes more embarrassing bugs:
sage: var('a b epsilon') (a, b, epsilon) sage: (a+b*epsilon).series(epsilon,2).subs(a=a,b=b) b*epsilon
comment:50 Changed 3 weeks ago by
Thanks for catching this. I have opened #31530 for this
I did some debugging and can report the following: this bug happens in https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/power.cpp#L747; in fact the correct power in the case of the substitution
x=sqrt(x)
inx^2
is constructed inp
, but then, for a reason I am not quite sure of, the substitution gets applied to the correct output again (L748) and then compared to the original expression,x^2
, (L749) where it is decided to return the result with the substitution applied once again and thussqrt(x)
is obtained.Changing the code such that
p
from L747 is returned causes one doctest to fail, namely:(Context:
t = a^2 + b^2 + (x+y)^3
andw0
is a wildcard.)I don't really see a good solution right now, maybe someone else has some thoughts?