Opened 14 months ago

Closed 8 months ago

Last modified 7 months ago

#30378 closed defect (fixed)

(x^2).subs({x: sqrt(x)}) returns sqrt(x) instead of x

Reported by: egourgoulhon Owned by:
Priority: blocker Milestone: sage-9.3
Component: symbolics Keywords: sqrt, substitution, pynac
Cc: nbruin, rws, kcrisman, charpent Merged in:
Authors: Dave Morris Reviewers: Matthias Koeppe
Report Upstream: N/A Work issues:
Branch: 8d4db46 (Commits, GitHub, GitLab) Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by nbruin)

As discussed in https://groups.google.com/d/msg/sage-devel/kYZ4N73oiCM/-iDyQLWyBAAJ, we have currently

sage: (x^2).subs({x: sqrt(x)})                                                  
sqrt(x)

A consequence of this bug is

sage: f(x) = x^2                                                                
sage: f(sqrt(x))                                                                
sqrt(x)

Note that the following outputs are correct:

sage: (x^2).subs({x: sqrt(pi)})                                                 
pi
sage: (x^2).subs({x: x^(1/3)})                                                  
x^(2/3)
sage: y = var('y')                                                              
sage: (x^2).subs({x: sqrt(y)})                                                  
y

Moreover, distinct powers of x seem OK:

sage: (x^3).subs({x: sqrt(x)})                                                  
x^(3/2)
sage: (x^(-2)).subs({x: sqrt(x)})                                               
1/x

But a match is problematic:

sage: (x^3).subs({x:x^(1/3)})
x^(1/3)

Change History (50)

comment:1 Changed 14 months ago by egourgoulhon

  • Cc nbruin rws added

comment:2 Changed 14 months ago by mkoeppe

  • Priority changed from major to critical

comment:3 Changed 14 months ago by nbruin

  • Description modified (diff)

comment:4 Changed 14 months ago by nbruin

  • Description modified (diff)

comment:5 Changed 14 months ago by kcrisman

  • Cc kcrisman added

comment:6 Changed 14 months ago by behackl

I did some debugging and can report the following: this bug happens in https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/power.cpp#L747; in fact the correct power in the case of the substitution x=sqrt(x) in x^2 is constructed in p, but then, for a reason I am not quite sure of, the substitution gets applied to the correct output again (L748) and then compared to the original expression, x^2, (L749) where it is decided to return the result with the substitution applied once again and thus sqrt(x) is obtained.

Changing the code such that p from L747 is returned causes one doctest to fail, namely:

sage -t src/sage/symbolic/expression.pyx
**********************************************************************
File "src/sage/symbolic/expression.pyx", line 5150, in sage.symbolic.expression.Expression.substitute
Failed example:
    t.subs(w0 == w0^2)
Expected:
    a^8 + b^8 + (x^2 + y^2)^6
Got:
    a^4 + b^4 + (x^2 + y^2)^3
**********************************************************************
1 item had failures:
   1 of  68 in sage.symbolic.expression.Expression.substitute
    [2871 tests, 1 failure, 34.59 s]

(Context: t = a^2 + b^2 + (x+y)^3 and w0 is a wildcard.)

I don't really see a good solution right now, maybe someone else has some thoughts?

comment:7 follow-up: Changed 14 months ago by kcrisman

I am not sure how to fix this either - presumably it was precisely for some sort of wildcard situation that this was in there - but I do note the following in expression.pyx:

            sage: t = a^2 + b^2 + (x+y)^3

...
            sage: t.subs({w0^2: w0^3})
            a^3 + b^3 + (x + y)^3

        Substitute with one or more relational expressions::

            sage: t.subs(w0^2 == w0^3)
            a^3 + b^3 + (x + y)^3

so far, so good

            sage: t.subs(w0 == w0^2)
            a^8 + b^8 + (x^2 + y^2)^6

Wait, what? Shouldn't these be fourth powers? That x and y get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the {w0^2: w0^3} case, where the cube was not replaced by 1.5 squares - which is presumably good.)

Note that the corresponding Ginac code always returns the subs_one_level business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac - it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.

Last edited 14 months ago by kcrisman (previous) (diff)

comment:8 in reply to: ↑ 7 Changed 14 months ago by behackl

Replying to kcrisman:

so far, so good

            sage: t.subs(w0 == w0^2)
            a^8 + b^8 + (x^2 + y^2)^6

Wait, what? Shouldn't these be fourth powers? That x and y get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the {w0^2: w0^3} case, where the cube was not replaced by 1.5 squares - which is presumably good.)

I can only make sense of the eighth powers in a setting where the substitution acts on a power like a^2 in such a way that both the base and the exponent get squared. Whether or not this is the intended behavior for a general wildcard substitution like this seems debateable; one could argue that it is actually inconsistent as the exponent of the parenthesis is not treated in the same way.

Unfortunately, make ptestlong is currently broken for my system, but as far as I can tell from manual testing, there are no other obvious doctest errors other than this wildcard substitution one.

Note that the corresponding Ginac code always returns the subs_one_level business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac - it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.

Interesting! I only saw that something concerning subs was mentioned in #24262, and that the particular segment around L747 has been introduced in the upgrade on that ticket.

comment:9 Changed 14 months ago by gh-DaveWitteMorris

In the w0 == w0^2 example, I suspect that w0 matches a, so a gets squared, but w0 also matches the entire term a^2, so it gets squared again: ((a^2)^2)^2 = a^8. See, for example:

sage: t.subs(w0 == e^w0)                                                                                                  
e^((e^x + e^y)^3) + e^(e^(2*a)) + e^(e^(2*b))
sage: (cos(x)).subs(w0 == e^w0)                                                                                           
e^(cos(e^x))

comment:10 Changed 14 months ago by mkoeppe

In the case of substitution where all keys are variables rather than general expressions, shouldn't we be able to write robust code that does not rely on pattern matching?

comment:11 Changed 14 months ago by kcrisman

Presumably, but just keep in mind the "not reinventing the wheel" guideline. w0 in this example is an intentional wildcard. I'm still not 100% sure what is going on with the simpler case that originated the ticket - again, note the default Ginac behavior.

comment:12 Changed 14 months ago by mkoeppe

Several things to note here:

  • pynac has flags that are supposed to disable pattern matching (subs_options::no_pattern, https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/flags.h#L61)
  • but power::subs does not look at the pattern matching flag
  • also Sage's Expression.substitute never passes this or any other flags to Pynac's subs method
  • I think any form of wildcarding (and iterated substitution) should be disabled in CallableSymbolicExpressionRing_class._call_element_ (I think it's quite shocking that Sage's callable expressions are built upon general subs)
Last edited 14 months ago by mkoeppe (previous) (diff)

comment:13 Changed 14 months ago by mkoeppe

A fix for CallableSymbolicExpressionRing_class._call_element_ (or for a more disciplined version of substitute restricted to the case of { VARIABLE_i: EXPRESSION_i } substitution without pattern matching) would be the following:

  • check if any VARIABLE_i appears in any EXPRESSION_j;
  • in that case, create unique fresh temporary variables and substitute them first

Higher level library code (such as in sage.manifold) will probably not want to use the general form of subs at all.

Last edited 14 months ago by mkoeppe (previous) (diff)

comment:14 Changed 13 months ago by charpent

#30688 may be germane

Shouldn't this one be a blocker (silently returns mathematically false results in (semi-)trivial cases) ?

Last edited 13 months ago by charpent (previous) (diff)

comment:15 Changed 13 months ago by charpent

  • Cc charpent added

comment:16 Changed 13 months ago by mkoeppe

  • Cc charpent removed

I would agree that this should ideally be a blocker, but unfortunately we don't seem to have many active developers who have time to work on the symbolics code. Comment 13 sketches a possible implementation strategy that can be done purely within sagelib and does not require changes in pynac.

comment:17 Changed 13 months ago by mkoeppe

  • Cc charpent added

comment:18 Changed 13 months ago by charpent

this remark might or might not be relevant to the current problem.

comment:19 Changed 12 months ago by mkoeppe

  • Milestone changed from sage-9.2 to sage-9.3

comment:20 Changed 12 months ago by gh-mjungmath

What can I do to help? Unfortunately, my knowledge about Ginac and its interface via SageMath is very limited. If you can give me a doc or something, I can try.

comment:21 Changed 12 months ago by mkoeppe

Comment 13 outlines an implementation strategy that does not require changes to pynac

comment:22 Changed 12 months ago by gh-mjungmath

I just looked into the code and I tried to make sense of e.g. _gobj and smap. But after your comment, I suspect the lines you propose to modify are above, right?

comment:23 Changed 12 months ago by mkoeppe

If you follow comment 13, you would not need to modify Expression.substitute at all but instead replace the implementation of CallableSymbolicExpressionRing_class._call_element_.

comment:24 Changed 10 months ago by gh-DaveWitteMorris

  • Branch set to public/30378

comment:25 Changed 10 months ago by gh-DaveWitteMorris

  • Commit set to 2d9bb2bcbe2f64906c703222f72784a85557000a
  • Status changed from new to needs_review

A similar problem for (exp(x)).subs(x=ln(x)) was solved in #9891 by patching pynac.

I implemented a fix along the lines suggested by mkoeppe in comment:13. The PR solves the problem in the ticket description, but there is a bug that I don't understand:

sage: f(x) = x^2
sage: f(sqrt(x))  # #30378 is fixed
x
sage: t,y = var('t y')
sage: g = function('g')
sage: G(x) = integrate(g(t), t, 0, x)
sage: G(y)
integrate(g(t), t, 0, y)
sage: G(x)  # why doesn't the temporary variable get replaced by x?
integrate(g(t), t, 0, symbol180)
sage: G.subs(x=x)  # why doesn't the temporary variable get replaced by x?
x |--> integrate(g(t), t, 0, symbol192)

All answers are correct except the last two. I don't know much about cython or pynac, so suggestions are welcome. (For example, I don't know how to clear a GExMap, so I declared new ones, instead of reusing the variable smap.)


New commits:

2d9bb2bfix trac #30378

comment:26 Changed 10 months ago by kcrisman

It's possible that the problem is in Ginac itself too, somehow. As for the new (and definitely bad) bug, sounds like some stuff isn't even being translated back into Sage. If you keep doing G(x) does it repeat the same symbol number?

comment:27 Changed 10 months ago by gh-DaveWitteMorris

If I type G(x) over and over, then the symbol number increases each time (e.g., symbol172, symbol184, symbol196, symbol208,...).

The first part of my patch seems to be working correctly: a new variable is created, and gets substituted in for x. But it seems that (for reasons I do not understand) this new variable does not match when we make the second substitution, so it does not get replaced, and stays in the expression. (As partial confirmation, G(x^2) yields integrate(g(t), t, 0, symbol254).) This problem does not arise for ordinary symbolic expressions (such as sin(x).subs(x=x^2)), but it seems that integrate messes things up somehow.

If I type G(x).variables(), I get (symbol266, t) (but perhaps with a different symbol number), so everything seems fine except that I would like to be getting x, instead of symbol242.

comment:28 Changed 10 months ago by gh-DaveWitteMorris

  • Authors set to Dave Morris

comment:29 Changed 10 months ago by gh-DaveWitteMorris

  • Status changed from needs_review to needs_work

comment:30 Changed 10 months ago by charpent

Possibly related : #31137

comment:31 Changed 9 months ago by gh-DaveWitteMorris

  • Branch changed from public/30378 to public/30378v2

comment:32 Changed 9 months ago by gh-DaveWitteMorris

  • Commit changed from 2d9bb2bcbe2f64906c703222f72784a85557000a to e5359843799eda063abb3090f12a3dcb1aa2b240

The bug in my previous PR was that the first substitution could trigger a simplification/evaluation, and this could result in a call to maxima. But maxima does not understand ginac's anonymous temporary variables:

sage: a = SR.symbol()
sage: a  # we have a new variable
symbol195
sage: maxima(a)  # it can be sent to maxima
_SAGE_VAR_symbol195
sage: bool(maxima(a).sage() == a)  # but it comes back to sage as a different variable
False

The expression came back with a different variable than we expected, so the substitution could not match this unexpected new variable.

A follow up ticket could fix the bug properly, by adding a function to pynac that performs a straightforward substitution, without pattern matching or other complications.

PS This bug fix does not seem to be related to #31137 (which was fixed by the patch to pynac in #30446), and, unfortunately, it also does not seem to help with #30688 (which is still open).


New commits:

ee8bfcefix trac #30378
e535984update doctest error message

comment:33 Changed 9 months ago by gh-DaveWitteMorris

  • Status changed from needs_work to needs_review

comment:34 Changed 9 months ago by gh-cEMRSS

I noticed that the current solution does not work in instances such as:

sage: ((x + 2)^2).subs({x + 2:sqrt(x + 2)})
sqrt(x + 2)

I don't know of any direct consequences of this (maybe u-substitution?)

comment:35 Changed 9 months ago by mkoeppe

I think this example just illustrates that the comment # We are not in the problematic case, so we can trust Ginac to do the substitution. is a little bit too optimistic.

I think the documentation of subs should make a clear distinction between the case variable: substitution (which is well defined mathematically) and the case expression: substitution, warn that the latter case is depends on pattern matching and the subtle semantics of the simplification engine - so it should only be used for interactive manipulations but not in production code.

comment:36 Changed 9 months ago by gh-DaveWitteMorris

I'm sorry, I was sure I added exactly that type of warning to the docstring, but it's not there, and it doesn't seem to be in my local branches, so I must have deleted it somehow. That's too bad, because I thought it was pretty good, and had some instructive examples of what could go wrong. I'll write a new warning (if I can't find the old one on my hard drive).

I did know that there were still problems (which is why I tried to add the comment to the docstring) so I certainly agree that the comment is misleading. My PR was intended just to fix the special case of substituting values into variables. I think this (plus a warning in the docstring) is the minimum we need for 9.3.

In the long run, I think there should be two different methods. Part of the problem is that subs is the wrong name for the current functionality. The pynac method is not just doing a substitution, but applying an identity. For example, if you know that x^2 y^2 = x y + 1, then instead of only apply the substitution once, you might want to apply the identity again if more terms have appeared that can be simplified. For example (even with my patch):

sage: cos(cos(cos(cos(cos(x))))).subs({cos(x) : x})
x

It should be easy to upgrade my patch to solve the case in comment:34. If we do that, then I think we should have a keyword argument to enable the original pynac interpretation of subs. However, I should point out that I don't think the simple method used in the patch would be able to handle wildcard variables correctly. (Anyone using wildcard variables can be expected to know what they are doing, so I think that is probably ok.) Is the original PR sufficient, or should we fix comment:34 for 9.3?

comment:37 Changed 8 months ago by mkoeppe

What is implemented on this ticket already is good enough to fix the critical issue with function application,

sage: f(x) = x^2                                                                
sage: f(sqrt(x))                                                                

I think more general tweaks to the subs algorithm should be addressed in pynac.

comment:38 Changed 8 months ago by mkoeppe

  • Priority changed from critical to blocker

comment:39 Changed 8 months ago by git

  • Commit changed from e5359843799eda063abb3090f12a3dcb1aa2b240 to 91bebd0b9de8fb0724216c79a3dde611fdffff67

Branch pushed to git repo; I updated commit sha1. New commits:

91bebd0add warning

comment:40 Changed 8 months ago by gh-DaveWitteMorris

I could not find the warning that I wrote before, so I drafted a new one. Improvements are welcome.

comment:41 Changed 8 months ago by mkoeppe

  • Reviewers set to Matthias Koeppe
  • Status changed from needs_review to positive_review

Looking great, thanks very much for this work.

comment:42 Changed 8 months ago by gh-DaveWitteMorris

Thanks to you, too. My PR just implemented your suggestion in comment:13.

comment:43 Changed 8 months ago by vbraun

  • Status changed from positive_review to needs_work

Docbuild fails, see patchbot

comment:44 Changed 8 months ago by gh-DaveWitteMorris

  • Branch changed from public/30378v2 to public/30378v3

comment:45 Changed 8 months ago by gh-DaveWitteMorris

  • Commit changed from 91bebd0b9de8fb0724216c79a3dde611fdffff67 to 8d4db4686b31902f7c411637431ff349d4de5191
  • Status changed from needs_work to needs_review

Sorry, I goofed up the sphinx syntax of the WARNING block in the docstring. I fixed that, and rebased on 9.3b7.


New commits:

d75ef40fix trac #30378
c15730cupdate doctest error message
14aec0cadd warning
8d4db46wrong syntax for warning

comment:46 Changed 8 months ago by mkoeppe

  • Status changed from needs_review to positive_review

comment:47 Changed 8 months ago by gh-DaveWitteMorris

Thanks!

comment:48 Changed 8 months ago by vbraun

  • Branch changed from public/30378v3 to 8d4db4686b31902f7c411637431ff349d4de5191
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:49 Changed 7 months ago by gh-spaghettisalat

  • Commit 8d4db4686b31902f7c411637431ff349d4de5191 deleted

Unfortunately, this fix causes more embarrassing bugs:

sage: var('a b epsilon')                                                                                                                     
(a, b, epsilon)
sage: (a+b*epsilon).series(epsilon,2).subs(a=a,b=b)                                                                                          
b*epsilon

comment:50 Changed 7 months ago by mkoeppe

Thanks for catching this. I have opened #31530 for this

Last edited 7 months ago by mkoeppe (previous) (diff)
Note: See TracTickets for help on using tickets.