Opened 2 years ago

Closed 21 months ago

# (x^2).subs({x: sqrt(x)}) returns sqrt(x) instead of x

Reported by: Owned by: Eric Gourgoulhon blocker sage-9.3 symbolics sqrt, substitution, pynac Nils Bruin, Ralf Stephan, Karl-Dieter Crisman, Emmanuel Charpentier Dave Morris Matthias Koeppe N/A 8d4db46

### Description (last modified by Nils Bruin)

As discussed in https://groups.google.com/d/msg/sage-devel/kYZ4N73oiCM/-iDyQLWyBAAJ, we have currently

```sage: (x^2).subs({x: sqrt(x)})
sqrt(x)
```

A consequence of this bug is

```sage: f(x) = x^2
sage: f(sqrt(x))
sqrt(x)
```

Note that the following outputs are correct:

```sage: (x^2).subs({x: sqrt(pi)})
pi
sage: (x^2).subs({x: x^(1/3)})
x^(2/3)
sage: y = var('y')
sage: (x^2).subs({x: sqrt(y)})
y
```

Moreover, distinct powers of `x` seem OK:

```sage: (x^3).subs({x: sqrt(x)})
x^(3/2)
sage: (x^(-2)).subs({x: sqrt(x)})
1/x
```

But a match is problematic:

```sage: (x^3).subs({x:x^(1/3)})
x^(1/3)
```

### comment:1 Changed 2 years ago by Eric Gourgoulhon

Cc: Nils Bruin Ralf Stephan added

### comment:2 Changed 2 years ago by Matthias Köppe

Priority: major → critical

### comment:3 Changed 2 years ago by Nils Bruin

Description: modified (diff)

### comment:4 Changed 2 years ago by Nils Bruin

Description: modified (diff)

### comment:5 Changed 2 years ago by Karl-Dieter Crisman

Cc: Karl-Dieter Crisman added

### comment:6 Changed 2 years ago by Benjamin Hackl

I did some debugging and can report the following: this bug happens in https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/power.cpp#L747; in fact the correct power in the case of the substitution `x=sqrt(x)` in `x^2` is constructed in `p`, but then, for a reason I am not quite sure of, the substitution gets applied to the correct output again (L748) and then compared to the original expression, `x^2`, (L749) where it is decided to return the result with the substitution applied once again and thus `sqrt(x)` is obtained.

Changing the code such that `p` from L747 is returned causes one doctest to fail, namely:

```sage -t src/sage/symbolic/expression.pyx
**********************************************************************
File "src/sage/symbolic/expression.pyx", line 5150, in sage.symbolic.expression.Expression.substitute
Failed example:
t.subs(w0 == w0^2)
Expected:
a^8 + b^8 + (x^2 + y^2)^6
Got:
a^4 + b^4 + (x^2 + y^2)^3
**********************************************************************
1 item had failures:
1 of  68 in sage.symbolic.expression.Expression.substitute
[2871 tests, 1 failure, 34.59 s]
```

(Context: `t = a^2 + b^2 + (x+y)^3` and `w0` is a wildcard.)

I don't really see a good solution right now, maybe someone else has some thoughts?

### comment:7 follow-up:  8 Changed 2 years ago by Karl-Dieter Crisman

I am not sure how to fix this either - presumably it was precisely for some sort of wildcard situation that this was in there - but I do note the following in `expression.pyx`:

```            sage: t = a^2 + b^2 + (x+y)^3

...
sage: t.subs({w0^2: w0^3})
a^3 + b^3 + (x + y)^3

Substitute with one or more relational expressions::

sage: t.subs(w0^2 == w0^3)
a^3 + b^3 + (x + y)^3
```

so far, so good

```            sage: t.subs(w0 == w0^2)
a^8 + b^8 + (x^2 + y^2)^6
```

Wait, what? Shouldn't these be fourth powers? That `x` and `y` get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the `{w0^2: w0^3}` case, where the cube was not replaced by 1.5 squares - which is presumably good.)

Note that the corresponding Ginac code always returns the `subs_one_level` business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac - it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.

Last edited 2 years ago by Karl-Dieter Crisman (previous) (diff)

### comment:8 in reply to:  7 Changed 2 years ago by Benjamin Hackl

so far, so good

```            sage: t.subs(w0 == w0^2)
a^8 + b^8 + (x^2 + y^2)^6
```

Wait, what? Shouldn't these be fourth powers? That `x` and `y` get replaced by their squares is fine, as is the sixth power, but I'm not at all sure about the eighth powers. (Also note that the wildcard only found literal squares in the `{w0^2: w0^3}` case, where the cube was not replaced by 1.5 squares - which is presumably good.)

I can only make sense of the eighth powers in a setting where the substitution acts on a power like `a^2` in such a way that both the base and the exponent get squared. Whether or not this is the intended behavior for a general wildcard substitution like this seems debateable; one could argue that it is actually inconsistent as the exponent of the parenthesis is not treated in the same way.

Unfortunately, `make ptestlong` is currently broken for my system, but as far as I can tell from manual testing, there are no other obvious doctest errors other than this wildcard substitution one.

Note that the corresponding Ginac code always returns the `subs_one_level` business in this situation where either the exponent or the base are 'not trivially equal', whatever that means in this context. Unfortunately I can't find out from the commit history where this changed in Pynac - it seems to go back to Burcin's original rewrite back when we changed to Pynac/Ginac in the first place for basic symbolics a decade ago.

Interesting! I only saw that something concerning `subs` was mentioned in #24262, and that the particular segment around L747 has been introduced in the upgrade on that ticket.

### comment:9 Changed 2 years ago by Dave Morris

In the `w0 == w0^2` example, I suspect that w0 matches a, so a gets squared, but w0 also matches the entire term `a^2`, so it gets squared again: `((a^2)^2)^2 = a^8`. See, for example:

```sage: t.subs(w0 == e^w0)
e^((e^x + e^y)^3) + e^(e^(2*a)) + e^(e^(2*b))
sage: (cos(x)).subs(w0 == e^w0)
e^(cos(e^x))
```

### comment:10 Changed 2 years ago by Matthias Köppe

In the case of substitution where all keys are variables rather than general expressions, shouldn't we be able to write robust code that does not rely on pattern matching?

### comment:11 Changed 2 years ago by Karl-Dieter Crisman

Presumably, but just keep in mind the "not reinventing the wheel" guideline. `w0` in this example is an intentional wildcard. I'm still not 100% sure what is going on with the simpler case that originated the ticket - again, note the default Ginac behavior.

### comment:12 Changed 2 years ago by Matthias Köppe

Several things to note here:

• pynac has flags that are supposed to disable pattern matching (subs_options::no_pattern, https://github.com/pynac/pynac/blob/85c43c230e8050a1f71898889a021fec28318502/ginac/flags.h#L61)
• but `power::subs` does not look at the pattern matching flag
• also Sage's `Expression.substitute` never passes this or any other flags to Pynac's `subs` method
• I think any form of wildcarding (and iterated substitution) should be disabled in `CallableSymbolicExpressionRing_class._call_element_` (I think it's quite shocking that Sage's callable expressions are built upon general subs)
Last edited 2 years ago by Matthias Köppe (previous) (diff)

### comment:13 Changed 2 years ago by Matthias Köppe

A fix for `CallableSymbolicExpressionRing_class._call_element_` (or for a more disciplined version of `substitute` restricted to the case of `{ VARIABLE_i: EXPRESSION_i }` substitution without pattern matching) would be the following:

• check if any `VARIABLE_i` appears in any `EXPRESSION_j`;
• in that case, create unique fresh temporary variables and substitute them first

Higher level library code (such as in `sage.manifold`) will probably not want to use the general form of `subs` at all.

Last edited 2 years ago by Matthias Köppe (previous) (diff)

### comment:14 Changed 2 years ago by Emmanuel Charpentier

#30688 may be germane

Shouldn't this one be a `blocker` (silently returns mathematically false results in (semi-)trivial cases) ?

Last edited 2 years ago by Emmanuel Charpentier (previous) (diff)

### comment:15 Changed 2 years ago by Emmanuel Charpentier

Cc: Emmanuel Charpentier added

### comment:16 Changed 2 years ago by Matthias Köppe

Cc: Emmanuel Charpentier removed

I would agree that this should ideally be a blocker, but unfortunately we don't seem to have many active developers who have time to work on the symbolics code. Comment 13 sketches a possible implementation strategy that can be done purely within sagelib and does not require changes in pynac.

### comment:17 Changed 2 years ago by Matthias Köppe

Cc: Emmanuel Charpentier added

### comment:18 Changed 2 years ago by Emmanuel Charpentier

this remark might or might not be relevant to the current problem.

### comment:19 Changed 2 years ago by Matthias Köppe

Milestone: sage-9.2 → sage-9.3

### comment:20 Changed 2 years ago by Michael Jung

What can I do to help? Unfortunately, my knowledge about Ginac and its interface via SageMath is very limited. If you can give me a doc or something, I can try.

### comment:21 Changed 2 years ago by Matthias Köppe

Comment 13 outlines an implementation strategy that does not require changes to pynac

### comment:22 Changed 2 years ago by Michael Jung

I just looked into the code and I tried to make sense of e.g. `_gobj` and `smap`. But after your comment, I suspect the lines you propose to modify are above, right?

### comment:23 Changed 2 years ago by Matthias Köppe

If you follow comment 13, you would not need to modify `Expression.substitute` at all but instead replace the implementation of `CallableSymbolicExpressionRing_class._call_element_`.

### comment:24 Changed 2 years ago by Dave Morris

Branch: → public/30378

### comment:25 Changed 2 years ago by Dave Morris

Commit: → 2d9bb2bcbe2f64906c703222f72784a85557000a new → needs_review

A similar problem for `(exp(x)).subs(x=ln(x))` was solved in #9891 by patching pynac.

I implemented a fix along the lines suggested by mkoeppe in comment:13. The PR solves the problem in the ticket description, but there is a bug that I don't understand:

```sage: f(x) = x^2
sage: f(sqrt(x))  # #30378 is fixed
x
sage: t,y = var('t y')
sage: g = function('g')
sage: G(x) = integrate(g(t), t, 0, x)
sage: G(y)
integrate(g(t), t, 0, y)
sage: G(x)  # why doesn't the temporary variable get replaced by x?
integrate(g(t), t, 0, symbol180)
sage: G.subs(x=x)  # why doesn't the temporary variable get replaced by x?
x |--> integrate(g(t), t, 0, symbol192)
```

All answers are correct except the last two. I don't know much about cython or pynac, so suggestions are welcome. (For example, I don't know how to clear a GExMap, so I declared new ones, instead of reusing the variable `smap`.)

New commits:

 ​2d9bb2b `fix trac #30378`

### comment:26 Changed 2 years ago by Karl-Dieter Crisman

It's possible that the problem is in Ginac itself too, somehow. As for the new (and definitely bad) bug, sounds like some stuff isn't even being translated back into Sage. If you keep doing `G(x)` does it repeat the same symbol number?

### comment:27 Changed 2 years ago by Dave Morris

If I type `G(x)` over and over, then the symbol number increases each time (e.g., `symbol172`, `symbol184`, `symbol196`, `symbol208`,...).

The first part of my patch seems to be working correctly: a new variable is created, and gets substituted in for `x`. But it seems that (for reasons I do not understand) this new variable does not match when we make the second substitution, so it does not get replaced, and stays in the expression. (As partial confirmation, `G(x^2)` yields `integrate(g(t), t, 0, symbol254)`.) This problem does not arise for ordinary symbolic expressions (such as `sin(x).subs(x=x^2)`), but it seems that `integrate` messes things up somehow.

If I type `G(x).variables()`, I get `(symbol266, t)` (but perhaps with a different symbol number), so everything seems fine except that I would like to be getting `x`, instead of `symbol242`.

### comment:28 Changed 2 years ago by Dave Morris

Authors: → Dave Morris

### comment:29 Changed 2 years ago by Dave Morris

Status: needs_review → needs_work

### comment:30 Changed 2 years ago by Emmanuel Charpentier

Possibly related : #31137

### comment:31 Changed 22 months ago by Dave Morris

Branch: public/30378 → public/30378v2

### comment:32 Changed 22 months ago by Dave Morris

Commit: 2d9bb2bcbe2f64906c703222f72784a85557000a → e5359843799eda063abb3090f12a3dcb1aa2b240

The bug in my previous PR was that the first substitution could trigger a simplification/evaluation, and this could result in a call to maxima. But maxima does not understand ginac's anonymous temporary variables:

```sage: a = SR.symbol()
sage: a  # we have a new variable
symbol195
sage: maxima(a)  # it can be sent to maxima
_SAGE_VAR_symbol195
sage: bool(maxima(a).sage() == a)  # but it comes back to sage as a different variable
False
```

The expression came back with a different variable than we expected, so the substitution could not match this unexpected new variable.

A follow up ticket could fix the bug properly, by adding a function to pynac that performs a straightforward substitution, without pattern matching or other complications.

PS This bug fix does not seem to be related to #31137 (which was fixed by the patch to pynac in #30446), and, unfortunately, it also does not seem to help with #30688 (which is still open).

New commits:

 ​ee8bfce `fix trac #30378` ​e535984 `update doctest error message`

### comment:33 Changed 22 months ago by Dave Morris

Status: needs_work → needs_review

### comment:34 Changed 22 months ago by gh-cEMRSS

I noticed that the current solution does not work in instances such as:

```sage: ((x + 2)^2).subs({x + 2:sqrt(x + 2)})
sqrt(x + 2)
```

I don't know of any direct consequences of this (maybe u-substitution?)

### comment:35 Changed 22 months ago by Matthias Köppe

I think this example just illustrates that the comment `# We are not in the problematic case, so we can trust Ginac to do the substitution.` is a little bit too optimistic.

I think the documentation of `subs` should make a clear distinction between the case `variable: substitution` (which is well defined mathematically) and the case `expression: substitution`, warn that the latter case is depends on pattern matching and the subtle semantics of the simplification engine - so it should only be used for interactive manipulations but not in production code.

### comment:36 Changed 22 months ago by Dave Morris

I'm sorry, I was sure I added exactly that type of warning to the docstring, but it's not there, and it doesn't seem to be in my local branches, so I must have deleted it somehow. That's too bad, because I thought it was pretty good, and had some instructive examples of what could go wrong. I'll write a new warning (if I can't find the old one on my hard drive).

I did know that there were still problems (which is why I tried to add the comment to the docstring) so I certainly agree that the comment is misleading. My PR was intended just to fix the special case of substituting values into variables. I think this (plus a warning in the docstring) is the minimum we need for 9.3.

In the long run, I think there should be two different methods. Part of the problem is that `subs` is the wrong name for the current functionality. The pynac method is not just doing a substitution, but applying an identity. For example, if you know that `x^2 y^2 = x y + 1`, then instead of only apply the substitution once, you might want to apply the identity again if more terms have appeared that can be simplified. For example (even with my patch):

```sage: cos(cos(cos(cos(cos(x))))).subs({cos(x) : x})
x
```

It should be easy to upgrade my patch to solve the case in comment:34. If we do that, then I think we should have a keyword argument to enable the original pynac interpretation of `subs`. However, I should point out that I don't think the simple method used in the patch would be able to handle wildcard variables correctly. (Anyone using wildcard variables can be expected to know what they are doing, so I think that is probably ok.) Is the original PR sufficient, or should we fix comment:34 for 9.3?

### comment:37 Changed 22 months ago by Matthias Köppe

What is implemented on this ticket already is good enough to fix the critical issue with function application,

```sage: f(x) = x^2
sage: f(sqrt(x))
```

I think more general tweaks to the subs algorithm should be addressed in pynac.

### comment:38 Changed 22 months ago by Matthias Köppe

Priority: critical → blocker

### comment:39 Changed 22 months ago by git

Commit: e5359843799eda063abb3090f12a3dcb1aa2b240 → 91bebd0b9de8fb0724216c79a3dde611fdffff67

Branch pushed to git repo; I updated commit sha1. New commits:

 ​91bebd0 `add warning`

### comment:40 Changed 22 months ago by Dave Morris

I could not find the warning that I wrote before, so I drafted a new one. Improvements are welcome.

### comment:41 Changed 22 months ago by Matthias Köppe

Reviewers: → Matthias Koeppe needs_review → positive_review

Looking great, thanks very much for this work.

### comment:42 Changed 22 months ago by Dave Morris

Thanks to you, too. My PR just implemented your suggestion in comment:13.

### comment:43 Changed 22 months ago by Volker Braun

Status: positive_review → needs_work

Docbuild fails, see patchbot

### comment:44 Changed 22 months ago by Dave Morris

Branch: public/30378v2 → public/30378v3

### comment:45 Changed 22 months ago by Dave Morris

Commit: 91bebd0b9de8fb0724216c79a3dde611fdffff67 → 8d4db4686b31902f7c411637431ff349d4de5191 needs_work → needs_review

Sorry, I goofed up the sphinx syntax of the `WARNING` block in the docstring. I fixed that, and rebased on 9.3b7.

New commits:

 ​d75ef40 `fix trac #30378` ​c15730c `update doctest error message` ​14aec0c `add warning` ​8d4db46 `wrong syntax for warning`

### comment:46 Changed 22 months ago by Matthias Köppe

Status: needs_review → positive_review

Thanks!

### comment:48 Changed 21 months ago by Volker Braun

Branch: public/30378v3 → 8d4db4686b31902f7c411637431ff349d4de5191 → fixed positive_review → closed

### comment:49 Changed 21 months ago by Marius Gerbershagen

Commit: 8d4db4686b31902f7c411637431ff349d4de5191

Unfortunately, this fix causes more embarrassing bugs:

```sage: var('a b epsilon')
(a, b, epsilon)
sage: (a+b*epsilon).series(epsilon,2).subs(a=a,b=b)
b*epsilon
```

### comment:50 Changed 21 months ago by Matthias Köppe

Thanks for catching this. I have opened #31530 for this

Last edited 21 months ago by Matthias Köppe (previous) (diff)
Note: See TracTickets for help on using tickets.