Opened 9 years ago

Closed 14 months ago

#12406 closed defect (fixed)

solve_left and solve_right should use coercion

Reported by: robertwb Owned by:
Priority: major Milestone: sage-9.1
Component: linear algebra Keywords:
Cc: mjo Merged in:
Authors: Markus Wageringel Reviewers: Michael Orlitzky
Report Upstream: N/A Work issues:
Branch: 93cea85 (Commits, GitHub, GitLab) Commit: 93cea8547c304a48347f25a4120af4cf71e2a6ef
Dependencies: Stopgaps:

Status badges

Description (last modified by gh-mwageringel)

This ticket uses coercion to find suitable parents for the arguments to Matrix.solve_right in order to solve these problems:

sage: A = matrix(QQ, 2, [1, 2 ,3, 4])
sage: b = vector(RDF, [pi, e]); b
(3.14159265359, 2.71828182846)
sage: A.solve_right(b)   # should return (-3.564903478720541, 3.353248066155167)
(-27594871646705519/7740706532848242, 17304339474632025/5160471021898828)
sage: R.<x> = QQ[]
sage: b = vector(R, [1, x]); b
(1, x)
sage: A.solve_right(b)   # should return (t - 2, -1/2*t + 3/2)
[ugly traceback]
TypeError: not a constant polynomial

This is implemented by moving the coercion code from Matrix_double_dense.solve_right, that was implemented in #17405, to the super class. As a result, Matrix_double_dense.solve_right is redundant and therefore removed.

The super method Matrix.solve_right is refactored a lot:

  • the check parameter is ignored for inexact rings (see also #13932)
  • the implementation is changed so that the rank of the matrix is not computed in solve_right, but only in _solve_right_nonsingular_square; this way, inexact rings overwriting the latter method can avoid computing the rank (which would not work over inexact rings)

Additionally, this ticket adds support for calling solve_right/solve_left with non-square matrices over RDF/CDF. In such cases, solve_right returns the least-squares solution of the equation system, similar to the Matlab/Octave backslash operator (which our documentation already claims is implemented). This is implemented using scipy.linalg.lstsq.

sage: A = matrix(RDF, 3, 2, [1, 3, 4, 2, 0, -3])
sage: b = vector(RDF, [5, 6, 1])
sage: x = A.solve_right(b)
sage: (A * x - b).norm()
3.2692119900020438
sage: x == A \ b
True

Change History (35)

comment:1 Changed 9 years ago by robertwb

  • Type changed from PLEASE CHANGE to defect

comment:2 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:3 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:4 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:5 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:6 Changed 16 months ago by gh-mwageringel

  • Authors set to Markus Wageringel
  • Branch set to u/gh-mwageringel/12406
  • Commit set to 0ccb65a3c8ee5b4bce5aea20a8d85ea550aa608f
  • Dependencies set to #17405
  • Description modified (diff)
  • Milestone changed from sage-6.4 to sage-9.1
  • Status changed from new to needs_review

Based on #17405.


New commits:

95e4157Fix minor typo.
8c41acfFix hidden bug in RDF/CDF matrix multiplication dead error-checking.
5844bbeChange RDF/CDF matrix multiplication to accept matrices for B.
fab2bcdImprove exceptions
bd6e4fcMerge tag '9.1.beta4' into #17405
7b07a2717405: fix solve_right and solve_left over RDF and CDF
0ccb65a12406: implement coercion for generic solve_right

comment:7 Changed 16 months ago by gh-mwageringel

  • Owner changed from jason, was to (none)

comment:8 follow-up: Changed 16 months ago by gh-mwageringel

  • Status changed from needs_review to needs_work

A pynormaliz doctest fails in src/sage/geometry/polyhedron/library.py.

comment:9 Changed 16 months ago by git

  • Commit changed from 0ccb65a3c8ee5b4bce5aea20a8d85ea550aa608f to a815ff9d85be5a22dd3245ed8941e862537cff23

Branch pushed to git repo; I updated commit sha1. New commits:

a815ff912406: explicitly convert to AA in generalized_permutahedron

comment:10 in reply to: ↑ 8 Changed 16 months ago by gh-mwageringel

  • Status changed from needs_work to needs_review

Replying to gh-mwageringel:

A pynormaliz doctest fails in src/sage/geometry/polyhedron/library.py.

The failing test involved a solve_right operation between a matrix over AA and a vector over QuadraticField(2). The latter does not coerce into AA, so this is solved by explicitly converting the entries of the vector to AA.

comment:11 follow-up: Changed 16 months ago by mjo

  • Cc mjo added
  • Reviewers set to Michael Orlitzky

This looks like a nice improvement, thanks.

Pet peeve: these docstrings don't say what we're solving for:

Compute a solution to the equation A X = B

The next line introduces A, but we can make it a tiny bit better IMO by saying something like

Compute one solution X to the equation A X = B

or

Try to find a solution X to the system A X = B

Those additionally imply that only one solution is returned, whereas "solve..." to me suggests that maybe an entire subspace (or a matrix whose columns form a basis for a subspace) is returned. For example, solving 0 X = 0 returns (0,0,0), which is not worth full credit =)

I'm rebuilding sage for another ticket right now but I'll take a closer look at this soon.

comment:12 Changed 16 months ago by git

  • Commit changed from a815ff9d85be5a22dd3245ed8941e862537cff23 to 007deac61554cbc1aefcda0f8625ebd5a758d20d

Branch pushed to git repo; I updated commit sha1. New commits:

007deac12406: improve description of solve_right/solve_left

comment:13 in reply to: ↑ 11 Changed 16 months ago by gh-mwageringel

Replying to mjo:

Try to find a solution X to the system A X = B

Those additionally imply that only one solution is returned, whereas "solve..." to me suggests that maybe an entire subspace (or a matrix whose columns form a basis for a subspace) is returned. For example, solving 0 X = 0 returns (0,0,0), which is not worth full credit =)

Thanks for the suggestion. I have updated the descriptions accordingly.

comment:14 follow-up: Changed 16 months ago by mjo

  • Status changed from needs_review to needs_work

At a high level, this looks like the right approach to me. The double_dense matrices are the only ones that override the solve_left and solve_right methods, so moving the coercion up and refactoring is what I would have done too. As a result, I only found some minor issues.

Fixes that I think will be uncontroversial:

  1. When writing norms in the docstrings, a backslash is needed before the pipe:
    -        solution of the equation, such that the 2-norm `|A x - b|`
    +        solution of the equation, such that the 2-norm `\|A x - b\|`
    
  1. The new error message "ValueError?: number of rows of self must equal degree of B" uses the wrong variable name (capital B versus b), and doesn't make sense when b is a matrix (and has no degree).
  1. Instead of indirect doctests like
    sage: x = A.solve_right(b)  # indirect doctest
    
    you can use direct ones like
    sage: x = A._solve_right_nonsingular_square(b)
    
    That avoids the potential problem where some superclass behavior changes, and the indirect doctest no longer checks what we think it checks.
  1. The recursive call to e.g. solve_right in
    return A.solve_right(B, check=check)
    
    is elegant, but is probably inefficient because Python doesn't make much effort to optimize recursive calls (see e.g. http://neopythonic.blogspot.com/2009/04/final-words-on-tail-calls.html). In this case, I think you can set A = self at the top of the function, and just let that case fall through. The rest of the function would have to be updated to use A instead of self, though.

Other suggestions:

  1. I think it's easier for non-native speakers if you write out "that is" or "in other words" instead of "i.e."
  2. In the example "Over inexact rings, the output of this function may not be an exact solution...", we could compare the answer to the closed-form solution of least squares, to convince the reader that it really gives a least-squares solution:
    sage: A.solve_right(b)
    (1.4782608695652177, 0.35177865612648235)
    sage: ~(A.T*A)*A.T*b # closed form
    (1.478260869565217, 0.351778656126482)
    
  3. I think a tolerance of 1e-14 is probably pushing our luck -- Sage is built on a lot of unusual hardware. NumPy? itself uses something a bit more clever, but for example, its default absolute tolerance is 1e-8. Ultimately, this probably doesn't matter, since we can always make the tolerance bigger if the tests fail on someone's machine.
  4. The code that checks to see if we need to change rings...
    if L is not P:
      B = B.change_ring(P)
    if K is not P:
      A = self.change_ring(P)
    
    may be redundant, since the first thing change_ring does is check to see if it can do nothing. However, avoiding the method call entirely may improve performance a tiny bit.
  5. The docstrings for the dense matrices use lowercase x and b, unlike the generic classes.

Miscellaneous:

  1. The "\ZZ" doesn't look right to me in the HTML documentation, but this is a wider problem and not just in your changes. It renders as a bright red, literal "\ZZ" for me, which looks especially weird when \ZZ/n\ZZ renders just like that.

comment:15 Changed 16 months ago by git

  • Commit changed from 007deac61554cbc1aefcda0f8625ebd5a758d20d to 828d9db334d043a7fefe13ca854203d62b5d7ca5

Branch pushed to git repo; I updated commit sha1. New commits:

828d9db12406: implement reviewer suggestions

comment:16 in reply to: ↑ 14 Changed 16 months ago by gh-mwageringel

  • Status changed from needs_work to needs_review

Thank you for the detailed feedback. I have implemented most of the suggestions.

Replying to mjo:

  1. The new error message "ValueError?: number of rows of self must equal degree of B" uses the wrong variable name (capital B versus b), and doesn't make sense when b is a matrix (and has no degree).

I have replaced B by right-hand side to avoid the inconsistent capitalization. For matrices, the error message actually refers to the number of rows rather than the degree.

  1. The recursive call to e.g. solve_right in
    return A.solve_right(B, check=check)
    
    is elegant, but is probably inefficient because Python doesn't make much effort to optimize recursive calls (see e.g. http://neopythonic.blogspot.com/2009/04/final-words-on-tail-calls.html). In this case, I think you can set A = self at the top of the function, and just let that case fall through. The rest of the function would have to be updated to use A instead of self, though.

The base ring of A is different from the one of self, so the type of A can be different as well. The class Matrix_double_dense, for instance, overwrites solve_right to set the keyword check to False, so it is necessary to call solve_right on the new matrix A in order to evaluate the implemention of solve_right that corresponds to the type of A. For example, this can happen in:

matrix.random(ZZ, 3).solve_right((RDF^3).random_element())

Also, this is only about saving one or two method calls, which is probably negligible compared to solving a linear system. Tail call optimization is more relevant in case of recursive loops, where the additional overhead compared to while-loops becomes apparent.

  1. In the example "Over inexact rings, the output of this function may not be an exact solution...", we could compare the answer to the closed-form solution of least squares, to convince the reader that it really gives a least-squares solution:

Very good idea.

  1. I think a tolerance of 1e-14 is probably pushing our luck -- Sage is built on a lot of unusual hardware. NumPy? itself uses something a bit more clever, but for example, its default absolute tolerance is 1e-8. Ultimately, this probably doesn't matter, since we can always make the tolerance bigger if the tests fail on someone's machine.

I would prefer to keep the tolerance small for now, as one could consider it a bug if the error gets too large. If indeed the tests fail on some machines, we can reconsider.

  1. The code that checks to see if we need to change rings...
    if L is not P:
      B = B.change_ring(P)
    if K is not P:
      A = self.change_ring(P)
    
    may be redundant, since the first thing change_ring does is check to see if it can do nothing. However, avoiding the method call entirely may improve performance a tiny bit.

Only for immutable matrices, change_ring with identical rings is a no-op – otherwise, a copy is returned.

  1. The docstrings for the dense matrices use lowercase x and b, unlike the generic classes.

The argument for Matrix_double_dense.solve_right is called b rather than B. Changing it would not strictly be backward compatible, so I avoided renaming it, despite the inconsistency with the super method.

  1. The "\ZZ" doesn't look right to me in the HTML documentation, but this is a wider problem and not just in your changes. It renders as a bright red, literal "\ZZ" for me, which looks especially weird when \ZZ/n\ZZ renders just like that.

I am not sure what could be causing this problem. On my end, it renders correctly as a bold Z.

comment:17 follow-up: Changed 16 months ago by mjo

I am not sure what could be causing this problem. On my end, it renders correctly as a bold Z.

This is probably a problem with my machine(s). I have a long history of HTML documentation problems.

Also, this is only about saving one or two method calls, which is probably negligible compared to solving a linear system.

Sure, but without the TCO it keeps another stack frame around for the entire method call. This isn't a big deal in 2020, but since solving O(10000) linear systems is common even in "small" homework exercises, we might as well take the free lunch. It should even simplify the code, now that I have remembered that self isn't magic in python and is just a normal local variable. And another tiny improvement would make this slightly more beneficial:

First, how about we move the check on the dimensions of B to the top? There's no point in finding a common parent ring if the method is going to raise an error anyway. That is,

b_is_vec = is_Vector(B)
if b_is_vec:
    if self.nrows() != B.degree():
        raise ValueError("...")
else:
    if self.nrows() != B.nrows():
        raise ValueError("...")

(rest of method)

Then, my idea was to replace

A = self.change_ring(P)
return A.solve_right(B, check=check)

by

self = self.change_ring(P)

Afterwards, the method proceeds normally, but with a matrix self that has a compatible ring. That's what the recursive call would ultimately have done anyway.

That will skip the extra method call, skip the checks on the dimensions of B (if you move them to the top), and skip a redundant check for the compatibility of base rings. Since it deletes a line and gives us a tiny performance benefit, we might as well take it, right?

comment:18 Changed 16 months ago by git

  • Commit changed from 828d9db334d043a7fefe13ca854203d62b5d7ca5 to 6d0bbf6fe6c91fe929e305c9761d249b94f517b9

Branch pushed to git repo; I updated commit sha1. New commits:

6d0bbf612406: add a doctest to check consistency of coercion in solve_right

comment:19 in reply to: ↑ 17 Changed 16 months ago by gh-mwageringel

Replying to mjo:

Then, my idea was to replace

A = self.change_ring(P)
return A.solve_right(B, check=check)

by

self = self.change_ring(P)

Afterwards, the method proceeds normally, but with a matrix self that has a compatible ring. That's what the recursive call would ultimately have done anyway.

This is not equivalent, as it conflicts with subclasses overwriting solve_right. I have added a doctest where this makes a difference:

sage: A = matrix(ZZ, [[1, 2, 3], [2, 0, 2], [3, 2, 5]])
sage: b = vector(RDF, [1, 1, 1])

Now the following should be equal, but this would fail with the changes you suggested:

sage: A.solve_right(b) == A.change_ring(RDF).solve_right(b)
...
True

Here, A is of type Matrix_integer_dense, but A.change_ring(RDF) is of type Matrix_double_dense. The latter class overwrites solve_right, so we must call Matrix_double_dense.solve_right of the subclass after changing rings, instead of continuing execute the code from the super class.

The coercion code can be thought of as a normalization procedure of the arguments before executing the main part of the method. It could be refactored into a decorator for matrix/vector operations, very similar to coerce_binop. Note that the coerce_binop decorator also uses method resolution to call the correct method implementation after coercing the arguments, rather than invoking the given method directly.

            a, b = coercion_model.canonical_coercion(self, other)
            if a is self:
                return method(a, b, *args, **kwargs)
            else:
                return getattr(a, method.__name__)(b, *args, **kwargs)

comment:20 Changed 16 months ago by mjo

All other concerns aside, we should now move the self.rank() computation up before the coercion. Rank computations in Sage aren't numerically stable, and while that's a bug, it's been around for 10+ years. For example:

age: A = matrix(ZZ, [ [3,5], [30,50] ])
sage: A.rank()
1
sage: C = A.change_ring(RDF)
sage: C.rank()
2
sage: A == C
True

So, we should use the most exact ring possible to compute the rank of self in

if not self.is_square() or ((check or K.is_exact())
                            and self.rank() != self.nrows()):

Otherwise, we risk choosing the wrong branch.

I also think we should change...

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)

to

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)
    B = B.change_ring(K)

for consistency, so that we always call either _solve_right_general or _solve_right_nonsingular_square with arguments that have the same base ring.

Finally, I get where you're going with this...

sage: A.solve_right(b) == A.change_ring(RDF).solve_right(b)
True

but the reason it fails with my suggested changes is because one side produces the correct answer with my suggested changes. In your branch, both sides produce the same wrong answer =)

Unless we can make it produce the correct answer starting from RDF, I don't think we should add a doctest that ensures wrong answers in both cases! What's happening is that the matrix

A = matrix(ZZ, [[1, 2, 3], [2, 0, 2], [3, 2, 5]])

is singular (by inspection), so it should wind up using the generic solve method that uses least squares. However, in your branch, the subclass method calls the superclass method with check=False, so when we get to the line

if not self.is_square() or ((check or K.is_exact())
                            and self.rank() != self.nrows()):

the "if" fails, and we try to solve the system as if A was nonsingular, and it isn't. So, you get a junk answer, whereas with least squares, you get a decent one. With my suggestion, the subclass method is skipped entirely, causing check=True by default. This is wrong in that we shouldn't bypass subclass methods if they exist, but I'm not 100% sure that it's wrong in this case when the sole purpose of the subclass method is to set check=False. The more I think about this, the more nonsensical it seems to me to have a check parameter in the first place when least squares can be used as in the MATLAB backslash operator.

I'm not expecting you to fix a decade old broken API... I'm just trying to be thorough and don't want to make things worse. What's the reasoning behind setting check=False in the subclass method, and behind adding the check variable to this condition?

if not self.is_square() or ((check or K.is_exact())
                            and self.rank() != self.nrows()):

A priori, it's weird to have a check parameter change the answer, and setting it to false in the subclass method means that the generic solve will never get used for square RDF matrices, even if they're singular(ish). I'm only guessing at the implementation details in scipy, but it looks like the least squares approach is better for matrices that are almost singular. Particularly if we compute the rank before coercing, I think we should trust the fact that (rank < rows) and use least squares in that case.

comment:21 follow-up: Changed 16 months ago by mjo

Follow-up: I do see that the documentation for the subclass method promises different things for square and non-square matrices, but I don't think that's consistent any longer, for two reasons. The first is simple: crashing on a singular square matrix made sense before least-squares was implemented, but that's obsolete now. The second is that if I start with

A = matrix(ZZ, ...)
b = vector(RDF, ...)

and then try to figure out what's going to happen from

A.solve_right?

I'm not going to see the subclass documentation, so we shouldn't silently convert to a subclass and then implement the subclass method to do something different.

comment:22 in reply to: ↑ 21 Changed 16 months ago by mjo

Replying to mjo:

Follow-up: I do see that the documentation for the subclass method promises different things for square and non-square matrices, but I don't think that's consistent any longer

Scratch that, I see that the generic matrix code disagreed with its documentation all along. The MATLAB backslash operator doesn't check the rank, it only looks at the dimensions and does a QR solve regardless of the rank for a non-square matrix. It insists that square systems actually be solvable. So I would expect an error from A = matrix(ZZ, [[1, 2, 3], [2, 0, 2], [3, 2, 5]]), if we really implemented the backslash properly. But we don't, because the check parameter directly contradicts it. This is a nightmare! I'm going to play around a bit more but then probably throw up my hands in frustration and stop wasting your time.

comment:23 Changed 16 months ago by mjo

Ok, I finally have my head wrapped around this.

The behavior promised by the solve_right() method is just fundamentally incompatible with inexact rings, and also with the promise that it makes to implement the MATLAB backslash operator. The existing code uses the "general" solver for rank-deficient matrices, which isn't how the MATLAB backslash works. That behavior is documented and doctested throughout sage, so it's difficult to change.

In hindsight, the reason for setting check=False and adding the (check or K.is_exact()) condition is clear: it let's you implement the backslash operator correctly for RDF and CDF matrices by hacking the superclass method to choose the "wrong" branch. Normally I would say this is a horrendous design, but you can only work with what you're given.

I made a post to the -devel list asking about the behavior of solve_right, but I don't expect to get an easy answer (if this was fun/easy to fix, it wouldn't look like this). It's apparent that to truly make everything consistent, we will have to change either the superclass documentation or its implementation. And changing the implementation (for ALL matrices) at this point is more than either of us have bargained for.

So, here's the best solution that I was able to come up with. Instead of changing the superclass implementation, we change the documentation to admit that (a) we don't really implement the backslash the way MATLAB does, specifically with respect to rank-deficient matrices; and (b) the check parameter will not be used over inexact rings. Item (a) is a given in the superclass, and I think (b) is justified because all solutions are approximate over inexact rings. Propagating item (a) into the matrix_double_dense subclass is a "breaking" change, but one that seems only to improve the answers we get.

Here's the summary:

  1. At the beginning of Matrix.solve_right, we set check=False whenever the ring is inexact.
  2. Since check is now False for RDF/CDF matrices, there's no reason for the subclass methods to force check=False, and therefore no reason for the subclass methods at all! We can drop them and move the documentation/examples into the superclass, to eliminate all of the weirdness that results from having two different sets of documentation on classes that silently convert into one another.
  3. The coercion and everything else proceeds basically as you had it, except without calling solve_right again, because now there is no subclass method to call.
  4. Instead of hacking the "if" statement to use the square-nonsingular solve for all square RDF/CDF matrices, we leave the logic alone and let sage choose to use least squares in that case. This breaks some doctests, but in reality it improves many of the answers. Some LinAlgErrors? become near-solutions, and some ill-conditioned systems produce much better solutions this way.

This doesn't attempt to implement the backslash exactly, but it does solve systems in a consistent(ish) manner regardless of whether or not the ring is exact. It also passes all of the doctests that don't refer specifically to the behavior that was changed on purpose.

I've pushed a branch so you can see my proof-of-concept:

https://git.sagemath.org/sage.git/diff?id=d59671100187b42c112f9a89c058dc6cbd1e4543

Please, take a look and see what you think. If you think it's a better approach, we can clean it up. If not... I tried. I'll give this a positive review if you prefer the original approach.

comment:24 Changed 16 months ago by mjo

That branch is u/mjo/ticket/12406 if you'd rather look at it locally, by the way.

comment:25 follow-up: Changed 16 months ago by gh-mwageringel

As I mentioned on sage-devel, I would prefer an implementation for inexact rings where square matrices and non-square matrices are treated differently. Contrary to what I said on devel though, I now realize that this is not consistent with exact rings: if a square matrix is singular to machine precision, an error is raised regardless of whether a solution exists or not, whereas over exact rings an error is raised only if a solution does not exist.

Since we cannot in general detect such a degenerate situation over inexact rings however, this seems like the price we have to pay for inexact computations. Still, it seems important to get a predictable answer (the algorithm should not depend on numerical noise), so I would prefer to compute a least-squares solution only in the non-square case. This agrees with Matlab as well as SciPy treating the square situation differently. Getting a warning about a square matrix being singular can be a useful answer too, but it needs to be documented that this is different than in the exact case. Since the method is effectively broken for inexact rings currently, I do not expect these changes to cause many problems.

I agree that the check keyword should be ignored over inexact rings, and that the way I used check was bad. I tried to circumvent the rank computation for RDF/CDF by that, because computing the rank cannot work over inexact rings. I would have just written the if-statement as

if not self.is_square() or K.is_exact() and self.rank() != self.nrows():

if it was not for a test failure in src/sage/quivers/morphism.py involving RR, an inexact ring different from RDF/CDF. This is something that #13932 will hopefully resolve eventually.

The more I think about it though, it seems to me that the rank computation should be avoided even for exact rings. Computing the rank will usually involve some sort of echelonization I suppose which is already a major part of the work required to find a solution of a square system. Instead, we should remove the rank computation from the generic solve_right method entirely and leave it to the implementation of _solve_right_nonsingular_square to decide (possibly using the check_rank keyword) whether computing the rank is necessary or appropriate, depending on the ring or type of the matrix as well as on the algorithm used for solving. If appropriate (e.g. for an exact ring), _solve_right_nonsingular_square should signal a specific error in case of a singular matrix that would be caught in solve_right, so that the _solve_right_general can be tried instead. This gives all the control to the class implementing _solve_right_nonsingular_square as to which kind of solution will be computed.


A couple of other thoughts:

I still think it would be cleaner to call solve_right again after coercion, even if no subclass overwrites it, as I do not think there is much practical relevance for saving a stack frame. The main reason for calling solve_right again is that this is closer to how coercion is usually used in Sage (which is also relevant for the next point), but I will not insist on this if indeed we remove the subclass method.


Replying to mjo:

All other concerns aside, we should now move the self.rank() computation up before the coercion. Rank computations in Sage aren't numerically stable

Indeed, this would be better for the rank computation, but it is different from how the coercion system usually works in Sage, so could lead to unexpected results. To the best of my knowledge, usually, the coercion system first searches for a common parent of two operands and only then invokes the appropriate binary operation on them. This is why the doctest

A.solve_right(b) == A.change_ring(RDF).solve_right(b)

is justified, as it merely asserts that the answer is the same in either case. Because of that, I think it would be better to avoid using information obtained from the old parent, even if this is counter-intuitive. If we move the rank computation to _solve_right_nonsingular_square, this does not matter anyway.


I also think we should change...

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)

to

if K not in _Fields:
    K = K.fraction_field()
    self = self.change_ring(K)
    B = B.change_ring(K)

for consistency, so that we always call either _solve_right_general or _solve_right_nonsingular_square with arguments that have the same base ring.

I agree. Ideally, it should be part of the coercion code (except in the non-integral-domain case), so changing rings happens only once.


Replying to mjo:

The second is that if I start with

A = matrix(ZZ, ...)
b = vector(RDF, ...)

and then try to figure out what's going to happen from

A.solve_right?

I'm not going to see the subclass documentation, so we shouldn't silently convert to a subclass and then implement the subclass method to do something different.

Indeed, this is not optimal. The superclass documentation should state that coercion is used on the arguments, and what the subclass method does (if it exists) should be consistent with what the superclass documentation says.


There are two test failures with your branch.

File "src/sage/matrix/matrix_complex_ball_dense.pyx", line 553, in sage.matrix.matrix_complex_ball_dense.Matrix_complex_ball_dense._solve_right_nonsingular_square
Failed example:
    matrix(CBF, 2, 2, 0) \ vector([-1, 1])
Expected:
    Traceback (most recent call last):
    ...
    ValueError: matrix equation has no solutions
Got:
    (0, 0)

I am not sure how solve_right is implemented for CBF, but it is another case we may need to consider. This is interesting because, despite being inexact, for some systems over CBF it may be possible to decide with certainty whether or not a solution exists.

The other failure is the peculiar doctest in src/sage/quivers/morphism.py. It is caused by lifting a nontrivial element along the zero-morphism of a vector space over RR. A preimage does not exist, which cannot reliably tested over RR, so in this case I think the doctest is wrong in using RR rather than an exact ring like QQ as everywhere else in that file. Since lift is implemented using solve_right, this means that, if solve_right produces a poor approximation over an inexact ring, the result of lift will be just as bad.


I will try to see if I can make changes as outlined above if you agree.

comment:26 in reply to: ↑ 25 Changed 16 months ago by mjo

Replying to gh-mwageringel:

There are two test failures with your branch... in this case I think the doctest is wrong in using RR rather than an exact ring like QQ as everywhere else in that file.

Indeed, and I think the CBF doctest is wrong as well. It uses an indirect doctest instead of calling _solve_right_nonsingular_square, so it's one of those cases where a change in the superclass behavior means that the test isn't checking what it's supposed to check. But these may turn out to be irrelevant so I don't want to worry too much about it.

The more I think about it though, it seems to me that the rank computation should be avoided even for exact rings. Computing the rank will usually involve some sort of echelonization I suppose which is already a major part of the work required to find a solution of a square system.

Excellent point! I'm embarrassed that I stared at that rank() call so long without realizing it.

In light of that, I agree with everything else you said. For inexact rings, we should go with the MATLAB behavior, even if it disagrees with the exact-ring behavior, because you can't know if the system is rank-deficient without first solving the problem. For backwards compatibility, we have to keep the exact-ring behavior roughly as it is, and document the difference. If we can try the square-nonsingular method first and have it fall back to the generic solve when the matrix isn't full rank, then that sounds reasonable to me.

In the long term, I think it would be a better design to decree that subclasses should not override solve_right, and should instead only override _solve_right_nonsingular_square and _solve_right_general. That would still be in the spirit of the coercion framework (modulo the non-integral-domain stanza), in that it coerces the arguments and then "immediately" dispatches to the appropriate subclass method. In that way we avoid people implementing weird solve_right methods in subclasses whose documentation no one will see. I won't give you a hard time about this though. You're doing all the work, so do whatever you think is best.

comment:27 Changed 15 months ago by gh-mwageringel

  • Description modified (diff)

comment:28 Changed 15 months ago by gh-mwageringel

  • Branch changed from u/gh-mwageringel/12406 to u/gh-mwageringel/12406v2
  • Commit changed from 6d0bbf6fe6c91fe929e305c9761d249b94f517b9 to 99c2a7e7046c5e621198a4b71d3cd8f67e29a22a
  • Dependencies #17405 deleted

I have updated the branch accordingly. The rank computation is now done in _solve_right_nonsingular_square only, and the backslash operator now works as in Matlab and Octave, so that a least-squares solution is only computed in the non-square case. For exact rings, these changes should not make a difference.

Additionally, I have added a doctest for the problem in #13932, which is solved by this branch since the check parameter is now ignored over inexact rings.


New commits:

d751e78Trac # 12406: document that "check" means nothing over inexact rings.
71f0c28Trac #12406: absorb RDF/CDF solve_* into the superclass methods.
9aa2eb3Trac #12406: unbreak doctests for exact rings.
988e21512406: move rank checks to _solve_right_nonsingular_square
99c2a7e12406: test that `check` is ignored over inexact rings

comment:29 Changed 15 months ago by mjo

Instead of testing the string contents of the error message, what do you think about adding a new NotFullRankError in matrix2.py to be thrown in that case? The documentation could say something like "The fact that a square system is rank-deficient sometimes only becomes apparent while attempting to solve it. The solve_left and solve_right methods defer to _solve_right_nonsingular_square for square systems, and that method raises this error if the system turns out to be singular."

comment:30 Changed 15 months ago by git

  • Commit changed from 99c2a7e7046c5e621198a4b71d3cd8f67e29a22a to c1178d91bb4e10fa3fd53ea6a50f9a93f356c78d

Branch pushed to git repo; I updated commit sha1. New commits:

c1178d912406: add NotFullRankError for _solve_right_nonsingular_square

comment:31 Changed 15 months ago by gh-mwageringel

That is a good idea. I have implemented it.

comment:32 Changed 15 months ago by mjo

  • Status changed from needs_review to positive_review

Let's get this over with =)

The only other thing I see is in the new test,

Over inexact rings, the ``check`` parameter is ignored as the result is
only an approximate solution (:trac:`13932`)::

sage: RF = RealField(52)
sage: B = matrix(RF, 2, 2, 1)
sage: A = matrix(RF, [[0.24, 1, 0], [1, 0, 0]])
sage: (A * A.solve_right(B) - B).norm() < 1e-14
True

to ensure that check=True is ignored, we also want that small norm to not actually be zero. If you feel like adding that, just go ahead and set it back to positive review without the round-trip.

Thanks for sticking with this. I've been complaining about issues like this since I was an undergrad, but I never had the guts to tackle it myself.

Last edited 15 months ago by mjo (previous) (diff)

comment:33 Changed 15 months ago by git

  • Commit changed from c1178d91bb4e10fa3fd53ea6a50f9a93f356c78d to 93cea8547c304a48347f25a4120af4cf71e2a6ef
  • Status changed from positive_review to needs_review

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

93cea8512406: test that the solution is inexact

comment:34 Changed 15 months ago by gh-mwageringel

  • Status changed from needs_review to positive_review

Thank you for the review and all the valuable input you provided.

I have added the non-zeroness check to the doctest. Setting this back to positive.

comment:35 Changed 14 months ago by vbraun

  • Branch changed from u/gh-mwageringel/12406v2 to 93cea8547c304a48347f25a4120af4cf71e2a6ef
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.