Opened 16 months ago

Last modified 2 months ago

#29226 new task

Meta-ticket: issues with division of matrices, solve_left, solve_right

Reported by: gh-mwageringel Owned by:
Priority: major Milestone: sage-9.4
Component: linear algebra Keywords:
Cc: mjo Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by gh-mwageringel)

This ticket collects several issues related to the division operation of matrices.

  • #29257 __truediv__ does not work for matrices of different dimensions:
    sage: set_random_seed(1)
    sage: B, A = matrix.random(QQ, 2, 4), matrix.random(QQ, 4, 4)
    sage: B * ~A;    # works
    sage: B / A      # fails
    ...
    TypeError: unsupported operand parent(s) for /: 'Full MatrixSpace of 2 by 4 dense matrices over Rational Field' an
    d 'Full MatrixSpace of 4 by 4 dense matrices over Rational Field'
    
    The corresponding _backslash_ operation usually works.
  • #29257 For two square matrices, the default __truediv__ implementation left * ~right computes the inverse of a matrix. This is unsuitable for numerical computations at least. A better implementation would invoke solve_left for matrices and vectors (_backslash_ already calls solve_right).
  • #12406 solve_left and solve_right should use coercion to find compatible parents
  • #12406 solve_right and solve_left over RDF/CDF only work for square matrices, even though generic matrices support non-square matrices:
    sage: set_random_seed(0)
    sage: A, B = matrix.random(QQ, 4, 3), matrix.random(QQ, 4, 2)
    sage: A.solve_right(B);                                   # works
    sage: A.change_ring(RDF).solve_right(B.change_ring(RDF))  # fails
    ...
    NotImplementedError: coefficient matrix of a system over RDF/CDF must be square, not 4 x 3
    
    This could be solved by using scipy.linalg.lstsq to find a least-squares solution.
  • #17405 solve_right and solve_left over RDF/CDF fail with matrix right hand side
  • #13932 solve_right and solve_left fail with floating point matrices (due to exact checking of the solution)
  • #28586 solve_right and solve_left fail for matrix_modn_sparse
  • #29729 improve checking the result of solve_right for some inexact rings

Change History (11)

comment:1 Changed 16 months ago by gh-mwageringel

What is a good solution for the first problem? Should there be inverse left/right actions of matrix multiplication, so that the coercion framework can be used to find these actions?

The implementation of __truediv__ needs to be overwritten for matrices in any case in order to make use of solve_left. However, solve_left itself also needs to use some form of coercion to make sure the parents of the arguments are compatible (in #17405, I implemented this in an ad-hoc way which could be copied). I am not sure whether implementing inverse actions would help with this or make things needlessly complicated.

comment:2 Changed 16 months ago by gh-mwageringel

  • Description modified (diff)

The coercion is now implemented in #12406, without the need for inverse actions. The __truediv__ operation is implemented in #29257.

comment:3 Changed 15 months ago by mjo

  • Cc mjo added

While technically we can define "division" to mean right-multiplication-by-inverse in a noncommutative ring in a mathematically consistent way, I'm not sure we should have -- particularly for matrices. I have never written A/B with matrices except by mistake, and I've never seen anyone else do it on purpose either. I would much rather get an error if I ever do it again (especially if their dimensions are mismatched) than silently get a somehow-correct-but-definitely-not-what-I-meant answer back. In other words, I think we should just kill matrix-matrix truediv. Having a pedantically correct definition that does the wrong thing 100% of the time isn't helping anything.

comment:4 Changed 15 months ago by gh-mwageringel

Well, the division operator for matrices has existed in Sage for years, so removing it at this point would be quite unfortunate. In #29257, I am just trying to improve the implementation of this matrix operator, to make it as good as it can be.

Let me explain why this operator is important for matrices. Here is an example of an actual computation I did a few days ago

S \ U.H * A * U / S

where U,S,A are matrices. In Sage, this currently only works if U is square, which is unnecessarily restrictive, so you are forced to write

~S * U.H * A * U * ~S

This comes with its own problems because it is slower and numerically less accurate due to the inversion of a matrix. A better way to compute this would be

S.solve_left(S.solve_right(U.H) * A * U)

Writing it like this is rather backwards and difficult to remember. After #29257, this will be equivalent to the first example, which is a much easier syntax as the binary operators compose nicely.

The slash and backslash operators are very standard notation in Matlab, so many people will be familiar with it. After all, it is Sage's mission to be a viable alternative to Matlab among others. Considering all our work on #12406 to improve the backslash operator, I am a bit surprised by your concerns about the forward slash operator. I have difficulties to see how this is doing the wrong thing 100% of the time.

As for the numerical issues, here is an example from the Matlab documentation that shows that using the slash operator to solve A*x = b is typically faster and more accurate than computing the matrix inverse:

sage: import numpy
sage: n = 500
sage: Q = matrix.random(RDF, n).QR()[0]
sage: d = numpy.logspace(0, -10, n)
sage: x = matrix.random(RDF, n, 1)
sage: A = Q * matrix.diagonal(CDF, d) * Q.H
sage: b = A * x
sage: y = ~A * b          # first solution
sage: z = A \ b           # second solution

sage: %timeit y = ~A * b  # slower
10 loops, best of 5: 29.6 ms per loop

sage: %timeit z = A \ b   # faster
100 loops, best of 5: 13.7 ms per loop

sage: (A * y - b).norm()  # large error
1.2981997570563244e-07

sage: (A * z - b).norm()  # small error
1.7440675515723775e-15

comment:5 Changed 15 months ago by mjo

I suspect it's a religious matter, don't take me too seriously. Of your two examples,

S \ U.H * A * U / S
S.solve_left(S.solve_right(U.H) * A * U)

I would greatly prefer the second, because I can tell what it does. The first depends on me knowing what matrix division does, how it associates, and how it's implemented. Whereas the second example is clearly just solving two systems, and I can tell you which two systems those are right away.

What it comes down to for me is that, unless you're a heavy MATLAB user, division of matrices is basically a nonsense operation. I would never think to divide two matrices, so if A / B appears in my own code, it's a mistake. And I'm just guessing based on personal experience (again, to be taken with a grain of salt) that most users of Sage are the same.

This is reminiscent of the Foldable/Travserable? proposal in Haskell. If you open up a Haskell prompt and type length (1,2), you'll get back 1 as the answer. Why? Becuase if you invest 20 years of education and ultimately wind up studying category theory, you'll see that there's a perfectly canonical way to define "fold" on the category of tuples and if you implement length in a perfectly canonical way, well, 1 is the only possible length that (1,2) could have! This is of course ridiculous: the job of the compiler is to prevent me from writing length (1,2) when I mean length [1,2], and enabling length to work on a data type that it's not going to act sanely on prevents it from doing so. I see matrix division the same way.

The fact that MATLAB implements it is a good argument, but MATLAB code is also impossible to debug for this same reason. If my algorithm has a bug and tries to multiply a PNG image by a matrix of strings, MATLAB might silently return a column vector of TCP/IP packets. Tiny errors thus propagate, and make it impossible to localize the cause of a bug.

Sage (via python) isn't even statically typed, so it's easier to make the argument here than in Haskell that if the data type supports __truediv__, then we should use it. It's also very possible that everyone else just plain disagrees with me, and that's fine too. I'm just throwing in my two cents: it's easier for me to write big programs if the compiler keeps me from doing things I don't mean to do. And for me, dividing by a matrix is something I never mean to do.

comment:6 Changed 13 months ago by mkoeppe

  • Milestone changed from sage-9.1 to sage-9.2

comment:7 Changed 13 months ago by gh-mwageringel

  • Description modified (diff)

comment:8 Changed 10 months ago by gh-kliem

@mjo: I positively reviewed #29257, because I think it is an improvement in making right and left division behave consistently. The use of left division as an alias for solve right is even advertised in https://doc.sagemath.org/html/en/tutorial/tour_linalg.html.

Now, making them behave consistently can also mean that they are both undefined or both raise a warning. Maybe a longterm warning as "it is assumed that you meant to use solve left; it is recommended to use solve_left" would work, because it helps users to find the correct things coming from MATLAB. Whoever wants to, can just use it, but we would consider it unclean and not accept it e.g. for sage's code.

comment:9 Changed 10 months ago by mjo

(I don't really expect to win this argument.)

Having left- and right-division act consistently does makes sense if you think of the backslash operator as division-with-its-arguments-reversed. But backslash has two killer features: it already doesn't work in library code, and you can't use it to divide scalars, so there's no risk of concealing a type error.

comment:10 Changed 8 months ago by mkoeppe

  • Milestone changed from sage-9.2 to sage-9.3

comment:11 Changed 2 months ago by mkoeppe

  • Milestone changed from sage-9.3 to sage-9.4
Note: See TracTickets for help on using tickets.