Opened 9 years ago

Last modified 8 days ago

#13674 positive_review enhancement

Cholesky decomposition for sparse matrices

Reported by: r.gaia.cs Owned by: gh-bollu
Priority: major Milestone: sage-9.5
Component: linear algebra Keywords: matrix, decomposition, cholesky, sparse
Cc: mjo Merged in:
Authors: Siddharth Bhat, Michael Orlitzky Reviewers: Dima Pasechnik
Report Upstream: N/A Work issues:
Branch: u/mjo/ticket/13674 (Commits, GitHub, GitLab) Commit: 954b9ba39b23216f08e4e13190e3762d161e61ec
Dependencies: Stopgaps:

Status badges

Description

The Cholesky decomposition are implemented in the file source:sage/matrix/matrix2.pyx for some subfield of the algebraic numbers.

In this implementation the base ring must be exact or, for numerical work, a matrix with a base ring of RDF or CDF must be used.

For the numerical work it's used the Cholesky decomposition implemented in source:sage/matrix/matrix_double_dense.pyx and because of this a error raised when try to compute the numerical Cholesky decomposition of a sparse matrix.

sage: A = matrix(QQ, [[1, 1], [1, 2]]) 
sage: A.cholesky()                    
[1 0]
[1 1]
sage: A = matrix(QQ, [[1, 1], [1, 2]], sparse=True)
sage: A.cholesky()                                 
[1 0]
[1 1]
sage: A = matrix(RDF, [[1, 1], [1, 2]], sparse=True)  
sage: A = matrix(RDF, [[1, 1], [1, 2]])             
sage: A.cholesky()                                  
[1.0 0.0]
[1.0 1.0]
sage: A = matrix(RDF, [[1, 1], [1, 2]], sparse=True)
sage: A.cholesky()                                  
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/raniere/opt/sage/devel/sage-rcm/sage/matrix/<ipython console> in <module>()

/home/raniere/opt/sage/local/lib/python2.7/site-packages/sage/matrix/matrix2.so in sage.matrix.matrix2.Matrix.cholesky (sage/matrix/matrix2.c:47738)()
   9867             if not self.base_ring().is_exact():
   9868                 msg = 'base ring of the matrix must be exact, not {0}'
-> 9869                 raise TypeError(msg.format(self.base_ring()))
   9870             try:
   9871                 posdef = self.is_positive_definite()

TypeError: base ring of the matrix must be exact, not Real Double Field

For this solve this ticket the numerical sparce Cholesky decompostion need to be implemented.

For more information about this topic see https://groups.google.com/forum/?fromgroups=#!topic/sage-support/do55Fayur6U.

Change History (21)

comment:1 Changed 8 years ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:2 Changed 8 years ago by vbraun_spam

  • Milestone changed from sage-6.1 to sage-6.2

comment:3 Changed 8 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:4 Changed 7 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:5 Changed 11 months ago by gh-bollu

  • Owner changed from jason, was to gh-bollu

comment:6 Changed 11 months ago by gh-bollu

  • Authors set to Siddharth Bhat
  • Branch set to u/gh-bollu/jan-24-cholesky-for-sparse-numerical-matrices
  • Commit set to 4351684e89139ada783572c9929aa8a54d1890c0

New commits:

4351684Trac #13674: implement Chokesly factorization for sparse numerical matrices

comment:7 Changed 11 months ago by gh-bollu

  • Status changed from new to needs_review

comment:8 Changed 11 months ago by gh-bollu

Things I don't understand very well:

  • What's the correct solution for dense matrices? while the above code works, there should be a fast way to supply an entire dense matrix to cvxopt? The best idea I had was to use cvxopt.matrix(sagemat.to_numpy()); perhaps there are faster methods I am unaware of?
  • Similarly, when building the sparse matrix, I coerce the matrix entries into a float. What is the correct way to hand the matrix entries off to cvxopt?
  • On the google groups thread(https://groups.google.com/g/cvxopt/c/xQ-lR9ESijg/discussion), there was some discussion about also getting a good permutation for cholesky. I'd like to submit this patch next once I figure the API out. Any pointers to this would be appreciated!

comment:9 Changed 10 months ago by dimpase

  • Milestone changed from sage-6.4 to sage-9.3
  • Reviewers set to Dima Pasechnik

comment:10 Changed 10 months ago by mjo

  • Cc mjo added

Ohhkay. So, the title of this ticket is a bit misleading. It's not a cholesky for sparse matrices that's missing, but rather a cholesky decomposition for matrices over inexact rings. For example,

sage: A = matrix(RR, [[1, 1], [1, 2]])                                          
sage: A.cholesky()
...
TypeError: base ring of the matrix must be exact, not Real Field with 53 bits of precision

The use of RealField(100) or any similar ring produces the same result. The problem only appears related to sparse matrices because there is a special case implemented for dense matrices over RDF, but not sparse matrices over RDF. Thus, over RDF, the sparse implementation is indeed just "missing" (it could use the dense implementation without hurting anything).

But regardless, the real problem to be solved here is that we need a numerically-stable cholesky factorization that works for most inexact rings. That implementation can then be used (in the meantime) for both the dense and sparse methods, until someone decides to come along and speed it up in the sparse case. I've essentially already done this on https://trac.sagemath.org/ticket/10332, which adds a numerically-stable block_ldlt() method for all Hermitian matrices. When your matrix is positive-definite, the block-LDLT factorization can be turned into a cholesky factorization after taking the square root of D.

So my suggestion for this ticket is to use block_ldlt() from the other ticket instead of adding a special case that relies on cvxopt. By using block_ldlt(), you allow the cholesky() method to work on fields like RR that cvxopt doesn't know about. It should also simplify the code a little bit, at the price of some administrative overhead: you'll have to,

  1. Rebase your git branch onto u/mjo/ticket/10332
  2. Make this ticket depend on #10332
  3. Update the patch to use block_ldlt() instead of calling cvxopt.

I think (3) is pretty self-explanatory after reading the docs for block_ldlt(), but if not, I'm happy to help. E.g.

sage: A = matrix(QQ, [[9, 0, 3], [0, 1, 0], [3, 0, 25/4]])                      
sage: A.is_positive_definite()                                                  
True
sage: P,L,D = A.block_ldlt()                                                    
sage: L*D.apply_map(sqrt)                                                       
[           3            0            0]
[           0            1            0]
[           1            0 1/2*sqrt(21)]
sage: A.cholesky()                                                              
[                 3                  0                  0]
[                 0                  1                  0]
[                 1                  0 2.291287847477920?]

You might be able to save a few microseconds by using the internal _block_ldlt() method instead of the nice public one as well.

comment:11 Changed 10 months ago by gh-bollu

Thanks a lot for the link! My only concern is with that of sparseness; the reason I choose to jump hoops with cvxopt is for the performance wins for *large* sparse matrices, that I get from running discrete differential geometry algorithms such as geodesics in heat: https://arxiv.org/abs/1204.6216

I'll be glad to benchmark the code, and see which method is faster; I think the route you propose is performant for dense inexact matrices for sure. I'm unsure about the sparse case. Could you please shed some light on the performance characteristics of the linked implementation?

comment:12 Changed 10 months ago by mjo

block_ldlt() will be pretty slow on large sparse matrices since the implementation scans the matrix (without regard for sparsity) looking for pivots. It's main benefit (with respect to cholmod) is that it would work for other fields, like sparse matrices over RR.

If you need the performance for sparse matrices over RDF then something like cholmod is indeed the way to go. I'm surprised we don't have a special matrix subclass for sparse RDF matrices already; my first attempt would be to override cholesky() there, similar to how cholesky() is overridden in matrix_double_dense.pyx for dense matrices.

comment:13 follow-ups: Changed 10 months ago by dimpase

Indeed, for sparse matrices over RDF the cholmod interface of cvxopt is probably as fast as it can get (precision is of course another question).

In general, I'd be looking at wrapping up arb's cholesky and LDLs, as something more robust than a naive attempt to implement this stuff.

comment:14 in reply to: ↑ 13 Changed 10 months ago by mjo

Now that I know you want it to be fast (as opposed to "just not crash"), here's my revised advice:

  1. Add a new subclass for sparse real double matrices in sage/matrix/matrix_real_double_sparse.pyx, and override only the cholesky() method there, with an implementation that uses cholmod.
  2. Add a case to src/sage/matrix/matrix_space.py that tells the matrix constructor to use your new subclass when the ring is RDF and sparse=True. Something like,
    if R is sage.rings.real_double.RDF:
        from . import matrix_real_double_sparse
        return matrix_real_double_sparse.Matrix_real_double_sparse
    
  3. I'll handle the other inexact rings whenever #10332 gets merged with a generic slow implementation based on block_ldlt().

The subclass approach will avoid using cholmod where it's inappropriate. If you're not familiar with cython, the syntax isn't that much different than plain python. Here's an example pyx and pxd file that creates a subclass.

src/sage/matrix/matrix_real_double_sparse.pxd:

from .matrix_generic_sparse cimport Matrix_generic_sparse

cdef class Matrix_real_double_sparse(Matrix_generic_sparse):
    pass

src/sage/matrix/matrix_real_double_sparse.pyx:

from .matrix_generic_sparse cimport Matrix_generic_sparse

cdef class Matrix_real_double_sparse(Matrix_generic_sparse):
    def cholesky(self):
        print("DEBUG: subclass method")
        return None

comment:15 in reply to: ↑ 13 Changed 10 months ago by mjo

Replying to dimpase:

In general, I'd be looking at wrapping up arb's cholesky and LDLs, as something more robust than a naive attempt to implement this stuff.

For things like cholesky over RR this would likely be better than a naive implementation based on block_ldlt(), but the big problem I was solving by reimplementing block-LDLT was to gain a factorization that works on indefinite matrices.

comment:16 Changed 9 months ago by mkoeppe

  • Milestone changed from sage-9.3 to sage-9.4

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

comment:17 Changed 7 months ago by mjo

Ticket #31619 allows cholesky() and is_positive_definite() to work over inexact rings as promised, but it will be slower than necessary for sparse RDF matrices. A matrix subclass that delegates to cholmod() in that case is the last missing piece.

comment:18 Changed 5 months ago by mkoeppe

  • Milestone changed from sage-9.4 to sage-9.5

Setting a new milestone for this ticket based on a cursory review.

comment:19 Changed 2 weeks ago by mjo

  • Status changed from needs_review to needs_work

cvxopt is now a pseudo-optional package, but we should still be able to use the approach in comment:14. We can super() if cvxopt isn't available.

comment:20 Changed 13 days ago by mjo

  • Authors changed from Siddharth Bhat to Siddharth Bhat, Michael Orlitzky
  • Branch changed from u/gh-bollu/jan-24-cholesky-for-sparse-numerical-matrices to u/mjo/ticket/13674
  • Commit changed from 4351684e89139ada783572c9929aa8a54d1890c0 to 954b9ba39b23216f08e4e13190e3762d161e61ec
  • Status changed from needs_work to needs_review

This should work for both RDF and CDF, but it would be nice to have some more serious examples for test cases. The cvxopt interface is a little sketchy so I'd like to be sure that we're using it "correctly," insofar as is possible for an undocumented interface.

comment:21 Changed 8 days ago by dimpase

  • Status changed from needs_review to positive_review

lgtm

Note: See TracTickets for help on using tickets.