Opened 4 years ago

Closed 3 years ago

#21303 closed defect (fixed)

Make hecke operators not blow up the memory

Reported by: kartikv Owned by:
Priority: major Milestone: sage-7.4
Component: modular forms Keywords: eclib, modular symbols, hecke operators
Cc: cremona, kedlaya Merged in:
Authors: Kiran Kedlaya Reviewers: David Roe
Report Upstream: None of the above - read trac for reasoning. Work issues:
Branch: 6c3bac5 (Commits) Commit: 6c3bac591bde14d3bd23bb4f983af24a7b8077b6
Dependencies: Stopgaps:

Description (last modified by kartikv)

Constructing Hecke operators acting on modular symbols can blow up the memory usage, though it doesn't have to. The eclib code has sparse and dense options for generating Hecke operators, but the sage caller only calls the dense options. The simple solution is to modify the sage caller to add a flag which specifies to use the sparse options: perhaps a better solution would be to intuit whether the resulting matrix is likely to be sparse and do something intelligent.

Change History (34)

comment:1 Changed 4 years ago by kartikv

  • Description modified (diff)

comment:2 Changed 4 years ago by kedlaya

The sage_matrix_over_ZZ method of internal Matrix class in the eclib wrapper does have a flag sparse (defaults to False), but at this point it is too late; the Hecke matrix was already created as an eclib dense matrix (i.e., a C++ array of long's). If I understand correctly, what is really needed is a flag in the hecke_matrix method that gets passed to eclib, so that the underlying C++ code switches correctly.

comment:3 Changed 4 years ago by cremona

You will also need to write a cython function to convert an eclib sparse matrix to a Sage one. William and I did this for dense matrices in about 2008. The eclib sparese matrix class is written at rather a low level for efficiency with arrays of pointers etc. Ask me if it is not clear how to extract the information from one.

The output function at lne 729 of https://github.com/JohnCremona/eclib/blob/master/libsrc/smat.cc might help; also the associated header file. I suggest that you loop through all the rows and for each one fetch all the (column, value) pairs. One thing to watch out for: if a row contains n nonzero entries then the 0'th entry in the associated col array stores n so that array has size n+1 while the values array has size n. Row and column numbers start from 1 not 0. Good luck.

comment:4 Changed 4 years ago by kedlaya

  • Branch set to u/kedlaya/make_hecke_operators_not_blow_up_the_memory
  • Commit set to bcd410a685a80decf97914b056c1058747ee785a
  • Status changed from new to needs_review

I got a rather crude version of this up and running, so here's a patch. The crude part is that I ask eclib for dense vectors one at a time, rather than asking for a whole sparse matrix; but this was easier to wrap (I didn't need to tamper with the include files to expose any functionality) and already avoids an obvious quadratic memory step.

I tried this out on my usual compute server:

sage: time C = CremonaModularSymbols(400001, sign=-1)
CPU times: user 26.6 s, sys: 119 ms, total: 26.7 s
Wall time: 26.7 s
sage: time T2a = C.hecke_matrix(2)
CPU times: user 17.7 s, sys: 4.9 s, total: 22.6 s
Wall time: 22.6 s
sage: time T2b = T2a.sage_matrix_over_ZZ(sparse=True)
CPU times: user 26.2 s, sys: 83 ms, total: 26.2 s
Wall time: 26.3 s
sage: del T2a
sage: time T2c = C.sparse_hecke_matrix(2)
CPU times: user 1min 10s, sys: 548 ms, total: 1min 11s
Wall time: 1min 11s
sage: time T2b == T2c # True but not instant for sparse matrices
CPU times: user 11 s, sys: 650 ms, total: 11.7 s
Wall time: 11.7 s
True

The regression (48.9s to 71.7s) is annoying; the bottleneck is the line

        ans = M(entries=d)

where M is the matrix space of sparse matrices and d is the dict mapping coordinate pairs to matrix entries. I had the thought that the matrix constructor must be copying the dict (as is necessary generically) even though in this case it's not required, but I don't think that's the issue:

sage: time dd = copy(d) #too short
CPU times: user 960 ms, sys: 73 ms, total: 1.03 s
Wall time: 1.03 s
sage: time dd = deepcopy(d) #too long
CPU times: user 2min 21s, sys: 1.7 s, total: 2min 23s
Wall time: 2min 23s

On the memory side (crudely measured by watching top), the computation of T2b via T2a peaks at 5.9g whereas the computation of T2c peaks at 1.6g, so at least that went according to plan.

And in my intended use case I work modulo some prime, and this is a big win:

sage: time T2d = C.sparse_hecke_matrix(2, base_ring=GF(2))
CPU times: user 15.3 s, sys: 317 ms, total: 15.7 s
Wall time: 15.7 s
sage: time T2e = C.sparse_hecke_matrix(2, base_ring=GF(next_prime(2^30)))
CPU times: user 24.3 s, sys: 607 ms, total: 25 s
Wall time: 25 s
}

so I'm taken care of.


New commits:

8075a0cGenerate sparse Hecke matrix without dense intermediary
0bcc834Add base_ring option to sparse_hecke_matrix, use sparse constructor
bd5a8c5Docstring corrections
bcd410aMore minor edits

comment:5 Changed 4 years ago by kedlaya

I should add that I have been running this code since I posted it and am observing a memory leak somewhere. Specifically, I've been running this code snippet (for various values of i):

def zero_eigenvalue_multiplicity(m): #taken from trac #20788
    n = 0
    m1 = m
    r = m.dimensions()[0]
    while True:
        m2 = m1.extended_echelon_form(subdivide=True)
        t = r - m2.subdivisions()[0][0]
        n += t
        if t == 0 or t == r: return n
        m3 = m2.subdivision(0,0)
        m4 = m2.submatrix(0,r,r,r)
        m5 = m3 * m4.inverse()
        m1 = m5.submatrix(0,0,r-t,r-t)
        r -= t

for n in range((i-5)*1000+1, i*1000, 2):
    C = CremonaModularSymbols(n, cuspidal=True, sign=-1)
    T2 = C.sparse_hecke_matrix(2, base_ring=GF(2)).dense_matrix()
    t = (n, zero_eigenvalue_multiplicity(T2), zero_eigenvalue_multiplicity(T2+1))

and observing (via top) memory usage growing from 200-300mb for the initial stages to over 10gb later in the loop. My guess is that there is no serious memory leak in eclib, and I wouldn't suspect one in m4ri either; but I'm having trouble coming up with other options, and I'm not up to speed with valgrind enough to try checking that way.

comment:6 Changed 4 years ago by cremona

When eclib was first put into Sage back in about 2007 it was subjected to rigorous valgrinding by Michael Abshoff, which revealed several issues which were then fixed (with some effort!). But there has been considerable change since then, so I should do that again.

I have made this an Issue for eclib: https://github.com/JohnCremona/eclib/issues/18

comment:7 Changed 4 years ago by kedlaya

Update: I can reproduce the memory leak behavior by simply repeating the code from #20788 (as reported therein). That is not to say that running valgrind again on eclib wouldn't be advisable, but I am not specifically claiming a memory leak in the computation of Hecke matrices (though if you do find one, it would certainly help me for it to be fixed).

comment:8 Changed 4 years ago by kedlaya

Oh, now this is bad news.

sage: C = CremonaModularSymbols(45, cuspidal=True,sign=-1)
sage: T2a = C.hecke_matrix(2).sage_matrix_over_ZZ()
sage: print T2a
[-1 -1 -1]
[ 0  1  2]
[ 0  0 -1] 
sage: T2b = C.sparse_hecke_matrix(2)
sage: print T2b
[-1 -1  0]
[ 0  1  2]
[ 0  0 -3]

This is only an issue when cuspidal=True: the eclib function heckeop takes the projection onto the cuspidal subspace into account but heckeop_col does not.

It looks like s_heckeop does account for the cuspidal projection, so switching to that should fix this problem.

comment:9 Changed 4 years ago by cremona

Apologies for the inconsistencies. There must be quite a few methods in this homspace class which are no longer used in the main programs and whose implementations have therefore not been kept consistent or tested. Unlike Sage where every single method has its own test, I just have high-level test programs and there must be a lot of code which is no longer tested, thought of by me as obsolete but not actually deleted. The number of people who have looked at this code in any detail is very small!

comment:10 Changed 4 years ago by git

  • Commit changed from bcd410a685a80decf97914b056c1058747ee785a to 6fba176c039f8f1f7d9ca8a011f5ede9eed017c0

Branch pushed to git repo; I updated commit sha1. New commits:

6fba176Use sparse Hecke matrix from eclib, correct results for cuspidal subspace

comment:11 Changed 4 years ago by kedlaya

I went ahead and switched over to s_heckeop and this indeed resolves the issue. I am still doing the silly thing of converting a sparse vector into a dense vector before transferring it into Python; that's because I'm not familiar enough with the C++-Python interaction to know how to correctly wrap the functions that provide direct access to the underlying dict-like structure of a sparse vector.

Timings are similar, but now on a completely unloaded machine, so more indicative than before:

sage: sage: time C = CremonaModularSymbols(400001, sign=-1)
CPU times: user 25.4 s, sys: 95 ms, total: 25.5 s
Wall time: 25.5 s
sage: time T2a = C.hecke_matrix(2)
CPU times: user 13.7 s, sys: 4.08 s, total: 17.8 s
Wall time: 17.8 s
sage: time T2b = T2a.sage_matrix_over_ZZ(sparse=True)
CPU times: user 25 s, sys: 59 ms, total: 25.1 s
Wall time: 25.1 s
sage: time T2c = C.sparse_hecke_matrix(2)
CPU times: user 1min 2s, sys: 476 ms, total: 1min 3s
Wall time: 1min 3s
sage: time T2b == T2c
CPU times: user 9.48 s, sys: 633 ms, total: 10.1 s
Wall time: 10.1 s
True
sage: time T2c = C.sparse_hecke_matrix(2, base_ring=GF(2))
CPU times: user 17.4 s, sys: 328 ms, total: 17.7 s
Wall time: 17.7 s

comment:12 Changed 3 years ago by kedlaya

  • Authors set to Kiran Kedlaya

comment:13 Changed 3 years ago by kedlaya

Note to myself: check to see whether the resolution of #22164 has had any effect on the memory leak reported above.

comment:14 Changed 3 years ago by git

  • Commit changed from 6fba176c039f8f1f7d9ca8a011f5ede9eed017c0 to 05e52f4f98b6d186e8934ddcb9ac4e415a7b2aef

Branch pushed to git repo; I updated commit sha1. New commits:

0c0025aMerge branch 'u/kedlaya/make_hecke_operators_not_blow_up_the_memory' of git://trac.sagemath.org/sage into 21303
05e52f4Fix doctests for py3 print

comment:15 Changed 3 years ago by jdemeyer

  • Status changed from needs_review to needs_work

This syntax is deprecated and shouldn't be used anymore:

for j from 1 <= j <= n:

Cython will generate efficient code for range() if the variables are of some C type (which is the case here since you declared cdef long j).

comment:16 Changed 3 years ago by jdemeyer

You should not put Python code inside sig_on()/sig_off() blocks.

There are two (not mutually exclusive) cases:

(A) If there is any particular non-Python statement that takes a long time or might generate signals, put sig_on()/sig_off() around that particular statement.

(B) If you just want to interrupt the loop, put sig_check() inside that loop.

comment:17 Changed 3 years ago by jdemeyer

In the eclib.pxd file, you have a duplicate

ctypedef int scalar

It should be declared just once.

comment:18 Changed 3 years ago by jdemeyer

(NOTE: I just checked the Cython programming issues, I didn't actually try to understand the code; I assume that John can do that)

comment:19 follow-up: Changed 3 years ago by cremona

I looked at the code. Congratulations for being the first person in history other than me and Luis Figueiredo (a student of Taylor in the 1990s) to read my sparse vector / matrix code!

What you do is go through the rows, extract the i'th row as a sparse vector, convert it to a dense vector, and then go through its entries storing the nonzero ones in the new Sage sparse matrix. This is inefficient as you look at all m*n netries (if the matrix is mxn). It would be better to extract just the nonzero entries in the first place. I can try to write that, or help someone else, but before do I need to know if Cython will have access to the "protected" (in C++ terms) i.e. semi-private data components of the C++ sparse matrix. If so I can write the necessary lines.

Last edited 3 years ago by jdemeyer (previous) (diff)

comment:20 in reply to: ↑ 19 Changed 3 years ago by jdemeyer

Replying to cremona:

I need to know if Cython will have access to the "protected" (in C++ terms) i.e. semi-private data components of the C++ sparse matrix. If so I can write the necessary lines.

Cython essentially just generates C or C++ code. So it won't have any more access than user-written C++ code. So that means that it cannot access arbitrary protected members.

comment:21 Changed 3 years ago by cremona

That makes it harder. The thing is an array of "rows" (one for each actual row even if the row is zero). We know how many from M.nrows(). For the i'th row we store the number of nonzero entries, the list of columns they are in and the list of the entries. The only difficulty is in getting the 0/1 rebasing correct since C, like Python goes from 0 bu my user interface goes from 1...

I can see that to provide the interface needed will require adding to eclib. Sorry. Meanwhile it would still be simpler to extract just the nonzero entries from the sparse vector which we already extract for each row. But even that requires access to protected members of the svec class (which are essentially dicts containing (index, netry) pairs). So that leaves this:

        for i in range(n):
            for j in range(n):
                Mij = M.elem(i+1,j+1)
                if Mij:
                    d[(i, j)] = Mij

I don't expect that to be any better than the current version though since for each i,j the elem() method will do a linear search along the i'th row to see if the j'th entry is present.

Sorry not to have a low-level interface!

comment:22 Changed 3 years ago by cremona

A possible low-level interface could work like a Python iterator delivering triple (i,j,M[i,j]) with some signal at the end such as (0,0,0). Without doing a lot of work that could easily be provided as a single 3xN matrix where N is one more than the number of nonzero entries. Then the user can loop over that. But such a change to eclib is for the future, and should not delay this.

comment:23 Changed 3 years ago by git

  • Commit changed from 05e52f4f98b6d186e8934ddcb9ac4e415a7b2aef to 68ba41cd16b2e8f0becbdb86c905230750f0a896

Branch pushed to git repo; I updated commit sha1. New commits:

68ba41cCorrected some Cython issues

comment:24 follow-up: Changed 3 years ago by kedlaya

  • Report Upstream changed from Not yet reported upstream; Will do shortly. to None of the above - read trac for reasoning.
  • Status changed from needs_work to needs_review

I fixed the Cython issues (I think).

That said, svec.h does appear to include some access functions:

  // functions to enable iteration over the entries without direct
  // access to the entries map:
  map<int,scalar>::const_iterator begin() const {return entries.begin();}
  map<int,scalar>::const_iterator end() const {return entries.end();}
  map<int,scalar>::iterator begin() {return entries.begin();}
  map<int,scalar>::iterator end() {return entries.end();}
  void erase(int i);  // erases v[i]; error if not set
  int first_index() const {return entries.upper_bound(0)->first;}
  std::set<int> support() const;

that should avoid creating a dense vector; but I need to learn a bit more about Cython to figure out how to exploit those.

comment:25 in reply to: ↑ 24 Changed 3 years ago by jdemeyer

Replying to kedlaya:

I need to learn a bit more about Cython to figure out how to exploit those.

Cython does support that (I'm assuming that map means std::map):

from libcpp.map cimport map

cdef extern from "...":
    cdef cppclass ...:
        map[int, scalar].iterator begin()
        map[int, scalar].iterator end()

comment:26 Changed 3 years ago by git

  • Commit changed from 68ba41cd16b2e8f0becbdb86c905230750f0a896 to 981edb0be237a3a3cc90fa3384d3944ae4755d29

Branch pushed to git repo; I updated commit sha1. New commits:

981edb0Attempt to directly access svec

comment:27 Changed 3 years ago by kedlaya

  • Status changed from needs_review to needs_work

I tried to use the iterator following a model in the Cython documentation, but...:

[sagelib-8.1.beta6] ...
[sagelib-8.1.beta6]         cdef smat M = self.H.s_heckeop(p, dual, verbose)
[sagelib-8.1.beta6]         sig_off()
[sagelib-8.1.beta6]         for i in range(n):
[sagelib-8.1.beta6]             sv = M.row(i+1)
[sagelib-8.1.beta6]             iter = sv.begin()
[sagelib-8.1.beta6]             while iter != sv.end():
[sagelib-8.1.beta6]                       ^
[sagelib-8.1.beta6] ------------------------------------------------------------
[sagelib-8.1.beta6]
[sagelib-8.1.beta6] sage/libs/eclib/homspace.pyx:284:23: Invalid types for '!=' (<error>, iterator)

comment:28 Changed 3 years ago by jdemeyer

Next time, please quote the complete error message:

[sagelib-8.1.beta7] Error compiling Cython file:
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] ...
[sagelib-8.1.beta7]         cdef long n = self.dimension()
[sagelib-8.1.beta7]         cdef long i=0
[sagelib-8.1.beta7]         cdef long j=0
[sagelib-8.1.beta7]         cdef vec v
[sagelib-8.1.beta7]         cdef svec sv
[sagelib-8.1.beta7]         cdef map[int, scalar].iterator iter
[sagelib-8.1.beta7]             ^
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] sage/libs/eclib/homspace.pyx:276:13: 'map' is not a type identifier
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] Error compiling Cython file:
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] ...
[sagelib-8.1.beta7]         cdef smat M = self.H.s_heckeop(p, dual, verbose)
[sagelib-8.1.beta7]         sig_off()
[sagelib-8.1.beta7]         for i in range(n):
[sagelib-8.1.beta7]             sv = M.row(i+1)
[sagelib-8.1.beta7]             iter = sv.begin()
[sagelib-8.1.beta7]             while iter != sv.end():
[sagelib-8.1.beta7]                       ^
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] sage/libs/eclib/homspace.pyx:284:23: Invalid types for '!=' (<error>, iterator)

You forgot from libcpp.map cimport map

comment:29 Changed 3 years ago by jdemeyer

In fact, Cython does type inference, so you could also just remove the line

cdef map[int, scalar].iterator iter

comment:30 Changed 3 years ago by git

  • Commit changed from 981edb0be237a3a3cc90fa3384d3944ae4755d29 to d257eabe478f2a4ff92a436afe24f5122e8011dd

Branch pushed to git repo; I updated commit sha1. New commits:

35c2ee5Merge branch 'develop' of git://trac.sagemath.org/sage into t/21303/make_hecke_operators_not_blow_up_the_memory
d257eabImplement map iteration, correctly this time

comment:31 Changed 3 years ago by kedlaya

  • Status changed from needs_work to needs_review

OK, I think I now have a working interface that avoids instantiating any dense vectors.

comment:32 Changed 3 years ago by roed

  • Branch changed from u/kedlaya/make_hecke_operators_not_blow_up_the_memory to u/roed/make_hecke_operators_not_blow_up_the_memory

comment:33 Changed 3 years ago by roed

  • Commit changed from d257eabe478f2a4ff92a436afe24f5122e8011dd to 6c3bac591bde14d3bd23bb4f983af24a7b8077b6
  • Reviewers set to David Roe
  • Status changed from needs_review to positive_review

I made some trivial fixes. Looks good to me.


New commits:

6c3bac5Trivial fixes

comment:34 Changed 3 years ago by vbraun

  • Branch changed from u/roed/make_hecke_operators_not_blow_up_the_memory to 6c3bac591bde14d3bd23bb4f983af24a7b8077b6
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.