Opened 10 years ago

Closed 7 years ago

#4260 closed enhancement (fixed)

use LinBox as native matrix representation for dense matrices over GF(p)

Reported by: malb
Owned by: cpernet
Priority: major
Milestone: sage-4.8
Component: linear algebra
Keywords: linbox, linear algebra, sd32, sd34
Cc: SimonKing, rbeezer, drkirkby
Merged in: sage-4.8.alpha3
Authors: Burcin Erocal, Martin Albrecht, Rob Beezer
Reviewers: Burcin Erocal, Simon King, Martin Albrecht, Jeroen Demeyer
Report Upstream: N/A
Work issues:
Branch:
Commit:
Dependencies:
Stopgaps:

Description (last modified by malb)

Copying to and from LinBox uses up precious RAM, and the point of fast linear algebra is to deal with large matrices. We should consider switching to LinBox as the native representation of matrices over GF(p).

Without Patch

sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 9.66 s, sys: 0.12 s, total: 9.77 s
Wall time: 9.82 s

With Patch

sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 1.32 s, sys: 0.00 s, total: 1.32 s
Wall time: 1.35 s

Magma

> A:=RandomMatrix(GF(97),2000,2000);
> time C:=A*A;                      
Time: 1.560

Attachments (7)

trac_4260-linbox_default.patch (1.9 KB) - added by malb 8 years ago.
make matrix space constructor use the new classes
trac_4260-dense_ctypes_template.patch (122.7 KB) - added by malb 8 years ago.
add templated classes for float and double representation
trac_4260-matrix-modn-docs.patch (9.4 KB) - added by malb 8 years ago.
trac_4260_more_doctests.patch (90.4 KB) - added by malb 8 years ago.
trac_4260_echelonformdomain.patch (7.1 KB) - added by malb 7 years ago.
trac_4260-minor_fixes.patch (4.6 KB) - added by burcin 7 years ago.
minor fixes
trac_4260_bugfix.patch (1.3 KB) - added by malb 7 years ago.

Change History (67)

comment:1 Changed 10 years ago by cpernet

  • Owner changed from was to cpernet
  • Status changed from new to assigned

I will work on it as a coding sprint at SD10.

comment:2 Changed 8 years ago by SimonKing

  • Cc SimonKing added
  • Report Upstream set to N/A

comment:3 Changed 8 years ago by burcin

  • Authors set to Burcin Erocal

I finally rebased the patch from SD16. The template class in the patch contains the updates made to the modn_dense class since then, like changes to the sig_* functions. Apparently the modn_dense class representation now allows permuting the rows by changing pointers in the _matrix array. We can't allow that if we want to pass the _entries to linbox, so I skipped those changes.
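
The conflict between pointer-based row permutation and handing a flat buffer to an external library can be illustrated with a toy model. This is only a sketch in plain Python: the names `entries`, `rows`, and `get` are hypothetical stand-ins, not the actual fields or methods of the modn_dense class.

```python
# Toy model (not Sage's actual data structures): a dense matrix stored as one
# flat, row-major buffer plus a table of row start offsets.
ncols = 3
entries = [1, 2, 3,
           4, 5, 6]          # flat buffer, as an external library would see it
rows = [0, 3]                # row i starts at entries[rows[i]]

def get(i, j):
    """Read entry (i, j) through the row table."""
    return entries[rows[i] + j]

# "Permute" the rows cheaply by swapping the offsets, not the data.
rows[0], rows[1] = rows[1], rows[0]

logical = [[get(i, j) for j in range(ncols)] for i in range(2)]
print(logical)   # the matrix as seen through the row table
print(entries)   # the flat buffer that would be passed to the library
```

After the swap, the view through the row table and the flat buffer disagree, so code that receives only the flat buffer would work on the unpermuted matrix.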

Sage builds with the attached patches, and you can construct matrices. However, there are lots of bugs, some linbox wrappers are still stubs, etc. Expect crashes and wrong results.

With the patch applied, I get a crash with the following:

sage: a = matrix(GF(97),3,4,range(12))
sage: a.echelonize()
*** glibc detected *** python: free(): invalid next size (fast): 0x000000000270b370 ***
======= Backtrace: =========
<snip>

AFAICT, the new cython code is an exact copy of the wrapper function in linbox-sage.C. Here is what valgrind says:

==3026== Invalid write of size 8
==3026==    at 0x39E49EF1: T.4552 (ffpack_ludivine.inl:420)
==3026==    by 0x39E49AA0: T.4552 (ffpack_ludivine.inl:486)
==3026==    by 0x39E4ABBF: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (ffpack.h:1132)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
<snip lots more Py* lines>
==3026==  Address 0x6ca16e8 is 0 bytes after a block of size 24 alloc'd
==3026==    at 0x4C267CE: malloc (vg_replace_malloc.c:236)
==3026==    by 0x39E4AA5A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (memory.h:32)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F18074: PyEval_EvalFrameEx (ceval.c:3802)
<snip lots of Py* lines>

I'd appreciate any pointers about the problem above, though I don't know if I'll have the time to come back to this before the bug days in August (when I presume Martin will take over?).

comment:4 Changed 8 years ago by rbeezer

These are the files in sage/matrix with failures:

        sage -t  devel/sage-main/sage/matrix/matrix_cyclo_dense.pyx # 22 doctests failed
        sage -t  devel/sage-main/sage/matrix/strassen.pyx # 2 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix0.pyx # 2 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_integer_dense.pyx # 5 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_window_modn_dense.pyx # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_modn_sparse.pyx # 1 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_integer_dense_saturation.py # 0 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix_rational_dense.pyx # 44 doctests failed
        sage -t  devel/sage-main/sage/matrix/matrix2.pyx # Time out
        sage -t  devel/sage-main/sage/matrix/matrix_modn_dense.pyx # Time out
        sage -t  devel/sage-main/sage/matrix/matrix_modn_dense_template.pxi # Time out

Changed 8 years ago by malb

make matrix space constructor use the new classes

comment:5 Changed 8 years ago by malb

  • Cc rbeezer added
  • Description modified (diff)

I fixed a few issues and segfaults but the thing is far from done. However, one can probably do higher level stuff now, i.e. it shouldn't crash that much any more.

We need a new LinBox SPKG because Modular<float> didn't have a NonZeroRandIter which is needed by the charpoly code. LinBox 1.1.7 fixes this issue but I tried unsuccessfully to upgrade to 1.1.7 for like 10 hours (cf. #11718).

comment:6 Changed 8 years ago by malb

  • Description modified (diff)

comment:7 Changed 8 years ago by malb

Doctest failures with most recent patch on sage.math:

        sage -t  -long -force_lib devel/sage/doc/de/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/bordeaux_2008/modular_forms_and_hecke_operators.rst # 1 doctests failed
        sage -t  -long -force_lib devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
        sage -t  -long -force_lib devel/sage/doc/fr/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/doc/ru/tutorial/tour_advanced.rst # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/tests.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/space.py # 18 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/tests.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/space.py # 8 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/cuspidal_submodule.py # 6 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modsym/ambient.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/modform/element.py # 11 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/element.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/module.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
        sage -t  -long -force_lib devel/sage/sage/matrix/matrix_cyclo_dense.pyx # 8 doctests failed
        sage -t  -long -force_lib devel/sage/sage/matrix/matrix2.pyx # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/tests/cmdline.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
        sage -t  -long -force_lib devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed

comment:8 Changed 8 years ago by malb

  • Authors changed from Burcin Erocal to Burcin Erocal, Martin Albrecht
  • Description modified (diff)

I've fixed all the easy stuff which brings the doctest failures down to:

sage -t  -long devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t  -long devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t  -long devel/sage/sage/modular/modsym/space.py # 18 doctests failed
sage -t  -long devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/modform/space.py # 8 doctests failed
sage -t  -long devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t  -long devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t  -long devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
sage -t  -long devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed

many of which seem to be caused by a small set of bugs.

comment:9 Changed 8 years ago by malb

Here's where we are at on sage.math:

sage -t  devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t  devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t  devel/sage/sage/modular/modsym/space.py # 12 doctests failed
sage -t  devel/sage/sage/modular/modform/eisenstein_submodule.py # 1 doctests failed
sage -t  devel/sage/sage/modular/modform/space.py # 7 doctests failed
sage -t  devel/sage/sage/modular/modform/constructor.py # 1 doctests failed
sage -t  devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t  devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t  devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t  devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t  devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t  devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t  devel/sage/sage/structure/sage_object.pyx # 1 doctests failed
sage -t  devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t  devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed
sage -t  devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed

comment:10 Changed 8 years ago by was

  • Work issues set to sd32

comment:11 Changed 8 years ago by malb

With the updated patch we are down to:

sage -t  -long devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
sage -t  -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed

However, there also seems to be a doctest failure in matrix2.pyx which is not that easily reproduced.

comment:12 Changed 8 years ago by malb

  • Status changed from new to needs_review

Now all doctests should pass!

comment:13 Changed 8 years ago by was

  • Keywords sd32 added
  • Work issues sd32 deleted

comment:14 Changed 8 years ago by malb

  • Authors changed from Burcin Erocal, Martin Albrecht to Burcin Erocal, Martin Albrecht, Rob Beezer
  • Description modified (diff)
  • Work issues set to extend documentation

Changed 8 years ago by malb

add templated classes for float and double representation

comment:15 Changed 8 years ago by malb

I adapted the crossover from float to double. Around 2^11, Modular<float> is really slow because there are not enough bits left to let ATLAS do its magic, i.e., there are too many modular reductions. On my computer, using Modular<float> up to 2^8 seems like a good choice. On sage.math this choice isn't too bad (but not optimal). The table below gives the time it takes (3rd column) to multiply two 1,000 x 1,000 matrices over GF(p), where p (2nd column) is smaller than 2^i (i in the 1st column):

 2          3 0.22000
 3          7 0.24000
 4         13 0.24000
 5         31 0.25000
 6         61 0.26000
 7        127 0.26000
 8        251 0.62000
 9        509 0.38000 <=== using Modular<double> now
10       1021 0.38000
11       2039 0.39000
12       4093 0.39000
13       8191 0.40000
14      16381 0.41000
15      32749 0.41000
16      65521 0.42000
17     131071 0.43000
18     262139 0.43000
19     524287 0.44000
20    1048573 0.44000
21    2097143 0.45000
22    4194301 0.66000
23    8388593 1.91000 <=== Generic matrices
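
One way to see why the float representation runs out of room is to count how many products can be accumulated exactly before a modular reduction becomes necessary. The sketch below assumes the standard IEEE 754 significand widths (24 bits for float, 53 for double) and entries reduced to the range 0..p-1. The helper `max_exact_terms` is hypothetical and is not how LinBox chooses its thresholds.

```python
def max_exact_terms(p, mantissa_bits):
    """Number of products (p-1)*(p-1) that can be summed while the running
    total stays below 2**mantissa_bits, i.e. remains exactly representable."""
    return (2**mantissa_bits - 1) // ((p - 1) ** 2)

# Compare float (24 bits) and double (53 bits) for a few primes from the table.
for p in (127, 251, 2039):
    print(p, max_exact_terms(p, 24), max_exact_terms(p, 53))
```

Under these assumptions, a dot product of length 1,000 already needs intermediate reductions for p = 251 with floats, while doubles leave ample room for all primes in the table that use Modular<double>.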

comment:16 Changed 8 years ago by SimonKing

I found that the time for computing echelon form became worse:

sage-4.6.2

sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.17 s, sys: 0.03 s, total: 0.20 s
Wall time: 0.20 s
sage: B = MS.random_element()
sage: %time C = A*B
CPU times: user 8.35 s, sys: 0.07 s, total: 8.42 s
Wall time: 8.45 s
sage: %time A.echelonize()
CPU times: user 1.22 s, sys: 0.06 s, total: 1.28 s
Wall time: 1.38 s

sage-4.7.2.alpha2 with the patches and spkg from here:

sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.19 s, sys: 0.03 s, total: 0.22 s
Wall time: 0.22 s
sage: B = MS.random_element()
sage:  %time C = A*B
CPU times: user 1.16 s, sys: 0.02 s, total: 1.17 s
Wall time: 1.22 s
sage: %time A.echelonize()
CPU times: user 1.87 s, sys: 0.00 s, total: 1.87 s
Wall time: 1.92 s

Changed 8 years ago by malb

Changed 8 years ago by malb

comment:17 follow-up: Changed 8 years ago by malb

  • Description modified (diff)
  • renamed Rob's patch to fix ticket number
  • (hopefully) added doctests for every single function
  • Simon, can you try again after setting MAX_MODULUS in sage.matrix.matrix_modn_dense_float to 2^6? This forces the use of doubles for GF(101) which might be more efficient. Also, how fast is A.echelonize('gauss') for you on that benchmark?

comment:18 Changed 8 years ago by malb

  • Description modified (diff)

comment:19 in reply to: ↑ 17 Changed 8 years ago by SimonKing

Replying to malb:

  • Simon, can you try again after setting MAX_MODULUS in sage.matrix.matrix_modn_dense_float to 2^6? This forces the use of doubles for GF(101) which might be more efficient.

It isn't:

sage: sage.matrix.matrix_modn_dense_float.MAX_MODULUS = 2^6
sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.21 s, sys: 0.01 s, total: 0.22 s
Wall time: 0.22 s
sage: B = MS.random_element()
sage: %time C = A*B
CPU times: user 1.88 s, sys: 0.04 s, total: 1.92 s
Wall time: 1.93 s
sage: %time A.echelonize()
CPU times: user 2.65 s, sys: 0.00 s, total: 2.65 s
Wall time: 2.69 s
sage: type(A)
<type 'sage.matrix.matrix_modn_dense_double.Matrix_modn_dense_double'>

Also, how fast is A.echelonize('gauss') for you on that benchmark?

You mean "how slow", I suppose:

sage: A = MS.random_element()
sage: %time A.echelonize('gauss')
CPU times: user 41.53 s, sys: 0.10 s, total: 41.63 s
Wall time: 41.75 s

comment:20 Changed 8 years ago by malb

  • Work issues changed from extend documentation to improve echelonize

Okay, so both the old code and this patch call LinBox, but with the patch it's slower (I can reproduce your timing difference). Hence, we'll have to check what LinBox in the old version ends up doing vs. what it is doing now.

comment:21 Changed 8 years ago by cpernet

A word about the regression (I'm copying my reply to malb on linbox-devel)

The new code (that I wrote)

size_t r = FFPACK::ReducedRowEchelonForm(F, nrows, ncols, matrix, ncols, P,Q);

calls the actual RowEchelon elimination in FFPACK, which transforms A into its reduced row echelon form E and the transformation matrix U (both matrices being magically stored in place in A).

It is slower than the older code sage-4.6 using linbox-1.1.6:

int rank = EF.rowReducedEchelon(E, A);

The latter computes the reduced row echelon form (actually the transpose of the reduced column echelon form), but no transformation matrix. This saves roughly 50% of the total number of arithmetic operations (1n^3 rather than 2n^3), and explains the regression.
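
As a rough sanity check, the leading-term operation counts quoted above can be compared with the echelonize timings reported earlier in this ticket (1.22 s before the patch, 1.87 s with it, user CPU time). This is only a sketch: `echelon_ops` is a hypothetical helper that models nothing beyond the leading term.

```python
def echelon_ops(n, with_transform):
    """Leading-term operation count as quoted in the comment:
    about n^3 without the transformation matrix, 2*n^3 with it."""
    factor = 2 if with_transform else 1
    return factor * n**3

n = 2000
predicted_ratio = echelon_ops(n, True) / echelon_ops(n, False)
measured_ratio = 1.87 / 1.22   # user CPU times for A.echelonize() from comment 16
print(predicted_ratio, round(measured_ratio, 2))
```

The measured slowdown is below the factor of 2 predicted by the leading term alone, which is plausible for a model that ignores lower-order terms and fixed overhead.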

Switching back to the old way should fix the regression (for a quick fix). And I still need to add the feature of not computing the transform at the level of FFPACK, since I expect some timing improvements over the old version in linbox 1.1.6.

comment:22 Changed 8 years ago by malb

Hi Clement,

thanks for explaining. I always forget about the transformation matrix (e.g., that M4RI is in fact even faster relative to Magma than previously thought, because we always compute the transformation matrix and yet we are still faster :)). I'll try to "switch back". Btw. is there a way to construct the right matrices without copying?

PS: I didn't get your reply on [linbox-devel] btw.

comment:23 Changed 7 years ago by malb

  • Description modified (diff)

The additional patch makes us use the old EchelonFormDomain interface which is twice as fast (as Clément explained). Simon, can you review this ticket?

Changed 7 years ago by malb

Changed 7 years ago by burcin

minor fixes

comment:24 Changed 7 years ago by burcin

  • Keywords sd34 added
  • Reviewers set to Burcin Erocal

attachment:trac_4260-minor_fixes.patch

  • makes some cosmetic changes and
  • fixes a possible memory leak if allocation of these matrices fails.

I read through the patches and the resulting code. All looks good to me. Please switch this to positive review if my patch is ok.

Thanks everyone for finally finishing this off.

comment:25 Changed 7 years ago by malb

  • Reviewers changed from Burcin Erocal to Burcin Erocal, Simon King, Martin Albrecht
  • Status changed from needs_review to positive_review

Burcin's patch looks good. Thus, giving it a positive review. I'm also running doctests again against 4.7.2.alpha3.

comment:26 Changed 7 years ago by malb

  • Description modified (diff)

comment:27 Changed 7 years ago by malb

Doctests indeed pass on sage.math.

comment:28 Changed 7 years ago by jdemeyer

  • Milestone changed from sage-4.7.2 to sage-4.7.3
  • Work issues improve echelonize deleted

comment:29 Changed 7 years ago by jdemeyer

  • Status changed from positive_review to needs_work
  • Work issues set to cleanup spkg

The spkg could do with some cleanup:

  1. What is the purpose of spkg-rebuild? If it is not used, remove it. If it is used, document it.
  2. spkg-debian and the dist directory should be removed. They are leftovers for Debian, but these are now being removed from every spkg.
  3. Why is "linbox" in .hgignore?
  4. The file patches/commentator.C lacks a corresponding .patch file.

Optional:

  1. Make spkg-install use patch for patching.

comment:30 Changed 7 years ago by malb

  • Description modified (diff)
  • Status changed from needs_work to needs_review

What is the purpose of spkg-rebuild? If it is not used, remove it. If it is used, document it.

removed

spkg-debian and the dist directory should be removed. They are leftovers for Debian, but these are now being removed from every spkg.

removed

Why is "linbox" in .hgignore?

removed

The file patches/commentator.C lacks a corresponding .patch file.

added

Make spkg-install use patch for patching.

left for another time

comment:31 Changed 7 years ago by jdemeyer

  • Merged in set to sage-4.7.3.alpha0
  • Resolution set to fixed
  • Status changed from needs_review to closed
  • Work issues cleanup spkg deleted

comment:32 Changed 7 years ago by jdemeyer

  • Merged in sage-4.7.3.alpha0 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

Unfortunately, there are failures on OS X 10.4 PPC, all in the file sage/matrix/matrix_modn_dense_double.pyx:

sage -t  -long -force_lib devel/sage/sage/matrix/matrix_modn_dense_double.pyx
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 71:
    sage: A[0,0] = 220082r; A
Expected:
    [ 220082 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
Got:
    [      0 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 76:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 78:
    sage: ~a
Exception raised:
    Traceback (most recent call last):
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_2[5]>", line 1, in <module>
        ~a###line 78:
    sage: ~a
      File "integer_mod.pyx", line 3240, in sage.rings.finite_rings.integer_mod.IntegerMod_int64.__invert__ (sage/rings/finite_rings/integer_mod.c:25534)
      File "integer_mod.pyx", line 3371, in sage.rings.finite_rings.integer_mod.mod_inverse_int64 (sage/rings/finite_rings/integer_mod.c:26331)
    ZeroDivisionError: Inverse does not exist.
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 86:
    sage: A[0,0] = 220082r; A
Expected:
    [ 220082 1237101 2033003 3788106]
    [4649912 1157595 4928315 4382585]
    [4252686  978867 2601478 1759921]
    [1303120 1860486 3405811 2203284]
Got:
    [      0 1237101 2033003 3788106]
    [4649912 1157595 4928315 4382585]
    [4252686  978867 2601478 1759921]
    [1303120 1860486 3405811 2203284]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 91:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 93:
    sage: a*a
Expected:
    4777936
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 112:
    sage: A[0,0] = K(220082); A
Expected:
    [ 220082 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
Got:
    [      0 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 117:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 119:
    sage: ~a
Exception raised:
    Traceback (most recent call last):
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_3[6]>", line 1, in <module>
        ~a###line 119:
    sage: ~a
      File "integer_mod.pyx", line 3240, in sage.rings.finite_rings.integer_mod.IntegerMod_int64.__invert__ (sage/rings/finite_rings/integer_mod.c:25534)
      File "integer_mod.pyx", line 3371, in sage.rings.finite_rings.integer_mod.mod_inverse_int64 (sage/rings/finite_rings/integer_mod.c:26331)
    ZeroDivisionError: Inverse does not exist.
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 129:
    sage: a = A[0,0]; a
Expected:
    220081
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 131:
    sage: a*a
Expected:
    4337773
Got:
    0
**********************************************************************

comment:33 Changed 7 years ago by jdemeyer

  • Status changed from new to needs_review

comment:34 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to needs_work

comment:35 follow-ups: Changed 7 years ago by malb

If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?

comment:36 in reply to: ↑ 35 Changed 7 years ago by jdemeyer

Replying to malb:

If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?

The machine that I'm using for this testing is not mine, but I can ask the owner. I guess it will be okay for him to make an account for you. He is on holidays now (I think up to sunday), so it will take a few days.

comment:37 Changed 7 years ago by malb

Thanks.

comment:38 Changed 7 years ago by jdemeyer

  • Milestone sage-4.7.3 deleted

comment:39 in reply to: ↑ 35 Changed 7 years ago by jdemeyer

  • Milestone set to sage-4.8

Replying to malb:

If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?

Yes, see private email.

Changed 7 years ago by malb

comment:40 Changed 7 years ago by malb

  • Description modified (diff)
  • Status changed from needs_work to needs_review

The attached patch fixes the issue on the machine in question. We handled 32-bit systems when getting elements but forgot to do so when setting them.

comment:41 Changed 7 years ago by jdemeyer

Why do you write

ceil(sqrt(2^31-1)) < 2^23

It is a true statement, but where does the "23" come from? You could write

ceil(sqrt(2^31-1)) = 46341

comment:42 Changed 7 years ago by malb

2^23 is the maximum modulus of LinBox's double based matrix representation and writing 2^23 is easier to read than 46341.
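
The two figures in this exchange can be checked with integer arithmetic only. A minimal sketch, assuming Python 3.8 or later for `math.isqrt`:

```python
import math

n = 2**31 - 1
# n is not a perfect square, so ceil(sqrt(n)) equals floor(sqrt(n)) + 1.
bound = math.isqrt(n) + 1
print(bound)             # ceil(sqrt(2^31 - 1))
print(bound < 2**23)     # whether the bound lies below 2^23
```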

comment:43 Changed 7 years ago by jdemeyer

  • Merged in set to sage-4.8.alpha2
  • Resolution set to fixed
  • Reviewers changed from Burcin Erocal, Simon King, Martin Albrecht to Burcin Erocal, Simon King, Martin Albrecht, Jeroen Demeyer
  • Status changed from needs_review to closed

Works on OS X 10.4 PPC, so positive review.

comment:44 Changed 7 years ago by jdemeyer

This crashes all over the place on OpenSolaris 06.2009-32 (hawk). For example:

sage -t -long  -force_lib devel/sage/sage/rings/qqbar.py
**********************************************************************
File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage-main/sage/rings/qqbar.py", line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
Exception raised:
    Traceback (most recent call last):
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[74]>", line 1, in <module>
        r.imag().minpoly() # this takes a long time (143s on my laptop)###line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 2873, in minpoly
        self._minimal_polynomial = self._descr.minpoly()
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 5406, in minpoly
        self._minpoly = self._value.minpoly()
      File "number_field_element.pyx", line 3495, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.minpoly (sage/rings/number_field/number_field_element.cpp:21939)
      File "number_field_element.pyx", line 3462, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.charpoly (sage/rings/number_field/number_field_element.cpp:21816)
      File "matrix_rational_dense.pyx", line 936, in sage.matrix.matrix_rational_dense.Matrix_rational_dense.charpoly (sage/matrix/matrix_rational_dense.c:10895)
      File "matrix_integer_dense.pyx", line 1017, in sage.matrix.matrix_integer_dense.Matrix_integer_dense.charpoly (sage/matrix/matrix_integer_dense.c:10961)
      File "matrix_integer_dense.pyx", line 1074, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._charpoly_linbox (sage/matrix/matrix_integer_dense.c:11601)
      File "matrix_integer_dense.pyx", line 1096, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._poly_linbox (sage/matrix/matrix_integer_dense.c:11869)
    RuntimeError: Segmentation fault
**********************************************************************

There are many more like this.

comment:45 follow-up: Changed 7 years ago by malb

Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?

comment:46 Changed 7 years ago by jdemeyer

I still want to investigate some more, for example I have not checked that it is really this ticket which causes the problems (but you do see "linbox" appearing in the backtrace).

Strangely, even building the documentation crashes:

sphinx-build -b html -d /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/doctrees/en/reference   -A hide_pdf_links=1 /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/en/reference /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
Running Sphinx v1.1.2
loading pickled environment... not yet created
building [html]: targets for 935 source files that are out of date
updating environment: 935 added, 0 changed, 0 removed
reading sources... [  0%] algebras
reading sources... [  0%] arithgroup
reading sources... [  0%] calculus
reading sources... [  0%] categories
reading sources... [  0%] cmd
reading sources... [  0%] coding
reading sources... [  0%] coercion
reading sources... [  0%] combinat/algebra
reading sources... [  0%] combinat/crystals
[...]
writing additional files... genindex py-modindex search
copying images... [ 16%] sage/graphs/../../media/heawood-graph-latex.png
copying images... [ 33%] sage/homology/../../media/homology/simplices.png
copying images... [ 50%] sage/homology/../../media/homology/torus.png
copying images... [ 66%] sage/homology/../../media/homology/klein.png
copying images... [ 83%] sage/homology/../../media/homology/rp2.png
copying images... [100%] sage/homology/../../media/homology/torus_labelled.png

copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

------------------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
Build finished.  The built documents can be found in /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference

comment:47 Changed 7 years ago by jdemeyer

  • Merged in sage-4.8.alpha2 deleted
  • Resolution fixed deleted
  • Status changed from closed to new

comment:48 in reply to: ↑ 45 Changed 7 years ago by jdemeyer

  • Cc drkirkby added

Replying to malb:

Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?

Hawk is a machine from David Kirkby, so you should ask him.

comment:49 Changed 7 years ago by malb

Okay, I managed to build 4.8-alpha2 + this ticket on hawk. Just starting and stopping Sage gives:

#0  0xfec9c7fb in _free_unlocked () from /lib/libc.so.1
#1  0xfec9c7af in free () from /lib/libc.so.1
#2  0xfdb81d01 in operator delete (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_op.cc:44
#3  0xfdb81d5d in operator delete[] (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_opv.cc:32
#4  0xfdb72543 in ~ios_base (this=0xf9a3e704) at ../../../../gcc-4.5.0/libstdc++-v3/src/ios.cc:93
#5  0xf9892891 in __static_initialization_and_destruction_0 (__initialize_p=<value optimized out>)
    at /usr/local/gcc-4.5.0/lib/gcc/i386-pc-solaris2.10/4.5.0/../../../../include/c++/4.5.0/bits/basic_ios.h:272
#6  0xf988e3b0 in __do_global_dtors_aux () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#7  0xf99f7835 in _fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#8  0xfefd15fe in call_fini () from /usr/lib/ld.so.1
#9  0xfefd17b3 in atexit_fini () from /usr/lib/ld.so.1
#10 0xfec8370c in _exithandle () from /lib/libc.so.1
#11 0xfec73f52 in exit () from /lib/libc.so.1
#12 0xfeef3232 in Py_Exit (sts=0) at Python/pythonrun.c:1716
#13 0xfeef3357 in handle_system_exit () at Python/pythonrun.c:1116
#14 0x00000000 in ?? ()

So it tries to clean up LinBox at the end and that's when things go wrong:

_fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0

any ideas about why?

comment:50 follow-up: Changed 7 years ago by malb

Weird, I rebuilt everything from scratch using these environment variables

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
MAKE=make -j4

and now

All tests passed!
Total time for all tests: 1742.8 seconds

i.e., the segfault is gone. How does the buildbot build Sage?

comment:51 in reply to: ↑ 50 Changed 7 years ago by jdemeyer

Replying to malb:

i.e., the segfault is gone. How does the buildbot build Sage?

EDITOR=emacs
HISTCONTROL=ignoreboth
HISTSIZE=2000
HOME=/export/home/buildbot
IGNOREEOF=100
LANG=C
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
LESS=iMqR
LESSHISTFILE=-
LOGNAME=buildbot
MAIL=/var/mail/buildbot
MAKE=make -j12
MAKEOPTS=-j12
PAGER=/usr/bin/less
PATH=/export/home/buildbot/bin:/export/home/buildbot/local/hawk/bin:/usr/local/bins-for-sage:/usr/local/gcc-4.5.0/bin:/usr/local/bin:/usr/local/texlive/2010/bin/i386-solaris/:/usr/bin:/usr/sbin
PWD=/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2
SAGE_ATLAS_LIB=/ATLAS32
SAGE_FORTRAN=/usr/local/gcc-4.5.0/bin/gfortran
SAGE_FORTRAN_LIB=/usr/local/gcc-4.5.0/lib/libgfortran.so
SAGE_PARALLEL_SPKG_BUILD=yes
SAGE_PORT=true
SHELL=/bin/bash
SHLVL=1
SSH_CLIENT=128.208.160.197 44994 22
SSH_CONNECTION=128.208.160.197 44994 192.168.1.191 22
SSH_TTY=/dev/pts/2
TERM=screen
TZ=Europe/London
USER=buildbot
VIRTUAL_ENV=/export/home/buildbot/local/hawk
VIRTUAL_ENV_DISABLE_PROMPT=yes
VISUAL=emacs

comment:52 follow-up: Changed 7 years ago by malb

Perhaps, it's a GCC 4.5.0 issue?

comment:53 in reply to: ↑ 52 Changed 7 years ago by jdemeyer

Replying to malb:

Perhaps, it's a GCC 4.5.0 issue?

Could very well be.

What does your gcc --version say? (the gcc you used to compile Linbox successfully)

comment:54 Changed 7 years ago by malb

$ gcc --version
gcc (GCC) 4.4.3 20100112 (prerelease)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

comment:55 follow-up: Changed 7 years ago by malb

I can confirm that this bug is at least triggered by GCC 4.5.

Here's the relevant bits of the env that I used to build Sage + this ticket just now:

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/usr/local/gcc-4.5.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8

and this one crashes with a SIGSEGV, whereas the env posted above in comment:50 doesn't.
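
To make the difference between the two builds explicit, the two environments can be compared entry by entry. This is only an illustrative sketch: the helper names `parse_env` and `path_entries` are made up for this example, and the values are copied from comment:50 (passing) and from the environment just above (crashing).

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict; values may themselves contain '='."""
    env = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def path_entries(env, key):
    """Split a colon-separated variable into entries, ignoring trailing slashes."""
    return [p.rstrip("/") for p in env.get(key, "").split(":") if p]


passing = parse_env("""
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
MAKE=make -j4
""")

crashing = parse_env("""
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/usr/local/gcc-4.5.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8
""")

# Collect the directories that appear only in the crashing environment.
extra = {}
for key in ("PATH", "LD_LIBRARY_PATH"):
    known = set(path_entries(passing, key))
    extra[key] = [p for p in path_entries(crashing, key) if p not in known]
    print(key, "only in crashing env:", extra[key])
```

The GCC 4.5.0 directories show up only in the crashing environment, which is consistent with the compiler being the trigger; the differing `-j` level and `/usr/sbin` are also present, so this comparison alone does not prove the cause.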

I am not sure what to do about this? Ask Dave to install a newer GCC to test whether it fails with it as well?

comment:56 in reply to: ↑ 55 Changed 7 years ago by jdemeyer

Replying to malb:

I am not sure what to do about this? Ask Dave to install a newer GCC to test whether it fails with it as well?

That's not a bad suggestion, asking to install gcc 4.5.3 for example (the latest in the 4.5 series)

comment:57 Changed 7 years ago by malb

I conclude it's a compiler bug: I just built with:

SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.6.0/lib:/usr/local/gcc-4.6.0/lib/amd64
PATH=/usr/local/gcc-4.6.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8

and

All tests passed!
Total time for all tests: 1786.1 seconds

I suggest avoiding GCC 4.5.0 (at least on OpenSolaris) and changing the status of this ticket back to positive review.
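
One way to act on that suggestion is a small guard that inspects the compiler banner before building. The sketch below is hypothetical and not part of Sage: `parse_gcc_version`, `is_known_bad` and `KNOWN_BAD` are names invented here, and the only version this ticket shows to be problematic (on this OpenSolaris machine) is 4.5.0.

```python
import re

# Versions observed to trigger the crash in this ticket; other 4.5.x releases were not tested.
KNOWN_BAD = {(4, 5, 0)}


def parse_gcc_version(banner):
    """Return (major, minor, patch) from the first line of `gcc --version` output."""
    first_line = banner.splitlines()[0]
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError("no version number found in: %r" % first_line)
    return tuple(int(part) for part in match.groups())


def is_known_bad(banner):
    """True if the banner reports a compiler version listed in KNOWN_BAD."""
    return parse_gcc_version(banner) in KNOWN_BAD


# Banner taken from comment:54 (the build that passed all tests).
banner = (
    "gcc (GCC) 4.4.3 20100112 (prerelease)\n"
    "Copyright (C) 2010 Free Software Foundation, Inc."
)
print(parse_gcc_version(banner), is_known_bad(banner))
```

In practice the banner would come from running `gcc --version`; it is passed in as a string here so the check can be exercised without a compiler installed.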

comment:58 Changed 7 years ago by jdemeyer

  • Status changed from new to needs_review

comment:59 Changed 7 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:60 Changed 7 years ago by jdemeyer

  • Merged in set to sage-4.8.alpha3
  • Resolution set to fixed
  • Status changed from positive_review to closed

Testing again on hawk...
