Opened 13 years ago
Closed 10 years ago
#4260 closed enhancement (fixed)
use LinBox as native matrix representation for dense matrices over GF(p)
Reported by: | malb | Owned by: | cpernet |
---|---|---|---|
Priority: | major | Milestone: | sage-4.8 |
Component: | linear algebra | Keywords: | linbox, linear algebra, sd32, sd34 |
Cc: | SimonKing, rbeezer, drkirkby | Merged in: | sage-4.8.alpha3 |
Authors: | Burcin Erocal, Martin Albrecht, Rob Beezer | Reviewers: | Burcin Erocal, Simon King, Martin Albrecht, Jeroen Demeyer |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Copying to and from LinBox uses up precious RAM, and the point of fast linear algebra is to deal with large matrices. We should consider switching to LinBox as the native representation of matrices over GF(p).
Without Patch
sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 9.66 s, sys: 0.12 s, total: 9.77 s
Wall time: 9.82 s
With Patch
sage: A = random_matrix(GF(97),2000,2000)
sage: %time A*A
CPU times: user 1.32 s, sys: 0.00 s, total: 1.32 s
Wall time: 1.35 s
Magma
> A:=RandomMatrix(GF(97),2000,2000);
> time C:=A*A;
Time: 1.560
- Install http://sage.math.washington.edu/home/malb/spkgs/linbox-1.1.6.p5.spkg
- Apply trac_4260-linbox_default.patch
- Apply trac_4260-dense_ctypes_template.patch
- Apply trac_4260-matrix-modn-docs.patch
- Apply trac_4260_more_doctests.patch
- Apply trac_4260_echelonformdomain.patch
- Apply trac_4260-minor_fixes.patch
- Apply trac_4260_bugfix.patch
Attachments (7)
Change History (67)
comment:1 Changed 13 years ago by
- Owner changed from was to cpernet
- Status changed from new to assigned
comment:2 Changed 10 years ago by
- Cc SimonKing added
- Report Upstream set to N/A
comment:3 Changed 10 years ago by
I finally rebased the patch from SD16. The template class in the patch contains the updates made to the modn_dense class since then, like changes to the sig_* functions. Apparently the modn_dense class representation now allows permuting the rows by changing pointers in the _matrix array. We can't allow that if we want to pass the _entries to linbox, so I skipped those changes.
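To illustrate why pointer-based row permutations conflict with handing the raw buffer to LinBox, here is a minimal Python sketch. It is not the Sage or LinBox code: `entries` and `row_start` are hypothetical stand-ins for the `_entries` buffer and the `_matrix` row-pointer array, and the 3 x 4 shape is chosen only for illustration.

```python
# Hypothetical stand-ins: a contiguous row-major buffer plus per-row offsets.
nrows, ncols = 3, 4
entries = list(range(nrows * ncols))            # plays the role of _entries
row_start = [i * ncols for i in range(nrows)]   # plays the role of _matrix


def get(i, j):
    """Read entry (i, j) through the row-offset table."""
    return entries[row_start[i] + j]


# "Swap" rows 0 and 2 by exchanging offsets only; the buffer is untouched.
row_start[0], row_start[2] = row_start[2], row_start[0]

# What code going through the offset table sees.
logical_view = [[get(i, j) for j in range(ncols)] for i in range(nrows)]

# What a library receiving only the raw buffer would see.
raw_view = [entries[i * ncols:(i + 1) * ncols] for i in range(nrows)]

print(logical_view[0])
print(raw_view[0])
```

After the swap the two views disagree, so a library that is handed only the contiguous buffer would operate on a different matrix than the one the wrapper believes it holds.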
Sage builds with the attached patches, and you can construct matrices. However, there are lots of bugs, some linbox wrappers are still stubs, etc. Expect crashes and wrong results.
With the patch applied, I get a crash with the following:
sage: a = matrix(GF(97),3,4,range(12))
sage: a.echelonize()
*** glibc detected *** python: free(): invalid next size (fast): 0x000000000270b370 ***
======= Backtrace: =========
<snip>
AFAICT, the new Cython code is an exact copy of the wrapper function in linbox-sage.C. Here is what valgrind says:
==3026== Invalid write of size 8
==3026==    at 0x39E49EF1: T.4552 (ffpack_ludivine.inl:420)
==3026==    by 0x39E49AA0: T.4552 (ffpack_ludivine.inl:486)
==3026==    by 0x39E4ABBF: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (ffpack.h:1132)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
<snip lots more Py* lines>
==3026== Address 0x6ca16e8 is 0 bytes after a block of size 24 alloc'd
==3026==    at 0x4C267CE: malloc (vg_replace_malloc.c:236)
==3026==    by 0x39E4AA5A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_20_echelonize_linbox(_object*, _object*) (memory.h:32)
==3026==    by 0x4E74082: PyObject_Call (abstract.c:2492)
==3026==    by 0x39E2CA8A: __pyx_pf_4sage_6matrix_24matrix_modn_dense_double_26Matrix_modn_dense_template_19echelonize(_object*, _object*, _object*) (matrix_modn_dense_double.cpp:8738)
==3026==    by 0x4F17FF9: PyEval_EvalFrameEx (ceval.c:3706)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F19DB1: PyEval_EvalCode (ceval.c:522)
==3026==    by 0x4F19083: PyEval_EvalFrameEx (ceval.c:4401)
==3026==    by 0x4F19CDC: PyEval_EvalCodeEx (ceval.c:2968)
==3026==    by 0x4F18074: PyEval_EvalFrameEx (ceval.c:3802)
<snip lots of Py* lines>
I'd appreciate any pointers about the problem above, though I don't know if I'll have the time to come back to this before the bug days in August (when I presume Martin will take over?).
comment:4 Changed 10 years ago by
These are the files in sage/matrix with failures:
sage -t devel/sage-main/sage/matrix/matrix_cyclo_dense.pyx # 22 doctests failed
sage -t devel/sage-main/sage/matrix/strassen.pyx # 2 doctests failed
sage -t devel/sage-main/sage/matrix/matrix0.pyx # 2 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_integer_dense.pyx # 5 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_space.py # 1 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_window_modn_dense.pyx # 1 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_modn_sparse.pyx # 1 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_integer_dense_saturation.py # 0 doctests failed
sage -t devel/sage-main/sage/matrix/matrix_rational_dense.pyx # 44 doctests failed
sage -t devel/sage-main/sage/matrix/matrix2.pyx # Time out
sage -t devel/sage-main/sage/matrix/matrix_modn_dense.pyx # Time out
sage -t devel/sage-main/sage/matrix/matrix_modn_dense_template.pxi # Time out
comment:5 Changed 10 years ago by
- Cc rbeezer added
- Description modified (diff)
I fixed a few issues and segfaults but the thing is far from done. However, one can probably do higher level stuff now, i.e. it shouldn't crash that much any more.
We need a new LinBox SPKG because Modular<float> didn't have a NonZeroRandIter, which is needed by the charpoly code. LinBox 1.1.7 fixes this issue, but I tried unsuccessfully to upgrade to 1.1.7 for like 10 hours (cf. #11718).
comment:6 Changed 10 years ago by
- Description modified (diff)
comment:7 Changed 10 years ago by
Doctest failures with most recent patch on sage.math:
sage -t -long -force_lib devel/sage/doc/de/tutorial/tour_advanced.rst # 2 doctests failed
sage -t -long -force_lib devel/sage/doc/en/tutorial/tour_advanced.rst # 2 doctests failed
sage -t -long -force_lib devel/sage/doc/en/bordeaux_2008/modular_forms_and_hecke_operators.rst # 1 doctests failed
sage -t -long -force_lib devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t -long -force_lib devel/sage/doc/fr/tutorial/tour_advanced.rst # 2 doctests failed
sage -t -long -force_lib devel/sage/doc/ru/tutorial/tour_advanced.rst # 2 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modsym/tests.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modsym/space.py # 18 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/tests.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/space.py # 8 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/cuspidal_submodule.py # 6 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modsym/ambient.py # 4 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/modform/element.py # 11 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t -long -force_lib devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t -long -force_lib devel/sage/sage/matrix/matrix_cyclo_dense.pyx # 8 doctests failed
sage -t -long -force_lib devel/sage/sage/matrix/matrix2.pyx # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/tests/cmdline.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
sage -t -long -force_lib devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed
comment:8 Changed 10 years ago by
- Description modified (diff)
I've fixed all the easy stuff which brings the doctest failures down to:
sage -t -long devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t -long devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t -long devel/sage/sage/modular/modsym/space.py # 18 doctests failed
sage -t -long devel/sage/sage/modular/modform/eisenstein_submodule.py # 3 doctests failed
sage -t -long devel/sage/sage/modular/modform/constructor.py # 3 doctests failed
sage -t -long devel/sage/sage/modular/modform/space.py # 8 doctests failed
sage -t -long devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t -long devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t -long devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t -long devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t -long devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t -long devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t -long devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t -long devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t -long devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t -long devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t -long devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t -long devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t -long devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
sage -t -long devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed
many of which seem to be caused by a small set of bugs.
comment:9 Changed 10 years ago by
Here's where we are at on sage.math:
sage -t devel/sage/doc/en/bordeaux_2008/elliptic_curves.rst # 4 doctests failed
sage -t devel/sage/sage/modular/modsym/subspace.py # 9 doctests failed
sage -t devel/sage/sage/modular/modsym/space.py # 12 doctests failed
sage -t devel/sage/sage/modular/modform/eisenstein_submodule.py # 1 doctests failed
sage -t devel/sage/sage/modular/modform/space.py # 7 doctests failed
sage -t devel/sage/sage/modular/modform/constructor.py # 1 doctests failed
sage -t devel/sage/sage/modular/modform/ambient.py # 4 doctests failed
sage -t devel/sage/sage/modular/hecke/element.py # 1 doctests failed
sage -t devel/sage/sage/modular/hecke/hecke_operator.py # 1 doctests failed
sage -t devel/sage/sage/modular/hecke/module.py # 3 doctests failed
sage -t devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
sage -t devel/sage/sage/modular/abvar/torsion_subgroup.py # 4 doctests failed
sage -t devel/sage/sage/modular/hecke/submodule.py # 3 doctests failed
sage -t devel/sage/sage/modular/abvar/abvar.py # 4 doctests failed
sage -t devel/sage/sage/structure/sage_object.pyx # 1 doctests failed
sage -t devel/sage/sage/combinat/symmetric_group_representations.py # 1 doctests failed
sage -t devel/sage/sage/schemes/elliptic_curves/padics.py # 29 doctests failed
sage -t devel/sage/sage/schemes/elliptic_curves/padic_lseries.py # 6 doctests failed
sage -t devel/sage/sage/schemes/elliptic_curves/ell_modular_symbols.py # 2 doctests failed
sage -t devel/sage/sage/schemes/generic/toric_chow_group.py # 16 doctests failed
sage -t devel/sage/sage/schemes/elliptic_curves/sha_tate.py # 10 doctests failed
sage -t devel/sage/sage/schemes/elliptic_curves/ell_rational_field.py # 1 doctests failed
comment:10 Changed 10 years ago by
- Work issues set to sd32
comment:11 Changed 10 years ago by
With the updated patch we are down to:
sage -t -long devel/sage/sage/modular/modsym/heilbronn.pyx # 2 doctests failed
sage -t -long devel/sage/sage/modular/abvar/homology.py # 3 doctests failed
However, there also seems to be a doctest failure in matrix2.pyx which is not that easily reproduced.
comment:12 Changed 10 years ago by
- Status changed from new to needs_review
Now all doctests should pass!
comment:13 Changed 10 years ago by
- Keywords sd32 added
- Work issues sd32 deleted
comment:14 Changed 10 years ago by
- Description modified (diff)
- Work issues set to extend documentation
comment:15 Changed 10 years ago by
I adapted the crossover from float to double. Around 2^{11}, Modular<float> is really slow because there are not enough bits left to let ATLAS do its magic, i.e., too many modular reductions. On my computer, using Modular<float> up to 2^{8} seems like a good choice. On sage.math this choice isn't too bad (but not optimal). The following lists i (1st column), a prime p smaller than 2^{i} (2nd column), and the time it takes to multiply two 1,000 x 1,000 matrices over GF(p) (3rd column):
 i        p     time
 2        3  0.22000
 3        7  0.24000
 4       13  0.24000
 5       31  0.25000
 6       61  0.26000
 7      127  0.26000
 8      251  0.62000
 9      509  0.38000 <=== using Modular<double> now
10     1021  0.38000
11     2039  0.39000
12     4093  0.39000
13     8191  0.40000
14    16381  0.41000
15    32749  0.41000
16    65521  0.42000
17   131071  0.43000
18   262139  0.43000
19   524287  0.44000
20  1048573  0.44000
21  2097143  0.45000
22  4194301  0.66000
23  8388593  1.91000 <=== Generic matrices
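The remark about "not enough bits left" can be made concrete with a rough Python sketch. It assumes the usual delayed-reduction model, in which products of residues are accumulated in floating point and reduced mod p only when the exact-integer range of the type would otherwise be exceeded; whether LinBox schedules its reductions exactly this way is an assumption, and `max_unreduced_terms` is a hypothetical helper, not a LinBox or Sage function.

```python
FLOAT_MANTISSA_BITS = 24    # IEEE 754 single precision: integers exact up to 2^24
DOUBLE_MANTISSA_BITS = 53   # IEEE 754 double precision: integers exact up to 2^53


def max_unreduced_terms(p, mantissa_bits):
    """How many products of residues in [0, p-1] can be summed exactly
    before a reduction mod p is required (hypothetical helper)."""
    largest_product = (p - 1) ** 2
    return (2 ** mantissa_bits) // largest_product


# Primes taken from the table above.
for p in (127, 251, 509, 2039):
    as_float = max_unreduced_terms(p, FLOAT_MANTISSA_BITS)
    as_double = max_unreduced_terms(p, DOUBLE_MANTISSA_BITS)
    print(p, as_float, as_double)
```

Under this model, a float accumulator for p = 2039 has to be reduced after only a handful of products, while a double accumulator can absorb far more terms than a 1,000-term dot product needs. That is consistent with the crossover reported above, although the measured timings also depend on the BLAS implementation and the machine.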
comment:16 Changed 10 years ago by
I found that the time for computing echelon form became worse:
sage-4.6.2
sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.17 s, sys: 0.03 s, total: 0.20 s
Wall time: 0.20 s
sage: B = MS.random_element()
sage: %time C = A*B
CPU times: user 8.35 s, sys: 0.07 s, total: 8.42 s
Wall time: 8.45 s
sage: %time A.echelonize()
CPU times: user 1.22 s, sys: 0.06 s, total: 1.28 s
Wall time: 1.38 s
sage-4.7.2.alpha2 with the patches and spkg from here:
sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.19 s, sys: 0.03 s, total: 0.22 s
Wall time: 0.22 s
sage: B = MS.random_element()
sage: %time C = A*B
CPU times: user 1.16 s, sys: 0.02 s, total: 1.17 s
Wall time: 1.22 s
sage: %time A.echelonize()
CPU times: user 1.87 s, sys: 0.00 s, total: 1.87 s
Wall time: 1.92 s
Changed 10 years ago by
Changed 10 years ago by
comment:17 follow-up: ↓ 19 Changed 10 years ago by
- Description modified (diff)
- renamed Rob's patch to fix ticket number
- (hopefully) added doctests for every single function
- Simon, can you try again after setting MAX_MODULUS in sage.matrix.matrix_modn_dense_float to 2^{6}? This forces the use of doubles for GF(101) which might be more efficient. Also, how fast is A.echelonize('gauss') for you on that benchmark?
comment:18 Changed 10 years ago by
- Description modified (diff)
comment:19 in reply to: ↑ 17 Changed 10 years ago by
Replying to malb:
- Simon, can you try again after setting MAX_MODULUS in sage.matrix.matrix_modn_dense_float to 2^{6}? This forces the use of doubles for GF(101) which might be more efficient.
It isn't:
sage: sage.matrix.matrix_modn_dense_float.MAX_MODULUS = 2^6
sage: MS = MatrixSpace(GF(101),2000,2000)
sage: %time A = MS.random_element()
CPU times: user 0.21 s, sys: 0.01 s, total: 0.22 s
Wall time: 0.22 s
sage: B = MS.random_element()
sage: %time C = A*B
CPU times: user 1.88 s, sys: 0.04 s, total: 1.92 s
Wall time: 1.93 s
sage: %time A.echelonize()
CPU times: user 2.65 s, sys: 0.00 s, total: 2.65 s
Wall time: 2.69 s
sage: type(A)
<type 'sage.matrix.matrix_modn_dense_double.Matrix_modn_dense_double'>
Also, how fast is A.echelonize('gauss') for you on that benchmark?
You mean "how slow", I suppose:
sage: A = MS.random_element()
sage: %time A.echelonize('gauss')
CPU times: user 41.53 s, sys: 0.10 s, total: 41.63 s
Wall time: 41.75 s
comment:20 Changed 10 years ago by
- Work issues changed from extend documentation to improve echelonize
comment:21 Changed 10 years ago by
A word about the regression (I'm copying my reply to malb on linbox-devel):
The new code (that I wrote)
size_t r = FFPACK::ReducedRowEchelonForm(F, nrows, ncols, matrix, ncols, P,Q);
calls the actual RowEchelon elimination in FFPACK, which transforms A into its reduced row echelon form E and the transformation matrix U (both matrices being magically stored in place in A).
It is slower than the older code in sage-4.6 using linbox-1.1.6:
int rank = EF.rowReducedEchelon(E, A);
The latter computes the reduced row echelon form (actually the transpose of the reduced column echelon form), but no transformation matrix. This saves roughly 50% of the total number of arithmetic ops (n^{3} rather than 2n^{3}), and explains the regression.
Switching back to the old way should fix the regression (for a quick fix). And I still need to add the feature of not computing the transform at the level of FFPACK, since I expect some timing improvements over the old version in linbox 1.1.6.
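As a rough check of the operation counts quoted above, the following Python sketch compares the leading-order estimates (n^3 without the transformation matrix, 2n^3 with it) against the echelonize timings reported earlier in this ticket for a 2000 x 2000 matrix over GF(101). The function `echelon_ops` is a hypothetical helper that only encodes those two leading terms; it ignores lower-order terms and constant factors.

```python
def echelon_ops(n, with_transform):
    """Leading-order arithmetic operation count for an n x n echelon form
    (hypothetical helper based on the counts quoted in the comment)."""
    return (2 if with_transform else 1) * n ** 3


n = 2000
old_ops = echelon_ops(n, with_transform=False)   # linbox-1.1.6 interface
new_ops = echelon_ops(n, with_transform=True)    # FFPACK::ReducedRowEchelonForm

# Timings reported in this ticket (user CPU seconds, different Sage versions).
old_time = 1.22   # sage-4.6.2
new_time = 1.87   # sage-4.7.2.alpha2 with the patches

predicted_ratio = new_ops / old_ops
measured_ratio = new_time / old_time

print(predicted_ratio)
print(round(measured_ratio, 2))
```

The measured slowdown is smaller than the factor of two the leading terms predict, which is in line with the expectation stated above that the newer elimination is itself somewhat faster once the transformation matrix is left out.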
comment:22 Changed 10 years ago by
Hi Clement,
thanks for explaining. I always forget about the transformation matrix (e.g., that M4RI is in fact even faster compared to Magma than previously thought, because we always compute the transformation matrix and yet we are faster :)). I'll try to "switch back". Btw., is there a way to construct the right matrices without copying?
PS: I didn't get your reply on [linbox-devel] btw.
comment:23 Changed 10 years ago by
- Description modified (diff)
The additional patch makes us use the old EchelonFormDomain interface, which is twice as fast (as Clément explained). Simon, can you review this ticket?
Changed 10 years ago by
comment:24 Changed 10 years ago by
- Keywords sd34 added
- Reviewers set to Burcin Erocal
attachment:trac_4260-minor_fixes.patch
- makes some cosmetic changes and
- fixes a possible memory leak if allocation of these matrices fails.
I read through the patches and the resulting code. All looks good to me. Please switch this to positive review if my patch is ok.
Thanks everyone for finally finishing this off.
comment:25 Changed 10 years ago by
- Reviewers changed from Burcin Erocal to Burcin Erocal, Simon King, Martin Albrecht
- Status changed from needs_review to positive_review
Burcin's patch looks good. Thus, giving it a positive review. I'm also running doctests again against 4.7.2.alpha3.
comment:26 Changed 10 years ago by
- Description modified (diff)
comment:27 Changed 10 years ago by
Doctests indeed pass on sage.math.
comment:28 Changed 10 years ago by
- Milestone changed from sage-4.7.2 to sage-4.7.3
- Work issues improve echelonize deleted
comment:29 Changed 10 years ago by
- Status changed from positive_review to needs_work
- Work issues set to cleanup spkg
The spkg could do with some cleanup:
- What is the purpose of spkg-rebuild? If it is not used, remove it. If it is used, document it.
- spkg-debian and the dist directory should be removed. They are leftovers for Debian, but these are now being removed from every spkg.
- Why is "linbox" in .hgignore?
- The file patches/commentator.C lacks a corresponding .patch file.
Optional:
- Make spkg-install use patch for patching.
comment:30 Changed 10 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
What is the purpose of spkg-rebuild? If it is not used, remove it. If it is used, document it.
removed
spkg-debian and the dist directory should be removed. They are leftovers for Debian, but these are now being removed from every spkg.
removed
Why is "linbox" in .hgignore?
removed
The file patches/commentator.C lacks a corresponding .patch file.
added
Make spkg-install use patch for patching.
left for another time
comment:31 Changed 10 years ago by
- Merged in set to sage-4.7.3.alpha0
- Resolution set to fixed
- Status changed from needs_review to closed
- Work issues cleanup spkg deleted
comment:32 Changed 10 years ago by
- Merged in sage-4.7.3.alpha0 deleted
- Resolution fixed deleted
- Status changed from closed to new
Unfortunately, there are failures on OS X 10.4 PPC, all in the file sage/matrix/matrix_modn_dense_double.pyx:
sage -t -long -force_lib devel/sage/sage/matrix/matrix_modn_dense_double.pyx
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 71:
    sage: A[0,0] = 220082r; A
Expected:
    [ 220082 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
Got:
    [      0 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 76:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 78:
    sage: ~a
Exception raised:
    Traceback (most recent call last):
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_2[5]>", line 1, in <module>
        ~a###line 78:
    sage: ~a
      File "integer_mod.pyx", line 3240, in sage.rings.finite_rings.integer_mod.IntegerMod_int64.__invert__ (sage/rings/finite_rings/integer_mod.c:25534)
      File "integer_mod.pyx", line 3371, in sage.rings.finite_rings.integer_mod.mod_inverse_int64 (sage/rings/finite_rings/integer_mod.c:26331)
    ZeroDivisionError: Inverse does not exist.
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 86:
    sage: A[0,0] = 220082r; A
Expected:
    [ 220082 1237101 2033003 3788106]
    [4649912 1157595 4928315 4382585]
    [4252686  978867 2601478 1759921]
    [1303120 1860486 3405811 2203284]
Got:
    [      0 1237101 2033003 3788106]
    [4649912 1157595 4928315 4382585]
    [4252686  978867 2601478 1759921]
    [1303120 1860486 3405811 2203284]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 91:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 93:
    sage: a*a
Expected:
    4777936
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 112:
    sage: A[0,0] = K(220082); A
Expected:
    [ 220082 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
Got:
    [      0 2824836  765701 2282256]
    [1795330  767112 2967421 1373921]
    [2757699 1142917 2720973 2877160]
    [1674049 1341486 2641133 2173280]
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 117:
    sage: a = A[0,0]; a
Expected:
    220082
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 119:
    sage: ~a
Exception raised:
    Traceback (most recent call last):
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/Users/jdemeyer/sage-4.7.3.alpha0/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_3[6]>", line 1, in <module>
        ~a###line 119:
    sage: ~a
      File "integer_mod.pyx", line 3240, in sage.rings.finite_rings.integer_mod.IntegerMod_int64.__invert__ (sage/rings/finite_rings/integer_mod.c:25534)
      File "integer_mod.pyx", line 3371, in sage.rings.finite_rings.integer_mod.mod_inverse_int64 (sage/rings/finite_rings/integer_mod.c:26331)
    ZeroDivisionError: Inverse does not exist.
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 129:
    sage: a = A[0,0]; a
Expected:
    220081
Got:
    0
**********************************************************************
File "/Users/jdemeyer/sage-4.7.3.alpha0/devel/sage-main/sage/matrix/matrix_modn_dense_double.pyx", line 131:
    sage: a*a
Expected:
    4337773
Got:
    0
**********************************************************************
comment:33 Changed 10 years ago by
- Status changed from new to needs_review
comment:34 Changed 10 years ago by
- Status changed from needs_review to needs_work
comment:35 follow-ups: ↓ 36 ↓ 39 Changed 10 years ago by
If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?
comment:36 in reply to: ↑ 35 Changed 10 years ago by
Replying to malb:
If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?
The machine that I'm using for this testing is not mine, but I can ask the owner. I guess it will be okay for him to make an account for you. He is on holidays now (I think up to sunday), so it will take a few days.
comment:37 Changed 10 years ago by
Thanks.
comment:39 in reply to: ↑ 35 Changed 10 years ago by
- Milestone set to sage-4.8
Replying to malb:
If I understand the errors correctly it's only setting elements which goes wrong, so I assume some cast doesn't work on PPC OSX 10.4. Can I get access to such a machine somehow?
Yes, see private email.
Changed 10 years ago by
comment:40 Changed 10 years ago by
- Description modified (diff)
- Status changed from needs_work to needs_review
The attached patch fixes the issue on the machine in question. We forgot to deal with 32-bit systems in setting elements while we did it for getting elements.
comment:41 Changed 10 years ago by
Why do you write
ceil(sqrt(2^31-1)) < 2^23
It is a true statement, but where does the "23" come from? You could write
ceil(sqrt(2^31-1)) = 46341
comment:42 Changed 10 years ago by
2^{23} is the maximum modulus of LinBox's double-based matrix representation, and writing 2^{23} is easier to read than 46341.
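The arithmetic behind the inequality under discussion can be verified with a few lines of Python. The interpretation in the comments, that any product of two residues below this bound still fits into a signed 32-bit integer, is one reading of the patch and is not spelled out in the ticket; `math.isqrt` requires Python 3.8 or later.

```python
import math

INT32_MAX = 2**31 - 1

# Integer square root (floor), then round up unless the root is exact.
floor_root = math.isqrt(INT32_MAX)
ceil_root = floor_root if floor_root * floor_root == INT32_MAX else floor_root + 1

print(ceil_root)            # the explicit value suggested in the review
print(ceil_root < 2**23)    # the inequality as written in the patch

# Assumed motivation: residues are at most modulus - 1, so for a modulus up
# to ceil_root their product cannot overflow a signed 32-bit integer.
largest_residue = ceil_root - 1
print(largest_residue * largest_residue <= INT32_MAX)
```

Both forms of the statement are therefore correct; they differ only in whether the bound is shown as an explicit value or compared against the 2^{23} limit named in the reply.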
comment:43 Changed 10 years ago by
- Merged in set to sage-4.8.alpha2
- Resolution set to fixed
- Reviewers changed from Burcin Erocal, Simon King, Martin Albrecht to Burcin Erocal, Simon King, Martin Albrecht, Jeroen Demeyer
- Status changed from needs_review to closed
Works on OS X 10.4 PPC, so positive review.
comment:44 Changed 10 years ago by
This crashes all over the place on OpenSolaris 06.2009-32 (hawk). For example:
sage -t -long -force_lib devel/sage/sage/rings/qqbar.py
**********************************************************************
File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage-main/sage/rings/qqbar.py", line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
Exception raised:
    Traceback (most recent call last):
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_0[74]>", line 1, in <module>
        r.imag().minpoly() # this takes a long time (143s on my laptop)###line 241:
    sage: r.imag().minpoly() # this takes a long time (143s on my laptop)
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 2873, in minpoly
        self._minimal_polynomial = self._descr.minpoly()
      File "/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/local/lib/python/site-packages/sage/rings/qqbar.py", line 5406, in minpoly
        self._minpoly = self._value.minpoly()
      File "number_field_element.pyx", line 3495, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.minpoly (sage/rings/number_field/number_field_element.cpp:21939)
      File "number_field_element.pyx", line 3462, in sage.rings.number_field.number_field_element.NumberFieldElement_absolute.charpoly (sage/rings/number_field/number_field_element.cpp:21816)
      File "matrix_rational_dense.pyx", line 936, in sage.matrix.matrix_rational_dense.Matrix_rational_dense.charpoly (sage/matrix/matrix_rational_dense.c:10895)
      File "matrix_integer_dense.pyx", line 1017, in sage.matrix.matrix_integer_dense.Matrix_integer_dense.charpoly (sage/matrix/matrix_integer_dense.c:10961)
      File "matrix_integer_dense.pyx", line 1074, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._charpoly_linbox (sage/matrix/matrix_integer_dense.c:11601)
      File "matrix_integer_dense.pyx", line 1096, in sage.matrix.matrix_integer_dense.Matrix_integer_dense._poly_linbox (sage/matrix/matrix_integer_dense.c:11869)
    RuntimeError: Segmentation fault
**********************************************************************
There are many more like this.
comment:45 follow-up: ↓ 48 Changed 10 years ago by
Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?
comment:46 Changed 10 years ago by
I still want to investigate some more, for example I have not checked that it is really this ticket which causes the problems (but you do see "linbox" appearing in the backtrace).
Strangely, even building the documentation crashes:
sphinx-build -b html -d /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/doctrees/en/reference -A hide_pdf_links=1 /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/en/reference /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
Running Sphinx v1.1.2
loading pickled environment... not yet created
building [html]: targets for 935 source files that are out of date
updating environment: 935 added, 0 changed, 0 removed
reading sources... [ 0%] algebras
reading sources... [ 0%] arithgroup
reading sources... [ 0%] calculus
reading sources... [ 0%] categories
reading sources... [ 0%] cmd
reading sources... [ 0%] coding
reading sources... [ 0%] coercion
reading sources... [ 0%] combinat/algebra
reading sources... [ 0%] combinat/crystals
[...]
writing additional files... genindex py-modindex search
copying images... [ 16%] sage/graphs/../../media/heawood-graph-latex.png
copying images... [ 33%] sage/homology/../../media/homology/simplices.png
copying images... [ 50%] sage/homology/../../media/homology/torus.png
copying images... [ 66%] sage/homology/../../media/homology/klein.png
copying images... [ 83%] sage/homology/../../media/homology/rp2.png
copying images... [100%] sage/homology/../../media/homology/torus_labelled.png
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.
------------------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug in it and is not properly wrapped with sig_on(), sig_off(). You might want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
Build finished.
The built documents can be found in /export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2/devel/sage/doc/output/html/en/reference
comment:47 Changed 10 years ago by
- Merged in sage-4.8.alpha2 deleted
- Resolution fixed deleted
- Status changed from closed to new
comment:48 in reply to: ↑ 45 Changed 10 years ago by
- Cc drkirkby added
Replying to malb:
Mhh, the trouble is in Matrix_integer_dense, which isn't what this ticket is about, so that's curious. How do I log into hawk?
Hawk is David Kirkby's machine, so you should ask him.
comment:49 Changed 10 years ago by
Okay, I managed to build 4.8-alpha2 + this ticket on hawk. Just starting and stopping Sage gives:
#0 0xfec9c7fb in _free_unlocked () from /lib/libc.so.1
#1 0xfec9c7af in free () from /lib/libc.so.1
#2 0xfdb81d01 in operator delete (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_op.cc:44
#3 0xfdb81d5d in operator delete[] (ptr=0x8) at ../../../../gcc-4.5.0/libstdc++-v3/libsupc++/del_opv.cc:32
#4 0xfdb72543 in ~ios_base (this=0xf9a3e704) at ../../../../gcc-4.5.0/libstdc++-v3/src/ios.cc:93
#5 0xf9892891 in __static_initialization_and_destruction_0 (__initialize_p=<value optimized out>) at /usr/local/gcc-4.5.0/lib/gcc/i386-pc-solaris2.10/4.5.0/../../../../include/c++/4.5.0/bits/basic_ios.h:272
#6 0xf988e3b0 in __do_global_dtors_aux () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#7 0xf99f7835 in _fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
#8 0xfefd15fe in call_fini () from /usr/lib/ld.so.1
#9 0xfefd17b3 in atexit_fini () from /usr/lib/ld.so.1
#10 0xfec8370c in _exithandle () from /lib/libc.so.1
#11 0xfec73f52 in exit () from /lib/libc.so.1
#12 0xfeef3232 in Py_Exit (sts=0) at Python/pythonrun.c:1716
#13 0xfeef3357 in handle_system_exit () at Python/pythonrun.c:1116
#14 0x00000000 in ?? ()
So it tries to clean up LinBox at the end, and that's when things go wrong:
_fini () from /export/home/martina/sage-4.8.alpha2/local/lib//liblinboxsage.so.0
Any ideas why?
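A crash that only shows up when the process exits is easy to miss, so it can help to script the "start and stop" check. The following is a minimal sketch, not part of the ticket: `describe_exit` is a hypothetical helper, and the commented `sage -c` invocation assumes `sage` is on `PATH`. Sage installs its own SIGSEGV handler, so the crash may surface as a plain non-zero exit code rather than as a signal.

```python
import signal
import subprocess
import sys


def describe_exit(cmd):
    """Run cmd to completion and describe how the process ended (POSIX)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    code = proc.returncode
    if code == 0:
        return "clean exit"
    if code < 0:
        # On POSIX a negative return code means "killed by signal -code".
        return "killed by " + signal.Signals(-code).name
    return "exit code %d" % code


# Hypothetical usage (assumes `sage` is on PATH):
#     describe_exit(["sage", "-c", "pass"])
```

Anything other than "clean exit" on a freshly built tree would point at a teardown problem such as the `_fini` frame in the backtrace above.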
comment:50 follow-up: ↓ 51 Changed 10 years ago by
Weird, I rebuilt everything from scratch using these environment variables
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
MAKE=make -j4
and now
All tests passed! Total time for all tests: 1742.8 seconds
i.e., the segfault is gone. How does the buildbot build Sage?
comment:51 in reply to: ↑ 50 Changed 10 years ago by
Replying to malb:
i.e., the segfault is gone. How does the buildbot build Sage?
EDITOR=emacs
HISTCONTROL=ignoreboth
HISTSIZE=2000
HOME=/export/home/buildbot
IGNOREEOF=100
LANG=C
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
LESS=iMqR
LESSHISTFILE=-
LOGNAME=buildbot
MAIL=/var/mail/buildbot
MAKE=make -j12
MAKEOPTS=-j12
PAGER=/usr/bin/less
PATH=/export/home/buildbot/bin:/export/home/buildbot/local/hawk/bin:/usr/local/bins-for-sage:/usr/local/gcc-4.5.0/bin:/usr/local/bin:/usr/local/texlive/2010/bin/i386-solaris/:/usr/bin:/usr/sbin
PWD=/export/home/buildbot/build/sage/hawk-1/hawk_full/build/sage-4.8.alpha2
SAGE_ATLAS_LIB=/ATLAS32
SAGE_FORTRAN=/usr/local/gcc-4.5.0/bin/gfortran
SAGE_FORTRAN_LIB=/usr/local/gcc-4.5.0/lib/libgfortran.so
SAGE_PARALLEL_SPKG_BUILD=yes
SAGE_PORT=true
SHELL=/bin/bash
SHLVL=1
SSH_CLIENT=128.208.160.197 44994 22
SSH_CONNECTION=128.208.160.197 44994 192.168.1.191 22
SSH_TTY=/dev/pts/2
TERM=screen
TZ=Europe/London
USER=buildbot
VIRTUAL_ENV=/export/home/buildbot/local/hawk
VIRTUAL_ENV_DISABLE_PROMPT=yes
VISUAL=emacs
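One way to compare this environment with the one that built cleanly is to diff the search-path variables entry by entry. The sketch below is only an illustration: `parse_env` and `path_entries_only_in` are hypothetical helpers, and only the `PATH` and `LD_LIBRARY_PATH` values quoted in this ticket are used.

```python
def parse_env(text):
    """Parse KEY=VALUE lines (one per line) into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip()
    return env


def path_entries_only_in(env_a, env_b, var):
    """Entries of the colon-separated variable var in env_a but not env_b."""
    a = [p.rstrip("/") for p in env_a.get(var, "").split(":") if p]
    b = {p.rstrip("/") for p in env_b.get(var, "").split(":") if p}
    return [p for p in a if p not in b]


# Values copied from the two environments quoted in this ticket.
working = parse_env("""\
LD_LIBRARY_PATH=/usr/local/lib
PATH=/usr/local/bins-for-sage:/usr/local/bin:/usr/bin:/bin
""")
buildbot = parse_env("""\
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/export/home/buildbot/bin:/export/home/buildbot/local/hawk/bin:/usr/local/bins-for-sage:/usr/local/gcc-4.5.0/bin:/usr/local/bin:/usr/local/texlive/2010/bin/i386-solaris/:/usr/bin:/usr/sbin
""")

for var in ("PATH", "LD_LIBRARY_PATH"):
    print(var, "only in buildbot env:", path_entries_only_in(buildbot, working, var))
```

Entries that appear in only one environment, in particular compiler and runtime library directories, are natural first suspects.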
comment:52 follow-up: ↓ 53 Changed 10 years ago by
Perhaps it's a GCC 4.5.0 issue?
comment:53 in reply to: ↑ 52 Changed 10 years ago by
Replying to malb:
Perhaps it's a GCC 4.5.0 issue?
Could very well be.
What does your gcc --version say (the gcc you used to compile LinBox successfully)?
comment:54 Changed 10 years ago by
$ gcc --version
gcc (GCC) 4.4.3 20100112 (prerelease)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
comment:55 follow-up: ↓ 56 Changed 10 years ago by
I can confirm that this bug is at least triggered by GCC 4.5.
Here are the relevant bits of the environment that I used to build Sage + this ticket just now:
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.5.0/lib:/usr/local/gcc-4.5.0/lib/amd64
PATH=/usr/local/gcc-4.5.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8
This build crashes with a SIGSEGV, whereas the environment posted above in comment:50 doesn't.
I am not sure what to do about this. Ask Dave to install a newer GCC to test whether it fails with it as well?
comment:56 in reply to: ↑ 55 Changed 10 years ago by
Replying to malb:
I am not sure what to do about this. Ask Dave to install a newer GCC to test whether it fails with it as well?
That's not a bad suggestion: ask him to install GCC 4.5.3, for example (the latest in the 4.5 series).
comment:57 Changed 10 years ago by
I conclude it's a compiler bug: I just built with:
SAGE_PARALLEL_SPKG_BUILD=yes
LD_LIBRARY_PATH=/usr/local/gcc-4.6.0/lib:/usr/local/gcc-4.6.0/lib/amd64
PATH=/usr/local/gcc-4.6.0/bin:/usr/local/bins-for-sage/:/usr/local/bin:/usr/bin:/usr/sbin
MAKE=make -j8
and
All tests passed! Total time for all tests: 1786.1 seconds
I suggest avoiding GCC 4.5.0 (at least on OpenSolaris) and changing the status of this ticket back to positive review.
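A build script could flag the affected compiler before starting. This is only a sketch under assumptions: `parse_gcc_version`, `is_suspect`, and `SUSPECT_VERSIONS` are hypothetical names, the banner is assumed to look like the `gcc (GCC) X.Y.Z` line quoted in comment:54 (distribution builds of GCC often print a different banner), and only 4.5.0 is listed because it is the only version seen to fail here.

```python
import re

# Versions observed to trigger the crash in this ticket (assumption: only 4.5.0).
SUSPECT_VERSIONS = {(4, 5, 0)}


def parse_gcc_version(banner):
    """Extract (major, minor, patch) from a 'gcc (GCC) X.Y.Z ...' banner."""
    match = re.search(r"\(GCC\)\s+(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None  # Unrecognised banner format.
    return tuple(int(part) for part in match.groups())


def is_suspect(banner):
    """True if the banner names a GCC version listed in SUSPECT_VERSIONS."""
    return parse_gcc_version(banner) in SUSPECT_VERSIONS
```

The banner text would come from running `gcc --version` and passing its first line to `is_suspect`.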
comment:58 Changed 10 years ago by
- Status changed from new to needs_review
comment:59 Changed 10 years ago by
- Status changed from needs_review to positive_review
comment:60 Changed 10 years ago by
- Merged in set to sage-4.8.alpha3
- Resolution set to fixed
- Status changed from positive_review to closed
Testing again on hawk...
I will work on it as a coding sprint at SD10.