Opened 3 years ago
Closed 3 years ago
#25476 closed defect (fixed)
MeatAxerelated bug introduced in 8.3.beta
Reported by:  SimonKing  Owned by:  

Priority:  blocker  Milestone:  sage8.3 
Component:  packages: optional  Keywords:  meataxe rescale_row 
Cc:  vdelecroix, tscrim  Merged in:  
Authors:  Travis Scrimshaw, Simon King  Reviewers:  Simon King, Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  e2f0550 (Commits, GitHub, GitLab)  Commit:  e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d 
Dependencies:  Stopgaps: 
Description
Vincent reports this for SageMath 8.3.beta3 with MeatAxe installed:
File "src/sage/coding/linear_code.py", line 2039, in sage.coding.linear_code.AbstractLinearCode.__getitem__ Failed example: C = random_matrix(GF(25,'a'), 2, 7).row_space() Exception raised: Traceback (most recent call last): File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 572, in _run self.compile_and_execute(example, compiler, test.globs) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 982, in compile_and_execute exec(compiled, globs) File "<doctest sage.coding.linear_code.AbstractLinearCode.__getitem__[4]>", line 1, in <module> C = random_matrix(GF(Integer(25),'a'), Integer(2), Integer(7)).row_space() File "sage/matrix/matrix2.pyx", line 4592, in sage.matrix.matrix2.Matrix.row_space (build/cythonized/sage/matrix/matrix2.c:34495) return self.row_module(base_ring=base_ring) File "sage/matrix/matrix2.pyx", line 4565, in sage.matrix.matrix2.Matrix.row_module (build/cythonized/sage/matrix/matrix2.c:34373) return M.span(self.rows(), already_echelonized=False) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/modules/free_module.py", line 3850, in span self.ambient_module(), gens=gens, check=check, already_echelonized=already_echelonized) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/modules/free_module.py", line 6961, in __init__ echelonize=not already_echelonized, already_echelonized=already_echelonized) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/modules/free_module.py", line 6763, in __init__ echelonized_basis=echelonized_basis, already_echelonized=already_echelonized) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/modules/free_module.py", line 5691, in __init__ basis = self._echelonized_basis(ambient, basis) File "/home/vdelecro/sage_patchbot/local/lib/python2.7/sitepackages/sage/modules/free_module.py", line 6894, in _echelonized_basis E = A.echelon_form() File "sage/matrix/matrix2.pyx", line 6819, in sage.matrix.matrix2.Matrix.echelon_form (build/cythonized/sage/matrix/matrix2.c:53810) v = E.echelonize(cutoff=cutoff, **kwds) File "sage/matrix/matrix2.pyx", line 6719, in sage.matrix.matrix2.Matrix.echelonize (build/cythonized/sage/matrix/matrix2.c:53210) self._echelon_in_place(algorithm) File "sage/matrix/matrix2.pyx", line 6984, in sage.matrix.matrix2.Matrix._echelon_in_place (build/cythonized/sage/matrix/matrix2.c:54847) A.rescale_row(r, a_inverse, c) File "sage/matrix/matrix0.pyx", line 3005, in sage.matrix.matrix0.Matrix.rescale_row (build/cythonized/sage/matrix/matrix0.c:21880) self.rescale_row_c(i, s, start_col) File "sage/matrix/matrix_gfpn_dense.pyx", line 944, in sage.matrix.matrix_gfpn_dense.Matrix_gfpn_dense.rescale_row_c (build/cythonized/sage/matrix/matrix_gfpn_dense.c:9339) raise ValueError("We can only rescale a full row of a nonempty matrix") ValueError: We can only rescale a full row of a nonempty matrix **********************************************************************
The above error does not occur in 8.2.rc4. Let's try to prevent it from being published in 8.3!
Change History (41)
comment:1 Changed 3 years ago by
comment:3 Changed 3 years ago by
 Priority changed from critical to blocker
So this seems to be caused by #17272, but I do not understand why. It is the exact same code for the classical echelon algorithm. The only other difference is that has become a cpdef
method instead of a def
method, but changing that didn't seem to fix the problem. At least, where the error is happening, it means we would have just inverted a 0 if the row is all 0. So perhaps this is a bug in Cython?
Also, failing tests with optional packages is a blocker.
comment:4 Changed 3 years ago by
PS  I think the machine I tested #17272 on did not have meataxe
installed (unlike my usual devbox), but I thought it did...(cannot check it for the next ~2 months).
comment:5 Changed 3 years ago by
I really have no idea what is going on. It does work before #17272, but fails after. Changing both the cpdef
> def
doesn't fix it. The algorithm is exactly the same and even reverting that part of the code does not fix it.
comment:6 Changed 3 years ago by
While trying to build the latest beta, cddlib fails. Very bad. So, I still cannot test stuff.
comment:7 Changed 3 years ago by
Fortunately retrying make cddlib
worked.
comment:8 Changed 3 years ago by
Now I completed building the latest beta and can confirm the bug.
comment:9 Changed 3 years ago by
PS: It is not the creation of the matrix but calling .row_space()
.
comment:10 Changed 3 years ago by
OK, the problem is that the algorithm has been changed so that only a part of a row was rescaled. That's quite reasonable to do, and actually one of the changes of SharedMeatAxe compared to MeatAxe is that in the school book gaußian elimination implementation I made it so that the already annulated part of a row is not rescaled.
Only in the wrapper in Sage I did not expose that functionality. I guess that needs to change.
comment:11 Changed 3 years ago by
That part of Gaußian elimination was not changed by #17272.
Ah, I found the problem: the method was renamed _echelon_in_place_classical
> _echelon_in_place
. So the generic matrix code is no longer calling the specialized version.
comment:12 Changed 3 years ago by
I will push a fix in 1 second.
comment:13 Changed 3 years ago by
 Branch set to public/linear_algebras/fix_meataxe_echelon25476
 Commit set to 4eae803913d67d3f46f58e56d3524c9b4f73b002
 Status changed from new to needs_review
Okay, so this fixes the problem. It has a bit of technical debt as it pushes a potentially bigger issue design issue down the road, but it works. I also do similar fixes for other matrices that may not be using their optimized code. A bigger design discussion on matrices is likely needed.
New commits:
4eae803  Making matrices use the new _echelon_in_place method.

comment:14 Changed 3 years ago by
So, you think it won't be needed to allow !Matrix_gfpn_dense to perform a partial row rescaling? Would it nonetheless be a good idea to implement?
comment:15 followup: ↓ 16 Changed 3 years ago by
I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.
comment:16 in reply to: ↑ 15 ; followup: ↓ 17 Changed 3 years ago by
Replying to tscrim:
I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.
Then I propose to provide an enhancement on top of your changes. I just checked: With your branch the tests in src/sage/matrix pass. And then I hope we can review each other's commits.
comment:17 in reply to: ↑ 16 Changed 3 years ago by
Replying to SimonKing:
Replying to tscrim:
I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.
Then I propose to provide an enhancement on top of your changes. I just checked: With your branch the tests in src/sage/matrix pass. And then I hope we can review each other's commits.
In principle, I would fine with it. However, given this is a blocker issue, I think this would be better to do so on a followup (which I will happily review).
comment:18 followup: ↓ 19 Changed 3 years ago by
So far I got a timeout in sage t src/sage/rings/padics/padic_lattice_element.py
.
comment:19 in reply to: ↑ 18 Changed 3 years ago by
Replying to SimonKing:
So far I got a timeout in
sage t src/sage/rings/padics/padic_lattice_element.py
.
That has been noticed by a few other people.
comment:20 Changed 3 years ago by
sage t src/sage/rings/padics/padic_lattice_element.py
and sage t src/sage/rings/padics/padic_base_leaves.py
time out consistently, even when tested separately.
I cannot imagine that it is related with the issue we are dealing with here  hence, I'll try if these two tests fail without the patch from here.
comment:21 Changed 3 years ago by
Yep, I get the same timeouts. So, I believe the timeouts are not related with this ticket and thus shouldn't prevent a review.
The code looks good. But shouldn't there be a test that demonstrates that the bug is fixed? Or are the existing tests good enough?
comment:22 Changed 3 years ago by
Tests pass for me on OS X with this branch, on a system with MeatAxe installed. I haven't tried without MeatAxe present.
comment:23 followup: ↓ 24 Changed 3 years ago by
I would say the existing tests are good enough since they were breaking before this change.
comment:24 in reply to: ↑ 23 Changed 3 years ago by
 Reviewers set to Simon King
 Status changed from needs_review to positive_review
Replying to tscrim:
I would say the existing tests are good enough since they were breaking before this change.
In this case I can give a positive review.
comment:25 followup: ↓ 26 Changed 3 years ago by
 Status changed from positive_review to needs_work
See patchbot
comment:26 in reply to: ↑ 25 Changed 3 years ago by
Replying to vbraun:
See patchbot
The patchbot complains that there is a nonascii character. But I cannot identify it. So, what is it?
The patchbot complains about pyflakes. But the error is "NameError?: name 'isPythonFile' is not defined". Doesn't that mean there is a bug in the plugin, not in the changeset?
The patchbot complains about startup time. I don't know how to interpret the numbers, so I don't know how much of a problem we have here.
The test error is reproducible as follows:
sage: K.<x> = GF(25) sage: M = MatrixSpace(K,9,3,implementation='generic')() sage: entries = [((0, 2), x), ((0, 4), 3*x + 2), ((0, 8), 2*x), ((1, 1), x + 3), ((1, 5), 3*x), ((1, 6), x + 4), ((2, 0), 2*x), ((2, 5), 4*x + 1)] sage: for (i,j),v in entries: M[j,i]=v sage: M [ 0 0 2*x] [ 0 x + 3 0] [ x 0 0] [ 0 0 0] [3*x + 2 0 0] [ 0 3*x 4*x + 1] [ 0 x + 4 0] [ 0 0 0] [ 2*x 0 0] sage: type(M) <type 'sage.matrix.matrix_generic_dense.Matrix_generic_dense'> sage: M._echelon_in_place('classical') (0, 1, 2)
Hence, M._echelon_in_place_classical
behaves differently for Matrix_gfpn_dense and for generic implementation. That indeed needs to be fixed.
comment:27 Changed 3 years ago by
The generic implementation's documentation does not state what return value it gives. However, it apparently returns the pivots. So, I guess I shall modify the _echelon_in_place_classical method of Matrix_gfpn_dense.
comment:28 Changed 3 years ago by
 Commit changed from 4eae803913d67d3f46f58e56d3524c9b4f73b002 to e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d
Branch pushed to git repo; I updated commit sha1. New commits:
e2f0550  Specify that _echelon_in_place shall return the pivots

comment:29 followup: ↓ 39 Changed 3 years ago by
 Status changed from needs_work to needs_review
Done! Should work now, regardless whether meataxe is installed or not. Hopefully someone can review the last commit.
comment:30 Changed 3 years ago by
 sage t src/sage/rings/padics/padic_base_leaves.py # Timed out sage t src/sage/rings/padics/padic_lattice_element.py # Timed out sage t src/sage/tests/cmdline.py # 2 doctests failed sage t src/sage/symbolic/integration/integral.py # 1 doctest failed sage t src/sage/combinat/tutorial.py # 2 doctests failed sage t src/sage/groups/perm_gps/permgroup_named.py # 2 doctests failed sage t src/sage/combinat/words/paths.py # 1 doctest failed sage t src/sage/symbolic/integration/external.py # 3 doctests failed sage t src/sage/combinat/designs/bibd.py # 1 doctest failed sage t src/sage/databases/oeis.py # 34 doctests failed sage t src/sage/coding/databases.py # 2 doctests failed sage t src/doc/en/developer/coding_basics.rst # 1 doctest failed sage t src/sage/combinat/integer_lists/invlex.pyx # 1 doctest failed sage t src/sage/repl/load.py # 1 doctest failed sage t src/sage/databases/findstat.py # 8 doctests failed sage t src/sage/combinat/designs/ext_rep.py # 1 doctest failed sage t src/sage/finance/stock.py # 20 doctests failed sage t src/sage/structure/sage_object.pyx # 2 doctests failed  Total time for all tests: 4047.7 seconds cpu time: 8523.9 seconds cumulative wall time: 10809.1 seconds External software detected for doctesting: internet,latex Traceback (most recent call last): File "/home/king/Sage/git/sage/src/bin/sageruntests", line 127, in <module> err = DC.run() File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/control.py", line 1176, in run + ','.join(available_software.seen())) File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/control.py", line 583, in log self.logger.write(s + end) File "/home/king/Sage/git/sage/local/lib/python2.7/sitepackages/sage/doctest/control.py", line 250, in write f.write(x) ValueError: I/O operation on closed file Makefile:132: recipe for target 'ptestall' failed make: *** [ptestall] Error 1
Ouch.
comment:31 Changed 3 years ago by
But these failures seem all to be "optional: internet". Anyway, I got them with make ptestall
.
comment:32 Changed 3 years ago by
PS: I'll try again with a clean develop branch.
comment:33 followups: ↓ 34 ↓ 35 Changed 3 years ago by
Loads of errors, but at least the last few errors I saw are due to what was fixed in this ticket.
Frustrating.
comment:34 in reply to: ↑ 33 Changed 3 years ago by
Replying to SimonKing:
Loads of errors, but at least the last few errors I saw are due to what was fixed in this ticket.
Frustrating.
The "optional  internet" tests are failing, too.
comment:35 in reply to: ↑ 33 ; followup: ↓ 38 Changed 3 years ago by
Replying to SimonKing:
Frustrating.
By "frustrating" I mean the fact that a clean develop branch doesn't pass tests. So, how could one possibly do development without a proper base to start with.
comment:36 Changed 3 years ago by
The internet tests failing are in some sense recent: Sage is supposed to automatically test for an internet connection and then run those tests, but the function has_internet
was incorrectly returning False
until a recent fix (#25222). Now lots of tests are failing, but there are tickets for them (#25474, #25472, and some others). You could turn off your internet while running tests, or just specify testing those tests marked optional with the meataxe
tag.
comment:37 Changed 3 years ago by
You can also run the tests manually with sage b
(with optional=sage,meataxe
, but I think internet
gets excluded from those tests and the default is sufficient).
comment:38 in reply to: ↑ 35 Changed 3 years ago by
Replying to SimonKing:
So, how could one possibly do development without a proper base to start with.
Don't use make ptestall
if you don't want all external tests to be run. Instead, use make ptest
.
comment:39 in reply to: ↑ 29 Changed 3 years ago by
comment:40 Changed 3 years ago by
 Reviewers changed from Simon King to Simon King, Travis Scrimshaw
 Status changed from needs_review to positive_review
LGTM. Thank you.
comment:41 Changed 3 years ago by
 Branch changed from public/linear_algebras/fix_meataxe_echelon25476 to e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d
 Resolution set to fixed
 Status changed from positive_review to closed
Practical problem: I can currently not build and test 8.3.beta3, since I won't interrupt a computation that I am currently running (it is running since three weeks and might disprove a conjecture on the structure of cohomology rings).