Opened 2 years ago

Closed 2 years ago

#25476 closed defect (fixed)

MeatAxe-related bug introduced in 8.3.beta

Reported by: SimonKing Owned by:
Priority: blocker Milestone: sage-8.3
Component: packages: optional Keywords: meataxe rescale_row
Cc: vdelecroix, tscrim Merged in:
Authors: Travis Scrimshaw, Simon King Reviewers: Simon King, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: e2f0550 (Commits) Commit: e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d
Dependencies: Stopgaps:

Description

Vincent reports this for SageMath 8.3.beta3 with MeatAxe installed:

File "src/sage/coding/linear_code.py", line 2039, in
sage.coding.linear_code.AbstractLinearCode.__getitem__
Failed example:
     C = random_matrix(GF(25,'a'), 2, 7).row_space()
Exception raised:
     Traceback (most recent call last):
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/doctest/forker.py",
line 572, in _run
         self.compile_and_execute(example, compiler, test.globs)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/doctest/forker.py",
line 982, in compile_and_execute
         exec(compiled, globs)
       File "<doctest
sage.coding.linear_code.AbstractLinearCode.__getitem__[4]>", line 1, in
<module>
         C = random_matrix(GF(Integer(25),'a'), Integer(2),
Integer(7)).row_space()
       File "sage/matrix/matrix2.pyx", line 4592, in
sage.matrix.matrix2.Matrix.row_space
(build/cythonized/sage/matrix/matrix2.c:34495)
         return self.row_module(base_ring=base_ring)
       File "sage/matrix/matrix2.pyx", line 4565, in
sage.matrix.matrix2.Matrix.row_module
(build/cythonized/sage/matrix/matrix2.c:34373)
         return M.span(self.rows(), already_echelonized=False)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/modules/free_module.py",
line 3850, in span
         self.ambient_module(), gens=gens, check=check,
already_echelonized=already_echelonized)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/modules/free_module.py",
line 6961, in __init__
         echelonize=not already_echelonized,
already_echelonized=already_echelonized)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/modules/free_module.py",
line 6763, in __init__
         echelonized_basis=echelonized_basis,
already_echelonized=already_echelonized)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/modules/free_module.py",
line 5691, in __init__
         basis = self._echelonized_basis(ambient, basis)
       File
"/home/vdelecro/sage_patchbot/local/lib/python2.7/site-packages/sage/modules/free_module.py",
line 6894, in _echelonized_basis
         E = A.echelon_form()
       File "sage/matrix/matrix2.pyx", line 6819, in
sage.matrix.matrix2.Matrix.echelon_form
(build/cythonized/sage/matrix/matrix2.c:53810)
         v = E.echelonize(cutoff=cutoff, **kwds)
       File "sage/matrix/matrix2.pyx", line 6719, in
sage.matrix.matrix2.Matrix.echelonize
(build/cythonized/sage/matrix/matrix2.c:53210)
         self._echelon_in_place(algorithm)
       File "sage/matrix/matrix2.pyx", line 6984, in
sage.matrix.matrix2.Matrix._echelon_in_place
(build/cythonized/sage/matrix/matrix2.c:54847)
         A.rescale_row(r, a_inverse, c)
       File "sage/matrix/matrix0.pyx", line 3005, in
sage.matrix.matrix0.Matrix.rescale_row
(build/cythonized/sage/matrix/matrix0.c:21880)
         self.rescale_row_c(i, s, start_col)
       File "sage/matrix/matrix_gfpn_dense.pyx", line 944, in
sage.matrix.matrix_gfpn_dense.Matrix_gfpn_dense.rescale_row_c
(build/cythonized/sage/matrix/matrix_gfpn_dense.c:9339)
         raise ValueError("We can only rescale a full row of a non-empty
matrix")
     ValueError: We can only rescale a full row of a non-empty matrix
**********************************************************************

The above error does not occur in 8.2.rc4. Let's try to prevent it from being published in 8.3!

Change History (41)

comment:1 Changed 2 years ago by SimonKing

Practical problem: I can currently not build and test 8.3.beta3, since I won't interrupt a computation that I am currently running (it is running since three weeks and might disprove a conjecture on the structure of cohomology rings).

comment:2 Changed 2 years ago by tscrim

  • Cc tscrim added

This also occurs in testing matrix/matrix2.pyx.

comment:3 Changed 2 years ago by tscrim

  • Priority changed from critical to blocker

So this seems to be caused by #17272, but I do not understand why. It is the exact same code for the classical echelon algorithm. The only other difference is that has become a cpdef method instead of a def method, but changing that didn't seem to fix the problem. At least, where the error is happening, it means we would have just inverted a 0 if the row is all 0. So perhaps this is a bug in Cython?

Also, failing tests with optional packages is a blocker.

comment:4 Changed 2 years ago by tscrim

PS - I think the machine I tested #17272 on did not have meataxe installed (unlike my usual devbox), but I thought it did...(cannot check it for the next ~2 months).

comment:5 Changed 2 years ago by tscrim

I really have no idea what is going on. It does work before #17272, but fails after. Changing both the cpdef -> def doesn't fix it. The algorithm is exactly the same and even reverting that part of the code does not fix it.

comment:6 Changed 2 years ago by SimonKing

While trying to build the latest beta, cddlib fails. Very bad. So, I still cannot test stuff.

comment:7 Changed 2 years ago by SimonKing

Fortunately retrying make cddlib worked.

comment:8 Changed 2 years ago by SimonKing

Now I completed building the latest beta and can confirm the bug.

comment:9 Changed 2 years ago by SimonKing

PS: It is not the creation of the matrix but calling .row_space().

comment:10 Changed 2 years ago by SimonKing

OK, the problem is that the algorithm has been changed so that only a part of a row was rescaled. That's quite reasonable to do, and actually one of the changes of SharedMeatAxe compared to MeatAxe is that in the school book gaußian elimination implementation I made it so that the already annulated part of a row is not rescaled.

Only in the wrapper in Sage I did not expose that functionality. I guess that needs to change.

comment:11 Changed 2 years ago by tscrim

That part of Gaußian elimination was not changed by #17272.

Ah, I found the problem: the method was renamed _echelon_in_place_classical -> _echelon_in_place. So the generic matrix code is no longer calling the specialized version.

Last edited 2 years ago by tscrim (previous) (diff)

comment:12 Changed 2 years ago by tscrim

I will push a fix in 1 second.

comment:13 Changed 2 years ago by tscrim

  • Branch set to public/linear_algebras/fix_meataxe_echelon-25476
  • Commit set to 4eae803913d67d3f46f58e56d3524c9b4f73b002
  • Status changed from new to needs_review

Okay, so this fixes the problem. It has a bit of technical debt as it pushes a potentially bigger issue design issue down the road, but it works. I also do similar fixes for other matrices that may not be using their optimized code. A bigger design discussion on matrices is likely needed.


New commits:

4eae803Making matrices use the new _echelon_in_place method.

comment:14 Changed 2 years ago by SimonKing

So, you think it won't be needed to allow !Matrix_gfpn_dense to perform a partial row rescaling? Would it nonetheless be a good idea to implement?

Last edited 2 years ago by SimonKing (previous) (diff)

comment:15 follow-up: Changed 2 years ago by tscrim

I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 2 years ago by SimonKing

Replying to tscrim:

I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.

Then I propose to provide an enhancement on top of your changes. I just checked: With your branch the tests in src/sage/matrix pass. And then I hope we can review each other's commits.

comment:17 in reply to: ↑ 16 Changed 2 years ago by tscrim

Replying to SimonKing:

Replying to tscrim:

I definitely think it would be good to do partial row rescaling; it probably will give you a speed boost. Although that this actually indicated an error turned out to be a good thing indicating something I missed when reviewing #17272.

Then I propose to provide an enhancement on top of your changes. I just checked: With your branch the tests in src/sage/matrix pass. And then I hope we can review each other's commits.

In principle, I would fine with it. However, given this is a blocker issue, I think this would be better to do so on a follow-up (which I will happily review).

comment:18 follow-up: Changed 2 years ago by SimonKing

So far I got a timeout in sage -t src/sage/rings/padics/padic_lattice_element.py.

comment:19 in reply to: ↑ 18 Changed 2 years ago by tscrim

Replying to SimonKing:

So far I got a timeout in sage -t src/sage/rings/padics/padic_lattice_element.py.

That has been noticed by a few other people.

comment:20 Changed 2 years ago by SimonKing

sage -t src/sage/rings/padics/padic_lattice_element.py and sage -t src/sage/rings/padics/padic_base_leaves.py time out consistently, even when tested separately.

I cannot imagine that it is related with the issue we are dealing with here - hence, I'll try if these two tests fail without the patch from here.

comment:21 Changed 2 years ago by SimonKing

Yep, I get the same timeouts. So, I believe the timeouts are not related with this ticket and thus shouldn't prevent a review.

The code looks good. But shouldn't there be a test that demonstrates that the bug is fixed? Or are the existing tests good enough?

comment:22 Changed 2 years ago by jhpalmieri

Tests pass for me on OS X with this branch, on a system with MeatAxe installed. I haven't tried without MeatAxe present.

comment:23 follow-up: Changed 2 years ago by tscrim

I would say the existing tests are good enough since they were breaking before this change.

comment:24 in reply to: ↑ 23 Changed 2 years ago by SimonKing

  • Authors set to Travis Scrimshaw
  • Reviewers set to Simon King
  • Status changed from needs_review to positive_review

Replying to tscrim:

I would say the existing tests are good enough since they were breaking before this change.

In this case I can give a positive review.

comment:25 follow-up: Changed 2 years ago by vbraun

  • Status changed from positive_review to needs_work

See patchbot

comment:26 in reply to: ↑ 25 Changed 2 years ago by SimonKing

Replying to vbraun:

See patchbot

The patchbot complains that there is a non-ascii character. But I cannot identify it. So, what is it?

The patchbot complains about pyflakes. But the error is "NameError?: name 'isPythonFile' is not defined". Doesn't that mean there is a bug in the plugin, not in the changeset?

The patchbot complains about startup time. I don't know how to interpret the numbers, so I don't know how much of a problem we have here.

The test error is reproducible as follows:

sage: K.<x> = GF(25)
sage: M = MatrixSpace(K,9,3,implementation='generic')()
sage: entries = [((0, 2), x), ((0, 4), 3*x + 2), ((0, 8), 2*x), ((1, 1), x + 3), ((1, 5), 3*x), ((1, 6), x + 4), ((2, 0), 2*x), ((2, 5), 4*x + 1)]
sage: for (i,j),v in entries: M[j,i]=v
sage: M
[      0       0     2*x]
[      0   x + 3       0]
[      x       0       0]
[      0       0       0]
[3*x + 2       0       0]
[      0     3*x 4*x + 1]
[      0   x + 4       0]
[      0       0       0]
[    2*x       0       0]
sage: type(M)
<type 'sage.matrix.matrix_generic_dense.Matrix_generic_dense'>
sage: M._echelon_in_place('classical')
(0, 1, 2)

Hence, M._echelon_in_place_classical behaves differently for Matrix_gfpn_dense and for generic implementation. That indeed needs to be fixed.

comment:27 Changed 2 years ago by SimonKing

The generic implementation's documentation does not state what return value it gives. However, it apparently returns the pivots. So, I guess I shall modify the _echelon_in_place_classical method of Matrix_gfpn_dense.

comment:28 Changed 2 years ago by git

  • Commit changed from 4eae803913d67d3f46f58e56d3524c9b4f73b002 to e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d

Branch pushed to git repo; I updated commit sha1. New commits:

e2f0550Specify that _echelon_in_place shall return the pivots

comment:29 follow-up: Changed 2 years ago by SimonKing

  • Authors changed from Travis Scrimshaw to Travis Scrimshaw, Simon King
  • Status changed from needs_work to needs_review

Done! Should work now, regardless whether meataxe is installed or not. Hopefully someone can review the last commit.

comment:30 Changed 2 years ago by SimonKing

----------------------------------------------------------------------
sage -t src/sage/rings/padics/padic_base_leaves.py  # Timed out
sage -t src/sage/rings/padics/padic_lattice_element.py  # Timed out
sage -t src/sage/tests/cmdline.py  # 2 doctests failed
sage -t src/sage/symbolic/integration/integral.py  # 1 doctest failed
sage -t src/sage/combinat/tutorial.py  # 2 doctests failed
sage -t src/sage/groups/perm_gps/permgroup_named.py  # 2 doctests failed
sage -t src/sage/combinat/words/paths.py  # 1 doctest failed
sage -t src/sage/symbolic/integration/external.py  # 3 doctests failed
sage -t src/sage/combinat/designs/bibd.py  # 1 doctest failed
sage -t src/sage/databases/oeis.py  # 34 doctests failed
sage -t src/sage/coding/databases.py  # 2 doctests failed
sage -t src/doc/en/developer/coding_basics.rst  # 1 doctest failed
sage -t src/sage/combinat/integer_lists/invlex.pyx  # 1 doctest failed
sage -t src/sage/repl/load.py  # 1 doctest failed
sage -t src/sage/databases/findstat.py  # 8 doctests failed
sage -t src/sage/combinat/designs/ext_rep.py  # 1 doctest failed
sage -t src/sage/finance/stock.py  # 20 doctests failed
sage -t src/sage/structure/sage_object.pyx  # 2 doctests failed
----------------------------------------------------------------------
Total time for all tests: 4047.7 seconds
    cpu time: 8523.9 seconds
    cumulative wall time: 10809.1 seconds
External software detected for doctesting: internet,latex
Traceback (most recent call last):
  File "/home/king/Sage/git/sage/src/bin/sage-runtests", line 127, in <module>
    err = DC.run()
  File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/control.py", line 1176, in run
    + ','.join(available_software.seen()))
  File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/control.py", line 583, in log
    self.logger.write(s + end)
  File "/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/doctest/control.py", line 250, in write
    f.write(x)
ValueError: I/O operation on closed file
Makefile:132: recipe for target 'ptestall' failed
make: *** [ptestall] Error 1

Ouch.

comment:31 Changed 2 years ago by SimonKing

But these failures seem all to be "optional: internet". Anyway, I got them with make ptestall.

comment:32 Changed 2 years ago by SimonKing

PS: I'll try again with a clean develop branch.

comment:33 follow-ups: Changed 2 years ago by SimonKing

Loads of errors, but at least the last few errors I saw are due to what was fixed in this ticket.

Frustrating.

comment:34 in reply to: ↑ 33 Changed 2 years ago by SimonKing

Replying to SimonKing:

Loads of errors, but at least the last few errors I saw are due to what was fixed in this ticket.

Frustrating.

The "optional - internet" tests are failing, too.

comment:35 in reply to: ↑ 33 ; follow-up: Changed 2 years ago by SimonKing

Replying to SimonKing:

Frustrating.

By "frustrating" I mean the fact that a clean develop branch doesn't pass tests. So, how could one possibly do development without a proper base to start with.

comment:36 Changed 2 years ago by jhpalmieri

The internet tests failing are in some sense recent: Sage is supposed to automatically test for an internet connection and then run those tests, but the function has_internet was incorrectly returning False until a recent fix (#25222). Now lots of tests are failing, but there are tickets for them (#25474, #25472, and some others). You could turn off your internet while running tests, or just specify testing those tests marked optional with the meataxe tag.

comment:37 Changed 2 years ago by tscrim

You can also run the tests manually with sage -b (with --optional=sage,meataxe, but I think internet gets excluded from those tests and the default is sufficient).

comment:38 in reply to: ↑ 35 Changed 2 years ago by jdemeyer

Replying to SimonKing:

So, how could one possibly do development without a proper base to start with.

Don't use make ptestall if you don't want all external tests to be run. Instead, use make ptest.

comment:39 in reply to: ↑ 29 Changed 2 years ago by jdemeyer

Replying to SimonKing:

Hopefully someone can review the last commit.

Travis, can you do that? This is making it very hard to work on any matrix-related ticket (like #23719, #25079 and #25504)

comment:40 Changed 2 years ago by tscrim

  • Reviewers changed from Simon King to Simon King, Travis Scrimshaw
  • Status changed from needs_review to positive_review

LGTM. Thank you.

comment:41 Changed 2 years ago by vbraun

  • Branch changed from public/linear_algebras/fix_meataxe_echelon-25476 to e2f0550696a5dbc9e0cd595d1eda7d9b9048d06d
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.