Opened 8 years ago
Closed 8 years ago
#13878 closed defect (fixed)
Fix failing assertion in linbox/matrix/permutation-matrix.h
Reported by: | SimonKing | Owned by: | jason, was |
---|---|---|---|
Priority: | major | Milestone: | sage-5.7 |
Component: | linear algebra | Keywords: | debug, linbox, assertion |
Cc: | nbruin, vbraun, jpflori, malb | Merged in: | sage-5.7.beta4 |
Authors: | Nils Bruin | Reviewers: | Simon King |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
With the Sage debug version of #13864, one gets the following crash, as reported by Nils:
sage: sage.homology.chain_complex.HomologyGroup(100) ERROR (at getWritePointer in /home/simon/SAGE/debug/sage-5.6.beta1/local/include/linbox/matrix/permutation-matrix.h:175): Precondition not met:P_.size() terminate called after throwing an instance of 'LinBox::PreconditionFailed' Program received signal SIGABRT, Aborted. 0x00007ffff6d95d95 in raise () from /lib64/libc.so.6
APPLY
Attachments (2)
Change History (14)
comment:1 Changed 8 years ago by
comment:2 Changed 8 years ago by
It looks like trivial spans in large ZZ-modules are to blame:
invfac=[0]*75 n = len(invfac) A = ZZ**n B = A.span([A.gen(i) * invfac[i] for i in xrange(n)])
runs OK, but with invfac=[0]*76
we get the abort. With invfac=[1]+[0]*100
or invfac=[0]*100+[1]
we don't get an error. Perhaps this helps to find the problematic code path?
comment:3 Changed 8 years ago by
I have commented on sage-devel, this is totally a DNDEBUG problem. We got slammed by it in sage-on-gentoo when linbox was updated since our python doesn't add DNDEBUG either. See https://github.com/cschwan/sage-on-gentoo/issues/166.
comment:4 Changed 8 years ago by
Offending code is this:
cdef inline linbox_echelonize_efd(celement modulus, celement* entries, Py_ssize_t nrows, Py_ssize_t ncols): cdef ModField *F = new ModField(<long>modulus) cdef EchelonFormDomain *EF = new EchelonFormDomain(F[0]) cdef BlasMatrix *A = new BlasMatrix(F[0], <uint64_t>nrows, <uint64_t>ncols) cdef BlasMatrix *E = new BlasMatrix(F[0], <uint64_t>nrows, <uint64_t>ncols) cdef Py_ssize_t i,j # TODO: can we avoid this copy? for i in range(nrows): for j in range(ncols): A.setEntry(i, j, <ModFieldElement>entries[i*ncols+j]) cdef int r = EF.rowReducedEchelon(E[0], A[0]) for i in range(nrows): for j in range(ncols): entries[i*ncols+j] = <celement>E.getEntry(i,j) cdef Py_ssize_t ii = 0 cdef list pivots = [] for i in range(r): for j in range(ii,ncols): if entries[i*ncols+j] == 1: pivots.append(j) ii = j+1 break del F, A, E, EF return r, pivots
This gets called with nrows=0
, so A
does not get initialized with anything (there's noting to set!). I suppose linbox should handle such an edge case, but I haven't checked their doc to see if they exclude it as valid. Obviously, it's easy to special case this degeneracy: If nrows=0
then we should return 0,[]. It's probably worthwhile to fix this, but I'm pretty sure this is not a source for bugs otherwise.
comment:5 Changed 8 years ago by
preliminary patch attached. It does make the annoying SIGABRTs go away, so it allows testing for other issues. I'm not particularly promoting this patch for general use (it'll be a bit of a speed penalty and I'm not so sure it's required for proper functioning under normal conditions)
comment:6 Changed 8 years ago by
I think it would be interesting to be able to use linbox without having to add DNDEBUG, it is a good thing in my book. Your fix seem minimal to me. Since sage-on-gentoo do not compile with DNDEBUG will I be able to open bugs each time I get a signal from linbox if we go that route :)
comment:7 follow-up: ↓ 8 Changed 8 years ago by
Hooray! With you preliminary patch and MALLOC_CHECK_=3, I get
sage: sage.homology.chain_complex.HomologyGroup(100) Z^100
What is needed to do to finalise your preliminary patch? Adding the above example as a doc test would be half cheating, because this used only to fail in debug mode. But still, one could include this example and comment on it accordingly.
comment:8 in reply to: ↑ 7 Changed 8 years ago by
Replying to SimonKing:
What is needed to do to finalise your preliminary patch? Adding the above example as a doc test would be half cheating, because this used only to fail in debug mode. But still, one could include this example and comment on it accordingly.
At first blush it seems this is an upstream bug, so this would only be a workaround, not a proper fix. Furthermore, it seems this code seems to work in an NDEBUG linbox, so perhaps the test is just overzealous. I haven't checked linbox's documentation to see if they declare this input illegal.
Other than that, there already are doctests that test this issue, so a comment should be enough documentation for this fix.
Changed 8 years ago by
comment:9 Changed 8 years ago by
- Description modified (diff)
- Status changed from new to needs_review
comment:10 Changed 8 years ago by
- Reviewers set to Simon King
- Status changed from needs_review to positive_review
It might ultimately be an upstream bug. But I think fixing it on our end is fine. I suggest to get this in, so that Sage finally has a working debug version. Positive review. I documented the change in a comment, in my reviewer patch.
comment:11 Changed 8 years ago by
- Description modified (diff)
comment:12 Changed 8 years ago by
- Merged in set to sage-5.7.beta4
- Resolution set to fixed
- Status changed from positive_review to closed
The gdb backtrace is as follows:
Question: Is that an upstream bug or a bug in the Sage library?