Ticket #11413 (needs_review enhancement)

Opened 2 years ago

Last modified 14 months ago

Decoding general linear codes with Groebner bases

Reported by: Niels Owned by: T.b.a.
Priority: minor Milestone: sage-5.11
Component: coding theory Keywords: general decoding groebner basis
Cc: sbulygin, klee Work issues:
Report Upstream: N/A Reviewers: Burcin Erocal
Authors: Niels Duif Merged in:
Dependencies: Stopgaps:

Description (last modified by Niels) (diff)

I have implemented a decoding method for general linear codes in Sage. The method decodes up to half the true minimum distance using Groebner bases. This method was introduced by Bulygin and Pellikaan.

I was expecting the method to be faster than syndrome decoding, but it appears to be equally fast. It may be worth having this method around in Sage since Groebner basis computation may become faster. I have attached a report with my findings.

Apply:

Attachments

report.pdf Download (159.8 KB) - added by Niels 2 years ago.
Decoding with Groebner bases
decoding_GB.patch Download (4.9 KB) - added by Niels 2 years ago.
decoding_GB_2.patch Download (8.1 KB) - added by Niels 2 years ago.

Change History

Changed 2 years ago by Niels

Decoding with Groebner bases

comment:1 Changed 2 years ago by Niels

  • Status changed from new to needs_review

Here is my implementation. Please review!

Changed 2 years ago by Niels

comment:2 Changed 2 years ago by burcin

  • Cc sbulygin added; Stanislav.Bulygin@… removed
  • Reviewers set to Burcin Erocal
  • Status changed from needs_review to needs_work

Thanks for the patch Niels. A little more work is needed to get this merged. Here are a few observations based on reading your code without any familiarity with the algorithm:

  • You can't have a comment after the output of a doctest. As the patchbot also noticed (click on the yellow dot at the issue header), this test fails.
  • The function decode_BP() doesn't have any doctests. You can find more information about how to write tests in the developer manual:  http://sagemath.org/doc/developer/conventions.html#automated-testing
  • It would be good to include a reference to the paper in the docstring for decode().
  • I don't see the advantage of assigning single variable names to C.length(), C.dimension(). This just makes the code unreadable since you have to look up what these variables mean.
  • There must be a more efficient way to create the Reed-Solomon matrix. You could at least store each g^i and compute (g^i)^j in the inner loop.
  • which brings me to the question, did you profile this function at all? It is entirely possible that GB computation is not the bottleneck and more time is spent elsewhere.
  • the var() function creates symbolic variables. You should create the polynomial ring up front and use the generators of that for all the computations. This would eliminate conversion between symbolic objects and Singular polynomials and make arithmetic faster.
  • the first if statement in linear_code.py should be changed to if algorithm in ["syndrome", ...]. It looks like the second if statement is really an else. The error message should be changed to include all implemented algorithms.

comment:3 Changed 2 years ago by Niels

  • Status changed from needs_work to needs_review

Thanks for reviewing my patch, Bursin! Here is what I have done with your comments:

  • You can't have a comment after the output of a doctest. As the patchbot also noticed (click on the yellow dot at the issue header), this test fails.
    • Thanks, I fixed this.
  • The function decode_BP() doesn't have any doctests. You can find more information about how to write tests in the developer manual:  http://sagemath.org/doc/developer/conventions.html#automated-testing
    • Thanks, I included some doctests and tested the files decoder.py and linear_code.py. All tests now pass.
  • It would be good to include a reference to the paper in the docstring for decode().
    • Thanks, I included a reference to the paper. Can I expect the url of this ticket to persist? I can host the paper on my own website instead, so it will surely persist.
  • I don't see the advantage of assigning single variable names to C.length(), C.dimension(). This just makes the code unreadable since you have to look up what these variables mean.
    • Ok, I changed this. For me the short names are more readable, but I guess it depends on what you're used to.
  • There must be a more efficient way to create the Reed-Solomon matrix. You could at least store each g^i and compute (g^i)^j in the inner loop.
    • Thanks for pointing this out; the new version should be more efficient.
  • which brings me to the question, did you profile this function at all? It is entirely possible that GB computation is not the bottleneck and more time is spent elsewhere.
    • Well, sort of. I ran the Sage code in chunks, using different inputs. The GB computation usually takes multiple seconds for large dimension, while the rest of the code only takes a split second. I also tried %prun, but I don't understand the result. Apparently there are >2k calls to the method ___iter___ in linear_code.py, but I don't understand why. This is actually what takes the most time.
  • the var() function creates symbolic variables. You should create the polynomial ring up front and use the generators of that for all the computations. This would eliminate conversion between symbolic objects and Singular polynomials and make arithmetic faster.
    • I changed this. The point is that the ring R now has more variables than necessary. Some of the V_i will not appear in any equation, unless the code is an MDS code and the number of errors is maximal. I hope this does not slow down the GB computation.
  • the first if statement in linear_code.py should be changed to if algorithm in ["syndrome", ...]. It looks like the second if statement is really an else. The error message should be changed to include all implemented algorithms.
    • You're right; this looks ugly. The original code was written in this way, and I didn't bother changing it at first. It is fixed now.

The changes are in patch number 2. So please apply patch 1 first, then apply patch 2.

Changed 2 years ago by Niels

comment:4 Changed 2 years ago by Niels

  • Description modified (diff)

comment:5 Changed 14 months ago by klee

  • Cc klee added
Note: See TracTickets for help on using tickets.