#20705 closed enhancement (fixed)
Classes for Reed Muller Codes
Reported by:  Parthasarathi Panda  Owned by:  

Priority:  major  Milestone:  sage7.3 
Component:  coding theory  Keywords:  
Cc:  David Lucas, Johan Rosenkilde, Julien Lavauzelle  Merged in:  
Authors:  Parthasarathi Panda  Reviewers:  David Lucas, Johan Sebastian Rosenkilde Nielsen 
Report Upstream:  N/A  Work issues:  
Branch:  3cce07f (Commits, GitHub, GitLab)  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
This ticket proposes a implementation of Reed Muller Codes. It contains:
two new code classes, QAryReedMullerCode and BinaryReedMullerCode?, which implements the two classes of reed muller codes two encoder classes, ReedMullerVectorEncoder? and ReedMullerPolynomialEncoder? which are used by both the code classes some additional functions to assist in computations related to the polynomials.
NOTE: Both the classes are implemented separately since they would have different decoders
I used the following code snippets to test them,
#for q>2 code=ReedMullerCode(3, 2, 2) print code.dimension() E1=ReedMullerVectorEncoder(code) E2=ReedMullerPolynomialEncoder(code) R=PolynomialRing(code.base_field(),code.numberOfVariable,"x") x0=R.gen(0) x1=R.gen(1) c1=E1.encode(vector(GF(3),[1,1,1,1,1,1])) print c1 c2=E2.encode(1+x0+x1+x1^2+x1*x0) print c2 D=LinearCodeSyndromeDecoder(code) c=D.decode_to_code(vector(GF(3),[1, 2, 0, 0, 2, 0, 1, 1, 1])) print c print E2.unencode_nocheck(c) print D.decode_to_message(vector(GF(3),[1,2,1,0,0,2,1,2,2]))
The output of which was,
6 (1, 0, 1, 0, 0, 2, 1, 2, 2) (1, 2, 0, 0, 2, 1, 1, 1, 1) (1, 2, 0, 0, 2, 1, 1, 1, 1) x0*x1 + x1^2 + x0 + x1 + 1 (1, 1, 1, 1, 1, 1)
#for q=2 code=ReedMullerCode(2, 2, 4) print code.dimension() E1=ReedMullerVectorEncoder(code) E2=ReedMullerPolynomialEncoder(code) R=PolynomialRing(code.base_field(),code.numberOfVariable,"x") x0=R.gen(0) x1=R.gen(1) x2=R.gen(2) x3=R.gen(3) c1=E1.encode(vector(GF(2),[1,1,1,1,1,0,0,0,1,0,0])) print c1 c2=E2.encode(1+x0+x1+x2+x3*x2) print c2 D=LinearCodeSyndromeDecoder(code) c=D.decode_to_code(vector(GF(2),[1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0])) print c print E2.unencode_nocheck(c) print D.decode_to_message(vector(GF(2),[0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0]))
This gave the output as:
11 (1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0) (1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1) (1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1) x2*x3 + x0 + x1 + x2 + 1 (1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0)
Change History (56)
comment:1 Changed 7 years ago by
Branch:  → u/panda314/classes_for_reed_muller_codes 

comment:2 Changed 7 years ago by
Commit:  → 9fb9b697ceb1e21ee450809a3447695d46f075bb 

Description:  modified (diff) 
comment:3 Changed 7 years ago by
Commit:  9fb9b697ceb1e21ee450809a3447695d46f075bb → ba4de76e487f27cd572856735c9b3ff960f25664 

Branch pushed to git repo; I updated commit sha1. New commits:
ba4de76  adding documentation

comment:4 Changed 7 years ago by
Commit:  ba4de76e487f27cd572856735c9b3ff960f25664 → 262e32068c96371b3cc28b4d0c16f2eddecb661d 

Branch pushed to git repo; I updated commit sha1. New commits:
262e320  adding documentation for additional functions

comment:5 Changed 7 years ago by
Cc:  David Lucas added 

comment:6 Changed 7 years ago by
Commit:  262e32068c96371b3cc28b4d0c16f2eddecb661d → c5af6d27cb948eb1d41419622a0640b9fdf16377 

comment:7 Changed 7 years ago by
Commit:  c5af6d27cb948eb1d41419622a0640b9fdf16377 → ca93a4b3ce5ab53febff2cf554bab057ce61ac0f 

Branch pushed to git repo; I updated commit sha1. New commits:
ca93a4b  removing ReedMullerCode() of guava.py from the implementation

comment:8 Changed 7 years ago by
Commit:  ca93a4b3ce5ab53febff2cf554bab057ce61ac0f → 23ff289a86e2d4266512e631f50ee75c6338ccd8 

Branch pushed to git repo; I updated commit sha1. New commits:
23ff289  debugged and ran the doctests

comment:9 Changed 7 years ago by
Status:  new → needs_review 

Hello,
I started reading your code. Before giving actual comments on the code itself, here are some general comments:
 there is certain conventions to respect while writing Python code. You can find a summary of these conventions here.
 lines are supposed to be less than 80 characters long. As I think it's an old rule, I'm completely ok if some lines are around 100 characters, especially in the doctests block which might be hard to break. However, some lines in your code are just too long: e.g. l. 58 is 161 characters long, l.92 is 144 characters long...). A general tip: in Python, as long as you are in the same parenthesis block, you can jump to the next line without any special character. For instance, this:
poly=poly+z*multivariatePolynomialInterpolation([evaluation2[i][k] for i in range(nq)], numberOfVariable1, orderk, q, finiteField, _R)
can be written like this:
poly=poly+z*multivariatePolynomialInterpolation([evaluation2[i][k] for i in range(nq)], numberOfVariable1, orderk, q, finiteField, _R)
 on names: in python, methods and variables names are written like this
function_name
andvariable_name
(and notfunctionName
andvariableName
). Module names should follow the same convention (reed_muller_code.py
instead ofReedMullerCode.py
).
 You wrote the documentation for a few methods like this:
r""" DOC """ def my_func: code
but it should *always* be like this:
def my_func: r""" DOC """ code
 More of the same, this DOC block should always be formatted like this:
def my_func(i1, i2): r""" Short description of my_func return value (Should be something like "Returns ....") [optional] more details INPUT:  ``i1``  description of i1  ``i2``  description of i2 [optional] OUTPUT EXAMPLES:: sage: ... [optional] TESTS:: sage: ...
 you should always put one whitespace before and after operators (exception: precedence), e.g:
a = b + c
a = (b+c) * (bc)
 you used full names for variables everywhere but for
_R
. I think it would be better to rename itpolynomial_ring
or something like that.
I'll run tests this afternoon and probably come back with more comments on the code itself.
Don't worry about all my comments in formatting, it's your first big ticket for Sage, so it's perfectly normal to use wrong coding conventions. It takes some time to get used to it ;)
Best,
David
comment:10 followups: 11 13 Changed 7 years ago by
Status:  needs_review → needs_work 

Disclaimer: the following remarks are only related to the code in itsef. I did not run (yet) extensive tests on border cases and larger cases than the ones covered by doctests.
General remark on exceptions: don't need to write "Incorrect datatype ..." every time, as ValueError
will be printed on the screen and it means there is something wrong with the input.
Another remark on documentation: you have to add this module to src/doc/en/reference/coding/index.rst
to make it appear in the online documentation. For now, this module is never considered when compiling documentation  which means, even if you run sage docbuild reference/coding
, you won't be able to know if there are some bugs in your doc.
binomial_sum
I think it might be better to make it a private method (which can be done by calling it _binomial_sum).
You could have used already implemented methods, ending up with something like:
return sum(binomial(n, i) for i in range(k+1))
or for i in range(k+1): s += binomial(n, i)
BUT it's much slower than you implementation (up to a factor of 10 according to timeit
). So, we should just keep what you did :)
multivariate_polynomial_interpolation
I'd make it a private method as it's only used internally as a helper.
I would also change the name _R
to polynomial_ring
. It's probably something
you copied from my poor naming choice in grs.py... I changed it in #20744 in the meanwhile ;)
I'd also rename some variables: evaluation2
might be multipoint_evaluation_list
or something like that. I would also change the name nq
, but I don't have any idea here. I would also rename finiteField
as base_field
for consistency.
You should be careful with 0
s and 1
s: the 0
l. 87 is not the same as the one on line 89: the former is the zero of the base field while the latter is the zero of the multivariate polynomial ring where poly
lives.
I recommend using specific variables for these zeroes/ones, which you can define as follows: base_field_zero = base_field.zero()
.
 you need to add
::
in the end of your second example's descriptive sentence.
General remark on classes
 Most of your class parameters are public variables that one has to call directly (e.g.
C.numberOfVariable
) to access. I would change this: make all these variables private (prefix them by an underscore) and write getter methods.
One should call C.number_of_variables()
to the number of variables used.
QAry/BinaryReedMullerCode
 You could add a method to compute the minimum distance, as we know a closed formula to compute it, it's quite a shame to call the exhaustive search algortihm if one types
My_rm_code.minimum_distance()
.
If q
is 2, D = 2^(md)
, otherwise it's q^m  d*q^(m1)
, with m
the number of variables and d
the order.
QAryReedMullerCode
 I would add a
WARNING::
block which states that the order has to be less thanq
to be sure users know about it. Of course, it will be removed later on, but the more informative the better.
Vectorial encoder
 I noticed you compute the generator matrix at construction time. I strongly disagree with this behaviour: as the generator matrix computation can be heavy, I advise you to only compute it when the needed (i.e. in the method
generator_matrix
). The methodgenerator_matrix
should also be cached (use the decorator@cached_method
).
 On the generator matrix computation. I think you should use iterators when possible. Especially, the lines
baseFieldTuple=Tuples(baseField.list(),numberOfVariable) [...] for x in baseFieldTuple.list()
will kill you. For example, over GF(31)
, with only 4 variables, it takes more than 3 minutes to compute the list of all tuples on my laptop (16 GB RAM).
Using an iterator will not compute the full list of tuples, which saves time and memory for larger codes.
Same for exponents
: over GF(23), 4 variables and dimension 6, it takes circa 30 seconds to execute this on my laptop:
exponents=Subsets(range(numberOfVariable)*(q1), submultiset=True)[0:code.dimension()]
Best,
David
comment:11 Changed 7 years ago by
Thanks. I have started implementing some of the changes you have suggested. Regarding formatting, is there any format code or something that i can plug in a ide interface and automatically implement them? It will be tedious to go through the code and manually do them.
Regarding private functions. I had not added examples for them because doctest would not be able to execute them since i had not listed them anywhere. Will writing them as _<private functions> solve that? Replying to dlucas:
Disclaimer: the following remarks are only related to the code in itsef. I did not run (yet) extensive tests on border cases and larger cases than the ones covered by doctests.
General remark on exceptions: don't need to write "Incorrect datatype ..." every time, as
ValueError
will be printed on the screen and it means there is something wrong with the input.Another remark on documentation: you have to add this module to
src/doc/en/reference/coding/index.rst
to make it appear in the online documentation. For now, this module is never considered when compiling documentation  which means, even if you runsage docbuild reference/coding
, you won't be able to know if there are some bugs in your doc.binomial_sum
I think it might be better to make it a private method (which can be done by calling it _binomial_sum).
You could have used already implemented methods, ending up with something like:
return sum(binomial(n, i) for i in range(k+1))
orfor i in range(k+1): s += binomial(n, i)
BUT it's much slower than you implementation (up to a factor of 10 according to
timeit
). So, we should just keep what you did :)multivariate_polynomial_interpolation
I'd make it a private method as it's only used internally as a helper. I would also change the name
_R
topolynomial_ring
. It's probably something you copied from my poor naming choice in grs.py... I changed it in #20744 in the meanwhile ;)I'd also rename some variables:
evaluation2
might bemultipoint_evaluation_list
or something like that. I would also change the namenq
, but I don't have any idea here. I would also renamefiniteField
asbase_field
for consistency. You should be careful with0
s and1
s: the0
l. 87 is not the same as the one on line 89: the former is the zero of the base field while the latter is the zero of the multivariate polynomial ring wherepoly
lives. I recommend using specific variables for these zeroes/ones, which you can define as follows:base_field_zero = base_field.zero()
.
 you need to add
::
in the end of your second example's descriptive sentence.General remark on classes
 Most of your class parameters are public variables that one has to call directly (e.g.
C.numberOfVariable
) to access. I would change this: make all these variables private (prefix them by an underscore) and write getter methods.One should call
C.number_of_variables()
to the number of variables used.QAry/BinaryReedMullerCode
 You could add a method to compute the minimum distance, as we know a closed formula to compute it, it's quite a shame to call the exhaustive search algortihm if one types
My_rm_code.minimum_distance()
.If
q
is 2,D = 2^(md)
, otherwise it'sq^m  d*q^(m1)
, withm
the number of variables andd
the order. QAryReedMullerCode
 I would add a
WARNING::
block which states that the order has to be less thanq
to be sure users know about it. Of course, it will be removed later on, but the more informative the better.Vectorial encoder
 I noticed you compute the generator matrix at construction time. I strongly disagree with this behaviour: as the generator matrix computation can be heavy, I advise you to only compute it when the needed (i.e. in the method
generator_matrix
). The methodgenerator_matrix
should also be cached (use the decorator@cached_method
).
 On the generator matrix computation. I think you should use iterators when possible. Especially, the lines
baseFieldTuple=Tuples(baseField.list(),numberOfVariable) [...] for x in baseFieldTuple.list()will kill you. For example, over
GF(31)
, with only 4 variables, it takes more than 3 minutes to compute the list of all tuples on my laptop (16 GB RAM). Using an iterator will not compute the full list of tuples, which saves time and memory for larger codes.Same for
exponents
: over GF(23), 4 variables and dimension 6, it takes circa 30 seconds to execute this on my laptop:exponents=Subsets(range(numberOfVariable)*(q1), submultiset=True)[0:code.dimension()]Best,
David
comment:12 Changed 7 years ago by
Thanks. I have started implementing some of the changes you have suggested.
Cool!
Regarding formatting, is there any format code or something that i can plug in a ide interface and automatically implement them? It will be tedious to go through the code and manually do them.
Not that I know, I'm afraid. I'm not using any IDE but vim, so I can't be sure though.
But if it's just to change MyVariable
to my_variable
, any text editor has a search and replace feature which will do the job. Of course, you will have to feed it with the target name everytime...
Regarding private functions. I had not added examples for them because doctest would not be able to execute them since i had not listed them anywhere. Will writing them as _<private functions> solve that?
Actually, doctests will be executed. Making these methods private with _name
will make them disappear from the online doc, but the doctesting framework will run them as well. See _punctured_form
in grs.py
for an example.
Best,
David
comment:13 Changed 7 years ago by
Hey, so correct me if i am wrong, [x for x in base_field_tuple] is implicitly implemented using lists. Also check if
exponent=exponents.first() for i in range(dimension):
matrix_list.append([reduce(mul, [x[i] for i in exponent],1) for x in base_field_tuple]) exponent=exponents.next(exponent)
is thais what you meant. I believe the above implementation skips all the unecessary iterations through exponents and elements of the base_field_tuple
Replying to dlucas:
Disclaimer: the following remarks are only related to the code in itsef. I did not run (yet) extensive tests on border cases and larger cases than the ones covered by doctests.
General remark on exceptions: don't need to write "Incorrect datatype ..." every time, as
ValueError
will be printed on the screen and it means there is something wrong with the input.Another remark on documentation: you have to add this module to
src/doc/en/reference/coding/index.rst
to make it appear in the online documentation. For now, this module is never considered when compiling documentation  which means, even if you runsage docbuild reference/coding
, you won't be able to know if there are some bugs in your doc.binomial_sum
I think it might be better to make it a private method (which can be done by calling it _binomial_sum).
You could have used already implemented methods, ending up with something like:
return sum(binomial(n, i) for i in range(k+1))
orfor i in range(k+1): s += binomial(n, i)
BUT it's much slower than you implementation (up to a factor of 10 according to
timeit
). So, we should just keep what you did :)multivariate_polynomial_interpolation
I'd make it a private method as it's only used internally as a helper. I would also change the name
_R
topolynomial_ring
. It's probably something you copied from my poor naming choice in grs.py... I changed it in #20744 in the meanwhile ;)I'd also rename some variables:
evaluation2
might bemultipoint_evaluation_list
or something like that. I would also change the namenq
, but I don't have any idea here. I would also renamefiniteField
asbase_field
for consistency. You should be careful with0
s and1
s: the0
l. 87 is not the same as the one on line 89: the former is the zero of the base field while the latter is the zero of the multivariate polynomial ring wherepoly
lives. I recommend using specific variables for these zeroes/ones, which you can define as follows:base_field_zero = base_field.zero()
.
 you need to add
::
in the end of your second example's descriptive sentence.General remark on classes
 Most of your class parameters are public variables that one has to call directly (e.g.
C.numberOfVariable
) to access. I would change this: make all these variables private (prefix them by an underscore) and write getter methods.One should call
C.number_of_variables()
to the number of variables used.QAry/BinaryReedMullerCode
 You could add a method to compute the minimum distance, as we know a closed formula to compute it, it's quite a shame to call the exhaustive search algortihm if one types
My_rm_code.minimum_distance()
.If
q
is 2,D = 2^(md)
, otherwise it'sq^m  d*q^(m1)
, withm
the number of variables andd
the order. QAryReedMullerCode
 I would add a
WARNING::
block which states that the order has to be less thanq
to be sure users know about it. Of course, it will be removed later on, but the more informative the better.Vectorial encoder
 I noticed you compute the generator matrix at construction time. I strongly disagree with this behaviour: as the generator matrix computation can be heavy, I advise you to only compute it when the needed (i.e. in the method
generator_matrix
). The methodgenerator_matrix
should also be cached (use the decorator@cached_method
).
 On the generator matrix computation. I think you should use iterators when possible. Especially, the lines
baseFieldTuple=Tuples(baseField.list(),numberOfVariable) [... ...] for x in baseFieldTuple.list()will kill you. For example, over
GF(31)
, with only 4 variables, it takes more than 3 minutes to compute the list of all tuples on my laptop (16 GB RAM). Using an iterator will not compute the full list of tuples, which saves time and memory for larger codes.Same for
exponents
: over GF(23), 4 variables and dimension 6, it takes circa 30 seconds to execute this on my laptop:exponents=Subsets(range(numberOfVariable)*(q1), submultiset=True)[0:code.dimension()]Best,
David
comment:14 Changed 7 years ago by
Commit:  23ff289a86e2d4266512e631f50ee75c6338ccd8 → 5129c915c146098f2c9c03ce89ca992b32e86609 

Branch pushed to git repo; I updated commit sha1. New commits:
5129c91  reformatting the file according to standards and rewriting the code to optimise computation time

comment:15 Changed 7 years ago by
Hey, so correct me if i am wrong,
[x for x in base_field_tuple]
is implicitly implemented using iterator. If that's so i think the following implementation should get rid all the redundant iterations through the sets.
Yes, absolutely.
I'm reading your modified code, I'll come back with more comments later on.
comment:16 followup: 17 Changed 7 years ago by
Hello,
Some more comments:
 I noticed you add both
ReedMullerCode
,BinaryReedMullerCode
andQAryReedMullerCode
tocodes_catalog.py
. But with our design, onlyReedMullerCode
should be in this file, as the user should never access the classes themselves.
_multivariate_polynomial_interpolation
 You left a reference to
_R
in the docstring while it no longer exists  Is the base field (and the cardinality) really needed? I would rather remove them from the list of inputs and recover them from
evaluation
. This way, one has to provideevaluation
as a vector over a finite field, you get this finite field by doingF = evaluation.base_ring()
, you check thatF
is the base ring ofpolynomial_ring
and that's it. You can getq
with the appropriate call onF
.  As my previous comment implies, I would only allow
evaluation
to be a vector (and not a list). After all, it's a private method you only use to recover the original message as a polynomial from a codeword, and the latter is necessarily a vector, isn't it?  l. 105: is there a way to avoid a call to
base_field.list()
? I know that in most cases, this list will be rather small, but it's not really efficient to build a full list (especially in a recursive pattern, because you will build it a lot of times...). This time again, I'd suggest an iterator over the finite field.
ReedMullerCode
 the
WARNING
block should be added in the module documentation as well. The syntax on this block is the following.. WARNING:: body of the block
Seelinear_code.py
, line 682 for an example. I would also change the message. I proposeFor now, this implementation only supports ReedMuller codes whose order is less than q
. Such a message implies it will be extended later, I think it's more positive this way ;)
Qary and Binary classes
 I think it's not necessary to save
q
as a class parameter. The user can access it by typingC.base_field().cardinality()
, it's not something specific to the class (or something hard to "compute"), so it should not be saved.
 I prefer full names for useraccessed methods. I'm okay with
num_var
as an internal variable, but its getter should be callednumber_of_variables
instead.
Vectorial encoder
 In
generator_matrix
, you're callingself.code()
a lot of times. I think you should store it in a local variable which will reduce the amount of calls to this method.
Best,
David
comment:17 followup: 19 Changed 7 years ago by
Replying to dlucas: Hi,
So should we keep BinaryReedMullerCode? in codes_catalog.py in keeping with the BinaryReedMullerCode?() function from guava.py? Or should i just keep the guava.py implementation with a warning regarding it's depreciation?
Hello,
Some more comments:
 I noticed you add both
ReedMullerCode
,BinaryReedMullerCode
andQAryReedMullerCode
tocodes_catalog.py
. But with our design, onlyReedMullerCode
should be in this file, as the user should never access the classes themselves.
_multivariate_polynomial_interpolation
 You left a reference to
_R
in the docstring while it no longer exists Is the base field (and the cardinality) really needed? I would rather remove them from the list of inputs and recover them from
evaluation
. This way, one has to provideevaluation
as a vector over a finite field, you get this finite field by doingF = evaluation.base_ring()
, you check thatF
is the base ring ofpolynomial_ring
and that's it. You can getq
with the appropriate call onF
. As my previous comment implies, I would only allow
evaluation
to be a vector (and not a list). After all, it's a private method you only use to recover the original message as a polynomial from a codeword, and the latter is necessarily a vector, isn't it? l. 105: is there a way to avoid a call to
base_field.list()
? I know that in most cases, this list will be rather small, but it's not really efficient to build a full list (especially in a recursive pattern, because you will build it a lot of times...). This time again, I'd suggest an iterator over the finite field.
ReedMullerCode
 the
WARNING
block should be added in the module documentation as well. The syntax on this block is the following.. WARNING:: body of the blockSee
linear_code.py
, line 682 for an example. I would also change the message. I proposeFor now, this implementation only supports ReedMuller codes whose order is less than q
. Such a message implies it will be extended later, I think it's more positive this way ;)Qary and Binary classes
 I think it's not necessary to save
q
as a class parameter. The user can access it by typingC.base_field().cardinality()
, it's not something specific to the class (or something hard to "compute"), so it should not be saved.
 I prefer full names for useraccessed methods. I'm okay with
num_var
as an internal variable, but its getter should be callednumber_of_variables
instead.Vectorial encoder
 In
generator_matrix
, you're callingself.code()
a lot of times. I think you should store it in a local variable which will reduce the amount of calls to this method.Best,
David
comment:18 Changed 7 years ago by
So should we keep BinaryReedMullerCode?? in codes_catalog.py in keeping with the BinaryReedMullerCode??() function from guava.py? Or should i just keep the guava.py implementation with a warning regarding it's depreciation?
Oh, good point! Let's just keep BinaryReedMullerCode
in codes_catalog
for now, and we'll deal with it afterwards.
comment:19 Changed 7 years ago by
Replying to panda314:
Hey so regarding _multivariate_polynomial_interpolation
so when i recursively use the function inside itself i actually pass a list and not a vector. Anycase since it's a private function does it matter?
comment:20 Changed 7 years ago by
Well, I'm just thinking about the future: maybe we can export that later on in another part of Sage. But I guess the vector/list behaviour is ok for now, we can change it in the future if needed.
comment:21 Changed 7 years ago by
Commit:  5129c915c146098f2c9c03ce89ca992b32e86609 → cd09b8a4b1b36e956077e99fc9d67003e998df40 

comment:22 Changed 7 years ago by
Commit:  cd09b8a4b1b36e956077e99fc9d67003e998df40 → ad16c8f8dc11597d41fb4d933d4a02051bf29c01 

Branch pushed to git repo; I updated commit sha1. New commits:
ad16c8f  corecting a misplaced warning message

comment:23 followup: 25 Changed 6 years ago by
Hello,
Some other remarks (the last ones from me):
 in the input sanitization, you test integer parameters using this statement:
if not isinstance(param, Integer)
. Actually, as in Sage you can have both Sage Integers and Python ints, you need to writeif not isinstance(param, (int, Integer))
otherwise if the user passes a Python int forparam
it will be rejected by the check.
 in your
minimum_distance
method forQAryReedMullerCode
, you should use integer division//
instead of/
. I agree the result will always be an integer because of the parameters, but using/
returns a Rational while//
returns an Integer, and I think it's better to return an Integer.
 method
generator_matrix
does not work if order is 0:
sage: sage: F = GF(3) sage: C = codes.ReedMullerCode(F, 0, 2) sage: sage: E = codes.encoders.ReedMullerVectorEncoder(C) sage: sage: E.generator_matrix() BOOM (StopIteration)
In case of order 0, just return a matrix which has only one line of 1
(it is a repetition code in that case).
 Still on generator matrix, you're using
exponents.first()
andexponents.next()
. Why not using a "true" iterator instead? As you take the elements inSubsets
one by one following the order they've been sorted, an Iterator has a better complexity in that case, becausenext(obj)
's complexity isO(r)
wherer
is the rank ofobj
.
 Also,
generator_matrix
method is quite slow: with a RM code overGF(16)
, order 14, 3 variables, it takes around 30s to compute its generator matrix (while with a GRS code over GF(4096), dimension 680, it takes around 3 seconds). It's just a remark so we keep it in mind, and I do not request any changes for this ticket. I'm completely fine with your implementation as is for a first ticket, and we can work on efficiency later on if we want to!
 Same with
encode
inPolynomialEncoder
: some algorithms for fast multipoint evaluation exist, and we can use them to speed up this encoding. But once again, your implementation is definitely good enough for a first ticket!
Otherwise, it sounds good. It's quite an impressive work for your first ticket, well done!
You will be the first one to have his implementation of an "advanced" linear code family in Sage. I'm jealous ;)
David
comment:24 Changed 6 years ago by
Commit:  ad16c8f8dc11597d41fb4d933d4a02051bf29c01 → 260d8da2f65c87ecf0f3bb703608148684823d37 

Branch pushed to git repo; I updated commit sha1. New commits:
260d8da  using iterators to loop through

comment:25 Changed 6 years ago by
Replying to dlucas: the error with order 0 was because '.next()' function was not computing in the last element of the subsets (a problem if the dimension becomes equal to the total number of possible exponents) anyways replaced with iterator so that problem is gone. I was under the impression that .next() works like iterator. Anyways i used the python's iterator hopefully it will speed it up. Speed does seem to be a big issue with this implementation. I am unable to lay a finger on the exact problem causing that.
Well i found reed solomon pretty advanced :D . Feels good to see 'Copyright (C) 2016 Parthasarathi Panda <parthasarathipanda314@…>' :) . Thanks for all the help. Next stop decoding
Hello,
Some other remarks (the last ones from me):
 in the input sanitization, you test integer parameters using this statement:
if not isinstance(param, Integer)
. Actually, as in Sage you can have both Sage Integers and Python ints, you need to writeif not isinstance(param, (int, Integer))
otherwise if the user passes a Python int forparam
it will be rejected by the check.
 in your
minimum_distance
method forQAryReedMullerCode
, you should use integer division//
instead of/
. I agree the result will always be an integer because of the parameters, but using/
returns a Rational while//
returns an Integer, and I think it's better to return an Integer.
 method
generator_matrix
does not work if order is 0:sage: sage: F = GF(3) sage: C = codes.[wiki:ReedMullerCode](F, 0, 2) sage: sage: E = codes.encoders.[wiki:ReedMullerVectorEncoder](C) sage: sage: E.generator_matrix() BOOM (StopIteration)In case of order 0, just return a matrix which has only one line of
1
(it is a repetition code in that case).
 Still on generator matrix, you're using
exponents.first()
andexponents.next()
. Why not using a "true" iterator instead? As you take the elements inSubsets
one by one following the order they've been sorted, an Iterator has a better complexity in that case, becausenext(obj)
's complexity isO(r)
wherer
is the rank ofobj
.
 Also,
generator_matrix
method is quite slow: with a RM code overGF(16)
, order 14, 3 variables, it takes around 30s to compute its generator matrix (while with a GRS code over GF(4096), dimension 680, it takes around 3 seconds). It's just a remark so we keep it in mind, and I do not request any changes for this ticket. I'm completely fine with your implementation as is for a first ticket, and we can work on efficiency later on if we want to!
 Same with
encode
inPolynomialEncoder
: some algorithms for fast multipoint evaluation exist, and we can use them to speed up this encoding. But once again, your implementation is definitely good enough for a first ticket!Otherwise, it sounds good. It's quite an impressive work for your first ticket, well done! You will be the first one to have his implementation of an "advanced" linear code family in Sage. I'm jealous
;)
David
comment:26 Changed 6 years ago by
Branch:  u/panda314/classes_for_reed_muller_codes → u/dlucas/classes_for_reed_muller_codes 

comment:27 followup: 28 Changed 6 years ago by
Cc:  Johan Rosenkilde Julien Lavauzelle added 

Commit:  260d8da2f65c87ecf0f3bb703608148684823d37 → 13cda90026b28644ed1c018d7628ea1e007cd4ef 
Reviewers:  → David Lucas 
Status:  needs_work → needs_review 
Hello,
I made some minor changes to the documentation:
 There was some syntax error which prevented it to compile properly, which I fixed
 I removed duplicated warning blocks
 I changed some line breaks
 I removed trailing whitespaces in the process
 I simplified a few methods' docstring.
I did not change the code at all, just made some changes to the doc because it was faster to do it myself :)
.
If you agree with my changes, we can give this ticket a positive review!
Don't forget to write your full name in the Authors
field in this ticket.
Best,
David
New commits:
77b533d  Updated to 7.3beta3

13cda90  Some tweaks (mostly documentation)

comment:28 Changed 6 years ago by
Replying to dlucas: Seems fine :)
Hello,
I made some minor changes to the documentation:
 There was some syntax error which prevented it to compile properly, which I fixed
 I removed duplicated warning blocks
 I changed some line breaks
 I removed trailing whitespaces in the process
 I simplified a few methods' docstring.
I did not change the code at all, just made some changes to the doc because it was faster to do it myself
:)
.
If you agree with my changes, we can give this ticket a positive review! Don't forget to write your full name in the
Authors
field in this ticket.
Best,
David  New commits:
77b533d Updated to 7.3beta3
13cda90 Some tweaks (mostly documentation)
comment:29 Changed 6 years ago by
Authors:  → Parthasarathi Panda 

comment:31 Changed 6 years ago by
I know I'm late to the scene, but I have a few questions/comments:
 Why are lines 249255 in reed_muller_code.py formatted so badly? Similarly for line 7075. This type of formatting is not PEP8, I believe.
 The patch in
linear_code.py
addingInteger
aroundnk
shouldn't be necessary:n
isC.length()
andk
isC.dimension()
. Both should *always* returnInteger
. You should find the example that led you to make this patch and instead fix the root cause that made eithern
ork
rational. Incidentally, I think I know where the bug is: your_binomial_sum
returns a rational since you have a division that you don't cast toInteger
. Amusingly, that cast seems to speed up_binomial_sum
by a factor 3 or so.  Could you please improve the docstring of
order()
with a second sentence explaining what the order is? The current docstring is the type of infuriatingly unhelpful docstrings that are way too common ;)  Same for
number_of_variables
andminimum_distance
. Yes, I am aware that there are code classes with similarly unhelpful docstrings, but let's try to improve Sage :)  For
__repr__
etc., perhaps it be more consistent with ReedSolomon codes to write something like "ReedMuller of order X in Y variables overself.base_field()
".
comment:32 followup: 38 Changed 6 years ago by
 In
__eq__
you're checking equality of field cardinalities  just check equality of the fields.  In docstrings, use "ReedMuller" consistently (and not "reed muller" or other things).
 In docstring warning of
ReedMullerCode
, the first sentence seems to be redundant (and not strictly true, since ReedMuller codes make sense with order up to (d1)*q). The same goes for the warning inQaryReedMullerCode
. Also, all caps is ALWAYS ugly and impolite ;)  Can you please improve the documentation in
ReedMullerCode
to explain what aReedMullerCode
is? Also, the first paragraph of a docstring may only contain one sentence. Incidentally, inReedMullerCode
, I suggest removing the two other sentences in the first paragraph since that is an implementation detail and not relevant to the user.  The docstring of
ReedMullerCode
,num_of_var
: the(i.e. m)
doesn't make sense to the user. Perhaps it will when you add the description of what aReedMullerCode
is.  Perhaps add in the docstring of
QaryReedMullerCode
andBinaryReedMullerCode
a reference toReedMullerCode
saying that this should usually be used, and also to see that function for description of the codes.  There is some documentation missing about the order of the points used in
ReedMullerCode
,_multivariate_polynomial_interpolation
, etc.
comment:33 followup: 37 Changed 6 years ago by
 Could you please add to the doctest of
_multivariate_polynomial_interpolation
that the returned polynomial interpolates the input values?  Could you please add to the example blocks of
ReedMullerCode
something that prints the length and dimension and mindist of the code? That is good for paedagogy.  What happens in
_multivariate_polynomial_interpolation
if one inputs evaluations which cannot be interpolated by a polynomial of total degreeorder
? Document.  Consider using more mathematical docstring description rather than programminglike. See for instance
_multivariate_polynomial_interpolation
, where the first sentence could instead read something like `Return $f \in \GF(q)[X_1,...,X_m]$ such that $f(\mathbf a) = v[i(\mathbf a)]$ for all $\mathbf a \in \GF(q)^{m$, where $v \in GF(q)}{q^{m}$ is a given vector of evaluations, and $i(a)$ is a specific ordering of $GF(q)}m$ (see below for details)."  The input
num_of_var
to_multivariate_polynomial_interpolation
is redundant since you get the polynomial ring. Uselen(polynomial_ring.gens())
or something.  In
_multivariate_polynomial_interpolation
you misspelled "coordinate".  Instead of
Tuples(F.list(), m)
you should probably use(F^m).list()
.  Be careful with microoptimisations that end up not having a real impact but makes the code denser to read. An example is the use of
z
andx
in_multivariate_polynomial_interpolation
. In line 117 you could also just writepolyVector += [base_field.zero()]*(dlen(polyVector))
.  The doc of
_binomial_sum
sounds like it sums *up to and includingk
*. But it is only up to and includingk1
.
comment:34 followup: 35 Changed 6 years ago by
 In
ReedMullerVectorEncoder.generator_matrix
, you should iterate directly overexponents
:for exponent in exponents: ...
.  In
ReedMullerVectorEncoder.generator_matrix
and other places, perhapsbase_field_tuple
should simply be calledpoints
.  There should be a method to get the list of points used for evaluation.
 I don't think
ReedMullerVectorEncoder.generator_matrix
is correct: you are indexing in the pointx
usingexponent
. But you *should* be computingprod_{i=1}^m x[i]^exponent[i]
. Usingreduce(mul,...)
in this instance is less clear than usingprod(...)
.  In doc for
ReedMullerPolynomialEncoder
it says "polynomial field", but it should be "polynomial ring". It should say thatpolynomial_ring
is optional.  Copypaste error: it says something about
If code is not GRS code ...
.  The polynomial in the example of
ReedMullerPolynomialEncoder.encode
is badly formatted.  The failing example of
ReedMullerPolynomialEncoder.encode
with poly of too high degree could be with a poly where each variable has OK degree, but the total degree is bad, e.g.x0^2*x1
.
Otherwise, good job! Don't worry about the volume of my comments. David has fond memories of tickets where I similarly put him through the wringer ;)
Best, Johan
comment:35 followup: 36 Changed 6 years ago by
Replying to jsrn: No problem with comments :) . Will work on them. Some doubts/explanation.
 Well i guess they are so because of the length of line restrictions and the automated formatter did everything 'strictly'. 249255 is indeed ugly will change that. 7075 seems fine though.
 Oh. I guess i will remove the linear_code.py path and use '' instead of '/' in binomial sum. Might be faster too because of integer divisions instead of rational numbers.
 Isn't it sum upto 'k' that we need?
 Directly iterating involves enumeration over the subset redundantly if i am not mistaken. Used the iterator to do the generation in one scan.
 So given the term prod^m_{i=1} x^(d_i)_i, exponent is the set {{1}*d_1 {2}*d_2 ...{m}*d_m}. In which case i believe that the generator matrix is correct (it matches with polynomial evaluation). Will use the prod function instead.
Well everything else makes sense. Will implement them :)
 In
ReedMullerVectorEncoder.generator_matrix
, you should iterate directly overexponents
:for exponent in exponents: ...
. In
ReedMullerVectorEncoder.generator_matrix
and other places, perhapsbase_field_tuple
should simply be calledpoints
. There should be a method to get the list of points used for evaluation.
 I don't think
ReedMullerVectorEncoder.generator_matrix
is correct: you are indexing in the pointx
usingexponent
. But you *should* be computingprod_{i=1}^m x[i]^exponent[i]
. Usingreduce(mul,...)
in this instance is less clear than usingprod(...)
. In doc for
ReedMullerPolynomialEncoder
it says "polynomial field", but it should be "polynomial ring". It should say thatpolynomial_ring
is optional. Copypaste error: it says something about
If code is not GRS code ...
. The polynomial in the example of
ReedMullerPolynomialEncoder.encode
is badly formatted. The failing example of
ReedMullerPolynomialEncoder.encode
with poly of too high degree could be with a poly where each variable has OK degree, but the total degree is bad, e.g.x0^2*x1
.Otherwise, good job! Don't worry about the volume of my comments. David has fond memories of tickets where I similarly put him through the wringer ;)
Best, Johan
comment:36 Changed 6 years ago by
 Well i guess they are so because of the length of line restrictions and the automated formatter did everything 'strictly'. 249255 is indeed ugly will change that. 7075 seems fine though.
I won't be a stickler, so up to you. I looked up PEP8 and it doesn't seem to advocate putting parameters on individual lines.
 Oh. I guess i will remove the linear_code.py path and use '' instead of '/' in binomial sum. Might be faster too because of integer divisions instead of rational numbers.
Yes.
 Isn't it sum upto 'k' that we need?
Yes it is. Just clarify in the docstring ("up to k" is ambiguous in English).
 Directly iterating involves enumeration over the subset redundantly if i am not mistaken. Used the iterator to do the generation in one scan.
Doing for e in some_iterable
does exactly the same as looping with an iterator, memorywise. What you shouldn't do is for e in some_iterable.list()
.
 So given the term prod^m_{i=1} x^(d_i)_i, exponent is the set {{1}*d_1 {2}*d_2 ...{m}*d_m}. In which case i believe that the generator matrix is correct (it matches with polynomial evaluation). Will use the prod function instead.
Ah, yes now I see!
You should probably use exponents = Subsets(..., k=order)
. Your current code
depends in a fragile, undocumented manner on the order that Subsets
iterates
over its elements. When using k=order
you get only subsets of size exactly
k
, but you could get around this by adding a dummy element order
times to
the list.
Well everything else makes sense. Will implement them :)
Cool!
comment:37 followup: 40 Changed 6 years ago by
Replying to jsrn: Regarding '17.' Is there any way one can pass a sub ring of a multivariate polynomial ring consisting of polynomials over a subset of the variables? Like F[x_1,x_2,x_3] is to F[x_1,x_2,x_3,x_4]? The num_of_var parameter used in the function sort of does that.
 Could you please add to the doctest of
_multivariate_polynomial_interpolation
that the returned polynomial interpolates the input values? Could you please add to the example blocks of
ReedMullerCode
something that prints the length and dimension and mindist of the code? That is good for paedagogy. What happens in
_multivariate_polynomial_interpolation
if one inputs evaluations which cannot be interpolated by a polynomial of total degreeorder
? Document. Consider using more mathematical docstring description rather than programminglike. See for instance
_multivariate_polynomial_interpolation
, where the first sentence could instead read something like `Return $f \in \GF(q)[X_1,...,X_m]$ such that $f(\mathbf a) = v[i(\mathbf a)]$ for all $\mathbf a \in \GF(q)^{m$, where $v \in GF(q)}{q^{m}$ is a given vector of evaluations, and $i(a)$ is a specific ordering of $GF(q)}m$ (see below for details)." The input
num_of_var
to_multivariate_polynomial_interpolation
is redundant since you get the polynomial ring. Uselen(polynomial_ring.gens())
or something. In
_multivariate_polynomial_interpolation
you misspelled "coordinate". Instead of
Tuples(F.list(), m)
you should probably use(F^m).list()
. Be careful with microoptimisations that end up not having a real impact but makes the code denser to read. An example is the use of
z
andx
in_multivariate_polynomial_interpolation
. In line 117 you could also just writepolyVector += [base_field.zero()]*(dlen(polyVector))
. The doc of
_binomial_sum
sounds like it sums *up to and includingk
*. But it is only up to and includingk1
.
comment:38 Changed 6 years ago by
Replying to jsrn: In '11.' is there any way to hyperlink the reference to ReedMullerCode? in the reference of QAryReedMullerCode and BinaryReedMullerCode?? Or shall i just write somehing like 'refer to documentation of' ?
 In
__eq__
you're checking equality of field cardinalities  just check equality of the fields. In docstrings, use "ReedMuller" consistently (and not "reed muller" or other things).
 In docstring warning of
ReedMullerCode
, the first sentence seems to be redundant (and not strictly true, since ReedMuller codes make sense with order up to (d1)*q). The same goes for the warning inQaryReedMullerCode
. Also, all caps is ALWAYS ugly and impolite ;) Can you please improve the documentation in
ReedMullerCode
to explain what aReedMullerCode
is? Also, the first paragraph of a docstring may only contain one sentence. Incidentally, inReedMullerCode
, I suggest removing the two other sentences in the first paragraph since that is an implementation detail and not relevant to the user. The docstring of
ReedMullerCode
,num_of_var
: the(i.e. m)
doesn't make sense to the user. Perhaps it will when you add the description of what aReedMullerCode
is. Perhaps add in the docstring of
QaryReedMullerCode
andBinaryReedMullerCode
a reference toReedMullerCode
saying that this should usually be used, and also to see that function for description of the codes. There is some documentation missing about the order of the points used in
ReedMullerCode
,_multivariate_polynomial_interpolation
, etc.
comment:39 Changed 6 years ago by
In '11.' is there any way to hyperlink the reference to ReedMullerCode?? in the reference of QAryReedMullerCode and BinaryReedMullerCode??? Or shall i just write somehing like 'refer to documentation of' ?
You can write: :class:<backquote>BinaryReedMullerCode<backquote>
, it will create an hyperlink to BinaryReedMullerCode
.
If you want to refer a class in another file in coding, you can use:
:class:<backquote>sage.coding.file.ClassName<backquote>
And to hyperlink a method, use :meth:
instead of :class:
.
If you don't want to check manually if the links work, you can
use the command ./sage docbuild warnlinks reference/coding
which will rebuild the documentation and tell you if some links are broken.
comment:40 Changed 6 years ago by
Replying to panda314:
Regarding '17.' Is there any way one can pass a sub ring of a multivariate polynomial ring consisting of polynomials over a subset of the variables? Like F[x_1,x_2,x_3] is to F[x_1,x_2,x_3,x_4]? The num_of_var parameter used in the function sort of does that.
Ah yes, ok that's a good point. You *can* make such a subring but perhaps that could get inefficient. Instead, you can restructure _multivariate_polynomial_interpolation
with a local function:
def _multivariate_polynomial_interpolation(evaluations, order, polynomial_ring): def _interpolate(evaluations, order, num_of_var): ... for k in range(d): # computing the polynomial poly = poly + z * _interpolate([multipoint_evaluation_list[i][k] for i in range(n_by_q)], order  k, num_of_var  1) z = z * x return poly return _interpolate(evaluations, order, len(polynomial_ring.gens()))
comment:41 Changed 6 years ago by
Branch:  u/dlucas/classes_for_reed_muller_codes → u/panda314/classes_for_reed_muller_codes 

comment:42 followup: 43 Changed 6 years ago by
Commit:  13cda90026b28644ed1c018d7628ea1e007cd4ef → 5ec9c6c0a2fd6bdb3978a53d81b440efef41614a 

Hi, So have the changes been reviewed? Is the code fine?
New commits:
5ec9c6c  rewriting doumentation, and rehformatting some part of the code

comment:43 Changed 6 years ago by
Reviewers:  David Lucas → David Lucas, Johan S. R. Nielsen 

So have the changes been reviewed? Is the code fine?
Please be patient :) And for the record: when you push some commits to reply to a reviewer's concerns, please comment on your commit. At the very least, write something like "I fixed all the things you asked for. Ready for review again" or something.
I probably won't have time to go over it today. If David has time and is happy with your modifications to my comments, I'm sure that's fine.
comment:44 Changed 6 years ago by
Branch:  u/panda314/classes_for_reed_muller_codes → u/dlucas/classes_for_reed_muller_codes 

comment:45 followup: 46 Changed 6 years ago by
Commit:  5ec9c6c0a2fd6bdb3978a53d81b440efef41614a → e3f81ec532cff5a5553f8abfddfc76c0a433c0e0 

Hello,
It seems that my comment did not appear  sorry about that.
I fixed some small doc issues, rewrote a few docstrings, and removed a duplicated WARNING
block.
If you're fine with my changes, you can give a positive review, everything is fine on my side.
Once again, well done with these codes :)
David
New commits:
0f1f58e  Merge branch 'u/panda314/classes_for_reed_muller_codes' of git://trac.sagemath.org/sage into classes_for_reed_muller_codes

e3f81ec  Fixed errors in documentation, rewrote some sentences, changed formatting, made some extra changes according to the other reviewer's comments

comment:46 Changed 6 years ago by
Replying to dlucas: tests are running fine. Seems cool :)
Hello,
It seems that my comment did not appear  sorry about that.
I fixed some small doc issues, rewrote a few docstrings, and removed a duplicated
WARNING
block.If you're fine with my changes, you can give a positive review, everything is fine on my side.
Once again, well done with these codes :)
David
New commits:
0f1f58e Merge branch 'u/panda314/classes_for_reed_muller_codes' of git://trac.sagemath.org/sage into classes_for_reed_muller_codes
e3f81ec Fixed errors in documentation, rewrote some sentences, changed formatting, made some extra changes according to the other reviewer's comments
comment:48 Changed 6 years ago by
Status:  needs_review → positive_review 

comment:49 Changed 6 years ago by
Status:  positive_review → needs_work 

Merge conflict, please merge in next beta.
comment:50 Changed 6 years ago by
Commit:  e3f81ec532cff5a5553f8abfddfc76c0a433c0e0 → 3cce07f24554b489738c721584a689df65d8655a 

Branch pushed to git repo; I updated commit sha1. New commits:
3cce07f  Updated to latest beta and fixed conflicts

comment:51 followup: 52 Changed 6 years ago by
Status:  needs_work → needs_review 

Merge conflict, please merge in next beta.
Done, ticket open for review.
David
comment:52 Changed 6 years ago by
Replying to dlucas:
Merge conflict, please merge in next beta.
Seems fine. Tests passed.
Done, ticket open for review.
David
comment:53 Changed 6 years ago by
Seems fine. Tests passed.
Cool!
Can you please set it to positive_review
then?
comment:54 Changed 6 years ago by
Status:  needs_review → positive_review 

comment:55 Changed 6 years ago by
Branch:  u/dlucas/classes_for_reed_muller_codes → 3cce07f24554b489738c721584a689df65d8655a 

Resolution:  → fixed 
Status:  positive_review → closed 
comment:56 Changed 6 years ago by
Commit:  3cce07f24554b489738c721584a689df65d8655a 

Reviewers:  David Lucas, Johan S. R. Nielsen → David Lucas, Johan Sebastian Rosenkilde Nielsen 
Replying to panda314:
New commits:
adding ReedMullerCode.py containing support for encoding of Reed Muller Codes
Merge branch 'RMCode' into t/20705/classes_for_reed_muller_codes