Opened 6 years ago
Closed 6 years ago
#19623 closed enhancement (fixed)
Syndrome decoder is not a syndrome decoder
Reported by:  dlucas  Owned by:  

Priority:  major  Milestone:  sage7.1 
Component:  coding theory  Keywords:  
Cc:  jsrn, cpernet, jlavauzelle  Merged in:  
Authors:  David Lucas  Reviewers:  Julien Lavauzelle, Johan Sebastian Rosenkilde Nielsen 
Report Upstream:  N/A  Work issues:  
Branch:  2bef528 (Commits, GitHub, GitLab)  Commit:  2bef528d821af2b822fbca47d4f4928f2b0131ad 
Dependencies:  Stopgaps: 
Description
The current implementation of syndrome decoder for linear codes actually uses a nearest neighbor decoding algorithm.
It should be rewritten.
Change History (49)
comment:1 Changed 6 years ago by
 Branch set to u/dlucas/generic_decoders
comment:2 Changed 6 years ago by
 Commit set to c834e3c48bf99a86762f61434190ec8aba202db4
 Milestone changed from sage6.10 to sage7.0
 Status changed from new to needs_review
comment:3 Changed 6 years ago by
 Commit changed from c834e3c48bf99a86762f61434190ec8aba202db4 to ac985d8e5f09b21465e3b805f406e8b4014a5a5e
Branch pushed to git repo; I updated commit sha1. New commits:
ac985d8  Update to 7.0rc1

comment:4 Changed 6 years ago by
I updated this ticket to the latest beta. This is still open for review.
comment:5 Changed 6 years ago by
 Commit changed from ac985d8e5f09b21465e3b805f406e8b4014a5a5e to fc79446d6948143d32a9065410453379c79ef2b2
comment:6 Changed 6 years ago by
I fixed a few bugs:
 the check related to lexicographic order in
_build_lookup_table
was wrong, as it ignored the weight of the vectors. Now the vector with the smallest weight is kept, and if both vectors have the same weight, the smallest in lexicographic order is kept.  I added an early termination condition on the loop to fill the table, which triggers if no vector was added to the table during a full iteration.
 I also added an early termination to
decode_to_code
: if received vector's syndrome equals to zero, it does not contain any error and thus can be returned as is.  Finally, I added checks over the input, especially, it is no longer possible to take
number_errors
>code.length()
.
This is still open for review,
David
comment:7 Changed 6 years ago by
Quick remark
 Since the covering radius of a linear code is always at most
nk
, does it make any sense to let the table search beyond that?
 Did you prove that if there are no coset leaders of weight
t
, then there are no coset leaders oft+1
? (i.e. that your early termination is sound)
comment:8 Changed 6 years ago by
Since the covering radius of a linear code is always at most nk, does it make any sense to let the table search beyond that?
Not really. I'll change this.
Did you prove that if there are no coset leaders of weight t, then there are no coset leaders of t+1? (i.e. that your early termination is sound)
It comes from theorem 1.12.6, prop. 5 (page 51) in Huffman and Pless
Theorem 1.12.6 Let C be an [n, k] code over F q . Let C* a code obtained from C by puncturing on some coordinate. The following hold:
(v) Assume that x is a coset leader of C. If x' ∈ (F_q)^n
all of whose nonzero components agree
with the same components of x, then x' is also a coset leader of C. In particular, if
there is a coset of weight s, there is also a coset of any weight less than s.
comment:9 followups: ↓ 10 ↓ 11 Changed 6 years ago by
 Cc jlavauzelle added
 Reviewers set to Julien Lavauzelle
 Status changed from needs_review to needs_work
I add some new remarks:
 I think you could remove your
_list_all_error_values
method. For each step of the main loop (i.e. for a given error weighti
), you only needPermutations(l, i)
, withl
equals toi
times the list of nonzero elements of GF(q). Thus you can save a lot of memory, using the both iterators onPermutations
and onSubsets
.  I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
if (e.hamming_weight() == e_cur.hamming_weight() and e < e_cur): stop = False lookup[s] = copy(e) elif e.hamming_weight() < e_cur.hamming_weight(): stop = False lookup[s] = copy(e)
 In
decode_to_code()
, you talk about decoding into the message space, but I think you meant into the code instead. In the same function, you may handle the cases.is_zero()
before building the lookup table.  I don't really understand the difference between
number_errors()
anddecoding_radius()
. Is it a tight bound on the maximum number of errors the decoder can decode ? Or just a way to remember what the user has set in thenumber_errors
parameter of the constructor ? Maybe you could make them more explicit, likeerror_max_weight
.  I also think that your description of the decoder ("correcting up to
number_errors
errors") could fool the user, asnumber_errors
may be set up ton
whereas the decoder will never decode an error of weightn
.
comment:10 in reply to: ↑ 9 ; followup: ↓ 12 Changed 6 years ago by
Replying to jlavauzelle:
 I also think that your description of the decoder ("correcting up to
number_errors
errors") could fool the user, asnumber_errors
may be set up ton
whereas the decoder will never decode an error of weightn
.
Indeed, we seem to have an issue with what decoding_radius
should mean. Should it here be min(number_errors, halftmindist)
, or should it just be number_errors
, with the understanding that it chooses the nearest neighbour?
This should also be reflected in the decoder_type
. (btw. is this even set for this decoder?!).
comment:11 in reply to: ↑ 9 ; followup: ↓ 13 Changed 6 years ago by
Replying to jlavauzelle:
I add some new remarks:
 I think you could remove your
_list_all_error_values
method. For each step of the main loop (i.e. for a given error weighti
), you only needPermutations(l, i)
, withl
equals toi
times the list of nonzero elements of GF(q). Thus you can save a lot of memory, using the both iterators onPermutations
and onSubsets
.
True enough. I'll made the changes.
 I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
if (e.hamming_weight() == e_cur.hamming_weight() and e < e_cur): stop = False lookup[s] = copy(e) elif e.hamming_weight() < e_cur.hamming_weight(): stop = False lookup[s] = copy(e)
I have the same feeling, I'll perform some tests to verify this (empirically), and I will remove these checks if they prove themselves useless.
 In
decode_to_code()
, you talk about decoding into the message space, but I think you meant into the code instead. In the same function, you may handle the cases.is_zero()
before building the lookup table.
Indeed, this is a copypaste mistake. Regarding the handling of is_zero
, I'm not sure that's the best behaviour. I somehow feel the user still wants the table to be built, as, after all, the user will use this decoder to actually correct errors. So, even if it's not used when decoding a vector with a zero syndrome, I still feel it should be built on the first decoding attempt, whichever the syndrome of the word to decode. I'm not strongly for this solution though, so if you think it's a bad idea, I'm ok to drop it.
 I don't really understand the difference between
number_errors()
anddecoding_radius()
. Is it a tight bound on the maximum number of errors the decoder can decode ? Or just a way to remember what the user has set in thenumber_errors
parameter of the constructor ? Maybe you could make them more explicit, likeerror_max_weight
.
Well, it's more a design choice that a question of bounds here, as number_errors
is a proper getter method for the number of errors chosen at construction time, while decoding_radius
is a method we have for every decoder (got from the abstract class Decoder
) and made for the user to catch the maximal number of errors the user's decoder can decode. So, the user should not use number_errors
, it's more the internal getter to avoid ugly direct calls like self._number_errors
. But you're right, it's somehow confusing... What about changing number_errors
into a private method?
 I also think that your description of the decoder ("correcting up to
number_errors
errors") could fool the user, asnumber_errors
may be set up ton
whereas the decoder will never decode an error of weightn
.
Indeed. That's why I added a note regarding this issue in the class' documentation. Rereading it, I should make it more explicit though, as it does not state clearly that if you go further a certain number of errors t
, you probably won't get the original codeword in which you got t
errors.
Regarding the descriptive message, well, I just wanted it to remind all the input parameters the user set. I'll try to find a better way to say it.
David
comment:12 in reply to: ↑ 10 ; followup: ↓ 14 Changed 6 years ago by
Replying to jsrn:
Replying to jlavauzelle:
 I also think that your description of the decoder ("correcting up to
number_errors
errors") could fool the user, asnumber_errors
may be set up ton
whereas the decoder will never decode an error of weightn
.Indeed, we seem to have an issue with what
decoding_radius
should mean. Should it here bemin(number_errors, halftmindist)
, or should it just benumber_errors
, with the understanding that it chooses the nearest neighbour?
Forget my previous answer, I vote for your solution : number errors returns _number_errors
as it was set at construction time, and decoding_radius
returns min(number_errors, halfmindist)
This should also be reflected in the
decoder_type
. (btw. is this even set for this decoder?!).
Yes it is, for now it's:
LinearCodeSyndromeDecoder._decoder_type = {"harddecision", "unique", "alwayssucceed", "complete"}
It's not shown in the diff because it was set like this before, and I did not change it.
comment:13 in reply to: ↑ 11 Changed 6 years ago by
Replying to dlucas:
 I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
if (e.hamming_weight() == e_cur.hamming_weight() and e < e_cur): stop = False lookup[s] = copy(e) elif e.hamming_weight() < e_cur.hamming_weight(): stop = False lookup[s] = copy(e)I have the same feeling, I'll perform some tests to verify this (empirically), and I will remove these checks if they prove themselves useless.
It's true, e_cur
will always be less than e
in the lexicographical order (including having lower weight).
I have a question: why is that loop written in this bizarre way? I mean, wy not do a for nerrors in range(1, t+1):
and then call break
within the loop if you need to jump out early?
I agree with Julian that _list_all_error_values
is memorywise inefficient.
But furthermore, I don't understand your use of Permutation
? What you need is cartesian_product
.
I think your decode_to_code
*modifies* the input vector without copying it. This is very bad!
Best, Johan
comment:14 in reply to: ↑ 12 Changed 6 years ago by
Replying to dlucas:
Replying to jsrn:
Replying to jlavauzelle:
 I also think that your description of the decoder ("correcting up to
number_errors
errors") could fool the user, asnumber_errors
may be set up ton
whereas the decoder will never decode an error of weightn
.Indeed, we seem to have an issue with what
decoding_radius
should mean. Should it here bemin(number_errors, halftmindist)
, or should it just benumber_errors
, with the understanding that it chooses the nearest neighbour?Forget my previous answer, I vote for your solution : number errors returns
_number_errors
as it was set at construction time, anddecoding_radius
returnsmin(number_errors, halfmindist)
Hmm, I'm not sure this is my solution ;) There are decoding methods which decode up to, say t > d/2 errors, but which might fail with low probability (interleaved codes, power decoding, ao.). Should decoding_radius
really return d/2
for all these? Perhaps it's the definition of decoding_radius
which should change. In this case it's "the maximal number of errors a received word can have and for which the decoder will always return a most likely codeword".
Yes it is, for now it's:
LinearCodeSyndromeDecoder._decoder_type = {"harddecision", "unique", "alwayssucceed", "complete"}It's not shown in the diff because it was set like this before, and I did not change it.
OK, I missed it when browsing the local file then. "complete" should only be
added if one sets number_errors
at least the covering radius of the code.
comment:15 Changed 6 years ago by
 Commit changed from fc79446d6948143d32a9065410453379c79ef2b2 to b2b9e8270a95e7478760532827be04cecb906510
Branch pushed to git repo; I updated commit sha1. New commits:
f1b61af  Update to 7.1beta0

026aa4d  Changed methods and docstrings related to errors

0f2ed54  Rewrote _list_all_error_values

23c06f6  Rewrote _build_lookup_table

4e13857  input is no longer modified in place in decode_to_code

b2b9e82  Fixed broken doctests and did a few small tweaks

comment:16 followup: ↓ 17 Changed 6 years ago by
Hello,
I changed the following in my code:
 I renamed
number_errors
asmaximum_error_weight
. I changed accordingly the description given by representation methods_latex_
and_repr_
. I also renamed its getter (fromnumber_errors
tomaximum_error_weight
), and changed its docstring. covering_radius
now returnsmin(maximum_error_weight, halfmindist
). I changed its docstring and added a note to explain it performs a minimum distance computation, which can be long... I rewrote
_list_all_error_values
, which now takes an input parameterweight
and builds only the exhaustive list of error values of weightweight
. It now usescartesian_product
instead ofPermutation
.  I rewrote
_build_lookup_table
, which now uses afor
loop. I removed the useless lexicographic order check. It still terminates early if no new pattern is added during a loop.  The input of
decode_to_code
is now copied before modification.  Finally, I fixed a few broken doctests, and removed the keyword
complete
indecoder_type
as computing the covering radius can only be done by a call to a method forcing the use of optional library Guava. Ticket #19913 proposes a newcovering_radius
method which has a Python alternative to Guava. Once this ticket (or #19913) gets reviewed, I'll do the appropriate updates wrt. covering radius in the remaining one.
Sorry about this poorly written code, it should be better now.
David
comment:17 in reply to: ↑ 16 Changed 6 years ago by
Hi
covering_radius
now returnsmin(maximum_error_weight, halfmindist
). I changed its docstring and added a note to explain it performs a minimum distance computation, which can be long...
You mean decoding_radius
I'm sure :) But you completely misunderstood what I wrote before, I'm afraid. By your updated docstring, then clearly decoding_radius
should return maximum_error_weight
.
I just had a pretty cool idea: When building the lookuptable you're actually *computing the minimum distance*! If t
is the lowest weight at which you detect a collision, i.e. see a syndrome that has already occurred before, then the minimum distance is 2t
or 2t1
(you can determine which it was by inspecting the weight of the errors that collided). In any case, "half the minimum distance decoding" will be t
errors.
If maximum_error_weight < t
then _build_lookup_table
will never see a collision, and so it can *know* that the minimum distance is greater. Hence, it can know it is a bounded distance decoder. If maximum_error_weight = t
, then a collision will be seen at this weight, and so the decoder can know it is a minimum distance decoder. If maximum_error_weight > t
, then it knows that it will sometimes fail, but it will mostly decode up to min(maximum_error_weight, covering_radius)
(see below). There should be some appropriate keywords in type
distinguishing these cases.
Once the infrastructure we talked about previously is in place, the Code
could be informed of this minimum distance.
In the same vein, you're *also* computing the covering radius. For if w+1
is the first weight where *every* error weight collides with a lowerweight error, then w
is the covering radius.
Now, if maximum_error_weight < w
, then _build_lookup_table
will never see this event, and so it clearly doesn't determine the covering radius. In that case, maximum_error_weight
is truly the decoding radius (except that if maximum_error_weight >= d/2$
then it might sometime return another minimum weight error, and not the one that was added).
If maximum_error_weight > w
, then we *know* that incorrect decoding will be performed for *all* error patterns of weight > w
. So in maximum_error_weight()
and the string representations, and elsewhere, should we really return the usersupplied maximum_error_weight
and not w
? It's not terribly important to me, except that it should be possible for the user to retrieve w
(possibly by informing Code
of this covering radius and the user then calls Code.covering_radius
). I would, though, posit that in case maximum_error_weight
is used instead of w
, then maximum_error_weight
should be capped at nk
at construction time. Higher than that clearly makes no sense.
 I rewrote
_list_all_error_values
, which now takes an input parameterweight
and builds only the exhaustive list of error values of weightweight
. It now usescartesian_product
instead ofPermutation
.
There's no need to return a list
. Just return the cartesian_product
object. And in the inner for
loop of _build_lookup_table
, instead of looping over j
, do a for error in error_values_of_weight[t]
or something.
In Python, a list
is concretely instantiated and takes space in memory. A
generating_expression
takes next to no memory. When you can, use
generating_expression
.
We're down to aesthetics now, but wouldn't it be nicer to inline _list_all_error_values
so you don't need to construct l
time and again? You could e.g. make a generating expression which generates the generating expressions:
error_position_tables = [ cartesian_product([l]*weight) for weight in range(maximum_error_weight) ]
 Finally, I fixed a few broken doctests, and removed the keyword
complete
indecoder_type
as computing the covering radius can only be done by a call to a method forcing the use of optional library Guava. Ticket #19913 proposes a newcovering_radius
method which has a Python alternative to Guava. Once this ticket (or #19913) gets reviewed, I'll do the appropriate updates wrt. covering radius in the remaining one.
By my arguments above, you can add complete
whenever the covering radius was determined during _build_lookup_table
.
Also, I just noticed that __eq__
is wrong: it doesn't compare maximum_error_weight
.
Best, Johan
comment:18 Changed 6 years ago by
Thank you for your comments.
But now another problem arises, namely how to design a generic mechanism to inform codes about some of their "properties".
Consider you have a code and you perform syndrome decoding.
If you already computed its minimum distance, you can fill decoder_type
at construction time. If you did not, you will fill it while building the table of syndromes and you have to inform the code that it now knows its minimum distance.
This can be done for instances of LinearCode
as they have a class argument minimum_distance
but how to do that properly for other codes (e.g. GeneralizedReedSolomonCode
) that do not possess this class parameter?
The same issue arises with covering radius, but for any code this time, as there is no class with a _covering_radius
parameter.
I might again be completely missing the spot here, but it does not appear to me as a simple mechanism to implement, especially if we want something generic. I'm working on it, so for now
I did not change maximum_error_weight
nor decoding_radius
.
All the other requested changes has been made. Leaving this in needs_work
for now.
comment:19 Changed 6 years ago by
 Commit changed from b2b9e8270a95e7478760532827be04cecb906510 to 54c02c3af22e896dd03759408b989f493415f26a
comment:20 Changed 6 years ago by
Hi David,
I agree, David, that the "informing the code"feature is advanced and shouldn't be done in this ticket. Sorry for not making that clear.
So, firstly, I want to say that my comments actually make sense even as a completely internal mechanism of SyndromeDecoder
. That is, harvesting the true minimum distance and covering radius while the lookuptable is being build enables the decoder to give useful feedback to the user, such as its decoding radius and a more precise decoder type.
Secondly, I hope you realise that this is "just" yet another example of the usefulness of a general mechanism for specifying upper and lowerbounds on the minimum distance of a code, and it shows that, indeed as could have been expected, that minimum distance is *not* special in this regard: there are other parameters for which me might want such a mechanism (in this case, covering radius).
To be more precise, say we design a general system for informing a LinearCode
that new knowledge has been obtained on a property it has. It could be something like (and now I'm totally making this up on the spot):
`LinearCode._update_property_bound(property_name, update_type, value)
With such a mechanism, SyndromeDecoder
would call
`self._code._update_property_bound("minimum_distance", "upper", t) `self._code._update_property_bound("minimum_distance", "lower", t)
or simply
`self._code._update_property_bound("minimum_distance", "exact", t)
Then the function minimum_distance
should be changed accordingly. And various functions for computing or bounding the minimum distance should also be changed accordingly.
As we have discussed before, we should be very careful not completely overengineering this. I think it might be useful to start compiling a list of examples where such bound computation is possible and could be useful.
comment:21 followup: ↓ 22 Changed 6 years ago by
So, firstly, I want to say that my comments actually make sense even as a completely internal mechanism of SyndromeDecoder?. That is, harvesting the true minimum distance and covering radius while the lookuptable is being build enables the decoder to give useful feedback to the user, such as its decoding radius and a more precise decoder type.
Of course they are, I'm perfectly aware of that! I was not criticising the usefulness of it, I was just saying it seems quite difficult and a bit out of purpose in this ticket.
Secondly, I hope you realise that this is "just" yet another example of the usefulness of a general mechanism for specifying upper and lowerbounds on the minimum distance of a code, and it shows that, indeed as could have been expected, that minimum distance is *not* special in this regard: there are other parameters for which me might want such a mechanism (in this case, covering radius).
Definitely. dimension
should be added to this list of lower and upperbounds...
I realized that while working on shortened codes (and subfield subcodes as well). Well, this mechanism is actually useful with any code manipulation class.
For the mechanism, why not working on a set of decorators? Or maybe a generic decorator with arguments, like bound
which can be set to lower
, upper
or exact
and an argument parameter
, which can be set to minimum_distance
, dimension
or covering_radius
?
Once properly implemented, it should be easy to use, and very flexible (adding new parameters would be very simple).
Note it's just an idea in the wind, I'm definitely no expert in decorators, I just read the documentation in the bus 20 min ago ;)
But it seems feasible. Anyway, I opened a followup ticket #19959.
comment:22 in reply to: ↑ 21 Changed 6 years ago by
Of course they are, I'm perfectly aware of that! I was not criticising the usefulness of it, I was just saying it seems quite difficult and a bit out of purpose in this ticket.
You were saying a lot on how the codeparameter mechanism had issues and was difficult, and possibly implied that it was out of place. You didn't comment on the fact that my suggestions still made sense for SyndromeDecoder
completely isolatedly.
But I do realise that this ended up being a larger kettle of fish than we originally intended. In case you're against implementing the decoding radius and covering radius thing, *internally in SyndromeDecoder?* for now, I would say the only sensible semantics/documentation are:
decoding_radius
returnsmaximum_error_weight
. It's main docstring remains as it is, while the Note is, of course, removed. Possibly add a Note saying something like "When correcting beyond half the minimum distance, correct unique decoding is not always possible. This decoder returns a codeword of minimum distance to the received. In particular, ifmaximum_error_weight
is beyond the covering radius of the code, decoding of this many errors is guaranteed to be incorrect.".
maximum_error_weight
is capped atnk
in the constructor.
On an unrelated note, I also noticed: The docstring of SyndromeDecoder
should explain in 23 lines what the decoder does. There's a "a" missing before "syndrome" in the first line.
Definitely.
dimension
should be added to this list of lower and upperbounds...
Yes, I remember now.
For the mechanism, why not working on a set of decorators? Or maybe a generic decorator with arguments, like
bound
which can be set tolower
,upper
orexact
and an argumentparameter
, which can be set tominimum_distance
,dimension
orcovering_radius
? Once properly implemented, it should be easy to use, and very flexible (adding new parameters would be very simple).Note it's just an idea in the wind, I'm definitely no expert in decorators, I just read the documentation in the bus 20 min ago
;)
But it seems feasible. Anyway, I opened a followup ticket #19959.
Hmm, yes it's an interesting idea. I could imagine that the decorator be backed by some functions that should be callable directly. What we discussed on this ticket would likely use those functions and not a decorator (since SyndromeDecoder
is another object than the code itself, and since _build_lookup_table
has the potential to discover 0, 1 or 2 parameters of the Code
.
Best, Johan
comment:23 Changed 6 years ago by
Sorry if wrote my answer in a confusing manner, I'm not against implementing the mechanism to discover the minimum distance and the covering radius and dynamically update decoder type, I was against implementing the generic mechanism here. Your suggestions definitely make sense here, I just jumped directly into the bigger picture instead of focusing on this instance of the problem. My bad! I think we actually agree on this. I'll do that in the morrow.
Hmm, yes it's an interesting idea. I could imagine that the decorator be backed by some functions that should be callable directly.
That's what I have in mind. I read more carefully the documentation, it seems interesting to do, and definitely possible. The reason I thought about decorators in the first place is because, we, as usual, want a mechanism which has the best genericity/simplicity ratio. And with a decorator, if one wants to implement his own methods, one only has to add a decorator over some specific methods, which is definitely simple! Well, if you (or Julien, or anyone) can see a better mechanism, please tell me!
David
comment:24 Changed 6 years ago by
Another question:
For now, _build_lookup_table
is called on the first decoding attempt, and not at construction time.
I made this choice, because, as a user, I prefer objects I use to be constructed quickly, and then spend some time when calling methods over this objects rather than waiting at construction time. It's a matter of taste I guess. Also, I might be working in a very strange way for which I need to construct decoders but not using their decoding methods. From this point of view, the faster the better.
Anyway, here we have the following behaviour:
 I build my decoder.
 If I call directly
maximum_error_weight()
,decoding_radius()
or representation methods, it will return me results based on either the value I chose formaximum_error_weight
ornk
depending on which is the lowest.  Then I perform some decoding, call the methods I listed above again, and their output might have change... The same occurs with
decoder_type()
.
From the user point of view, it can be quite confusing (imho).
So, wouldn't it be better to build the lookup table at construction time, so "internal" minimum_distance
and covering_radius
might be found (if possible) and results updated accordingly?
In that configuration, whenever I call representations methods, decoder_type
and so on, their output will remain the same.
What do you think?
comment:25 Changed 6 years ago by
 Commit changed from 54c02c3af22e896dd03759408b989f493415f26a to 4d70e2c87e7c2e50171ee24ca7c6200ff29d0250
Branch pushed to git repo; I updated commit sha1. New commits:
4d70e2c  Implemented internal discovery of code parameters and autoupdate of decoder types

comment:26 followup: ↓ 27 Changed 6 years ago by
 Milestone changed from sage7.0 to sage7.1
 Status changed from needs_work to needs_review
Hello,
I did the required changes:
covering_radius
andminimum_distance
for the code are now internally saved whenever found,decoder_type
is updated accordingly,decoding_radius
andmaximum_error_weight
likewise. The lookup table is now computed at construction time to solve the problem presented in comment 24.
 I also added
"dynamic"
to the list of types for this decoder.
I made some naming choices for the new decoder types here, which I like you to discuss, especially "might_create_decoding_error"
which is awful. I did not have a better idea, sorry...
I commented my code as best as possible, because the chain of if
statements at the end of _build_lookup_table
is pretty ugly.
I chose to not write getter methods for the internal minimum_distance
and covering_radius
, because depending on how far it went into the computation, it can fill these fields  or not. I preferred these to be properly filled when #19959 will be ready instead of implementing something clumsy. But I think this is somehow bizarre, as we might get useful information and not pass it to the user. This choice is of course open to discussion.
By the way, this question is more or less related, does anyone remember what a complete
decoder is? I cannot remember this... Which once again proves I need to write a proper list of these types in the introductory thematic tutorial.
Best,
David
comment:27 in reply to: ↑ 26 Changed 6 years ago by
Hi,
Sounds cool! I'll take a closer look later on.
I made some naming choices for the new decoder types here, which I like you to discuss, especially
"might_create_decoding_error"
which is awful. I did not have a better idea, sorry...By the way, this question is more or less related, does anyone remember what a
complete
decoder is? I cannot remember this... Which once again proves I need to write a proper list of these types in the introductory thematic tutorial.
Yeah, might_create_decoding_error
is pretty bad :) complete
refers to the fact that it decodes every word in the ambient space. It's more or less a synonym of one of the meanings of "maximumlikelihood", I guess, whenever the decoder is also unique
. I.e. Syndrome decoder is complete
when the decoding radius is the covering radius.
There is a typediscussion on this Issue on our bitbucket repo: https://bitbucket.org/lucasdavid/sage_coding_project/issues/40/decoderintrospection
The decoder you're doing now is similar to Power decoding of RS codes (which I gave types harddecision
, unique
, mightfail
): both decoders give a unique answer and return a closest codeword. However, Power decoding "might fail" in the sense that it doesn't give *any answer at all*. While SyndromeDecoder? "might fail" in the sense that it gives you *another* close codeword than what you sent. As opposed to this, GuruswamiSudan "always succeeds" in the sense that *your* codeword is guaranteed to be on the list of outputs (up to the decoding radius).
We therefore seem to have an issue with what we mean by "mightfail" in case of decoding beyond d/2. Perhaps "mighterror" is a good choice, to mirror the "decoding failure/decoding error" terminology of coding theory.
Yes, a list should be made of the "common" types (all we have used so far) of decoders, and an explanation. In the docstring of decoder_type
, there should be a clear explanation on how to look up what they mean. (another ticket of course). Perhaps even in the docstring for decoder_type
, if the explanations can be made short enough, since the function is always the original Decoder.decoder_type
.
Best, Johan
comment:28 followup: ↓ 29 Changed 6 years ago by
Okay, thanks!
Yes, a list should be made of the "common" types (all we have used so far) of decoders, and an explanation. In the docstring of decoder_type, there should be a clear explanation on how to look up what they mean. (another ticket of course). Perhaps even in the docstring for decoder_type, if the explanations can be made short enough, since the function is always the original Decoder.decoder_type.
Mmhh, I'm not sure i'll be able to keep it short, especially as it might grow when we will implement new decoders. I'll give it a try, and if it's too long, I will add an extra paragraph dedicated to decoders and types in #19897.
comment:29 in reply to: ↑ 28 Changed 6 years ago by
Mmhh, I'm not sure i'll be able to keep it short, especially as it might grow when we will implement new decoders. I'll give it a try, and if it's too long, I will add an extra paragraph dedicated to decoders and types in #19897.
Sure, the *list* will not be short. But each individual description might be. Something like: complete: decodes any vector in the ambient space
. dynamic: the decoder's type will depend on its input parameters
. unique: returns a single, closest codeword
. Nicely formatted in a table :P
comment:30 followup: ↓ 31 Changed 6 years ago by
Well, I just updated #19897 (thematic tutorial) to put the list of types in there and saw your answer just after that...
If you think I should move it to decoder_type
, I can!
David
comment:31 in reply to: ↑ 30 Changed 6 years ago by
comment:32 Changed 6 years ago by
 Commit changed from 4d70e2c87e7c2e50171ee24ca7c6200ff29d0250 to 590d18ccf6fdecd4d86aeb7e7d95c9d7a2502b4f
Branch pushed to git repo; I updated commit sha1. New commits:
590d18c  Changed decoder types

comment:33 Changed 6 years ago by
It's changed in #19897
I changed the terrible type name by mighterror
as suggested. I also removed alwayssucceed
from the default set of types, as in the case mighterror
occurs it cannot always succeed...
It's now dynamically added to the relevant cases.
And I replaced all underscores by hyphen in type names for consistency purposes.
David
New commits:
590d18c  Changed decoder types

comment:34 Changed 6 years ago by
 Branch changed from u/dlucas/generic_decoders to u/jsrn/generic_decoders
comment:35 Changed 6 years ago by
 Commit changed from 590d18ccf6fdecd4d86aeb7e7d95c9d7a2502b4f to 8a52ca1f60a52176fd588f76e4658ad549c7f249
 Status changed from needs_review to needs_work
OK, so I looked through the new version of the ticket, and I took the liberty of patching various things directly.
 I think the semantics that
decoding_radius
returnsd/2
when you decode way beyondd/2
makes no sense. I've changeddecoding_radius
to have same semantics asmaximum_error_weight
, and I've removedself._decoding_radius
.
 I changed a number of docs and elaborated on the description of the decoder. I added a note that it's exponential time. I didn't check typesetting of the compiled docs, sorry.
 I renamed
self.lookup_table
toself.syndrome_table
. Hope you agree with me.
 I fixed how the types are dynamically assigned. It was done weirdly before.
 I fixed several bugs in your bynow quite overworked
_build_lookup_table
. There were bugs related to both the minimum distance and covering radius, leading to catastrophic behaviour in the final decoder. Firstly, since you hadn't added the zerosyndrome to your table, both early termination and determination of minimum distance was buggy. I added this to the table. Secondly, your early termination didn't actually break out of the loop. Consequently, you never terminated early and you usually did not correctly determine the covering radius. This was blatantly obvious in the doctest you used everywhere, since the covering radius for that code is actually 1. This leaves all the doctests currently failing. I didn't fix it properly since I realised that a better example is probably called for, and I didn't have time to do that myself now.
 Probably the above calls for some doctests that actually tests the minimumdistance determining and coveringradius determining behaviour. I leave it to you to be the judge of that.
Please, try to test your code and sanity check the output before you put it up for review. This is getting bothersome...
New commits:
510cf23  Merge branch 'u/dlucas/generic_decoders' of git://trac.sagemath.org/sage into 19623_syndrome_decoder

f2b8714  Rm if that will never happen

3158728  Modifications to doc strings

7e8473c  Fix decoder type handling and corner case wrt. allzerosyndrome

88f1824  Fix the decoding radius in examples

8a52ca1  Actually early terminate in the early termination

comment:36 Changed 6 years ago by
 Branch changed from u/jsrn/generic_decoders to u/dlucas/generic_decoders
comment:37 Changed 6 years ago by
 Commit changed from 8a52ca1f60a52176fd588f76e4658ad549c7f249 to 60ee58d759de7d680415e896e54088ce3b4baf6c
I changed the doctests to something better.
I also added a whole paragraph to explain how the dynamic types worked, illustrating the creation of three syndrome decoders over the same code, one with maximum_error_weight
less than both the covering radius and half the minimum distance, one with maximum_error_rate
between the two values, and one with maximum_error_rate
bigger than the two values.
Finally, I also changed union
by add
when adding a type to _decoder_type
, because for some reason, union
was doing nothing.
New commits:
60ee58d  Better doctests, changed union to add when adding types to _decoder_type

comment:38 Changed 6 years ago by
 Branch changed from u/dlucas/generic_decoders to u/jsrn/generic_decoders
comment:39 Changed 6 years ago by
 Commit changed from 60ee58d759de7d680415e896e54088ce3b4baf6c to 1e738f3bb1fce15e840015d4f1909cef41909e8a
Branch pushed to git repo; I updated commit sha1. New commits:
1e738f3  Some docstring typography

comment:40 Changed 6 years ago by
 Reviewers changed from Julien Lavauzelle to Julien Lavauzelle, Johan Sebastian Rosenkilde Nielsen
 Status changed from needs_work to needs_review
OK, looks good. I fixed some phrasings and some typos in the docstring. If you can accept them, set the ticket to positive review :)
comment:41 Changed 6 years ago by
 Status changed from needs_review to positive_review
I accept your changes, positive reviewed!
David
comment:42 Changed 6 years ago by
Before you close the ticket (it is not long enough ;)), just a minor remark: your documentation of decode_to_code
talks about decoding into the "message space" instead of the code.
Otherwise, really good job :)
comment:43 Changed 6 years ago by
 Branch changed from u/jsrn/generic_decoders to u/dlucas/generic_decoders
comment:44 Changed 6 years ago by
 Commit changed from 1e738f3bb1fce15e840015d4f1909cef41909e8a to 642dca41fd2aa2621d0756e17282de62fcfb0620
Dammit, you're right, I thought I fixed this long ago... Well, nice catch, it's fixed now ;)
New commits:
642dca4  Fixed typo in decode_to_code

comment:45 Changed 6 years ago by
 Status changed from positive_review to needs_work
Something is broken in combination with #19897
sage t long src/doc/en/thematic_tutorials/coding_theory.rst ********************************************************************** File "src/doc/en/thematic_tutorials/coding_theory.rst", line 253, in doc.en.thematic_tutorials.coding_theory Failed example: c_dec = C.decode_to_code(r) Exception raised: Traceback (most recent call last): File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 496, in _run self.compile_and_execute(example, compiler, test.globs) File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/doctest/forker.py", line 858, in compile_and_execute exec(compiled, globs) File "<doctest doc.en.thematic_tutorials.coding_theory[29]>", line 1, in <module> c_dec = C.decode_to_code(r) File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/coding/linear_code.py", line 1656, in decode_to_code D = self.decoder(decoder_name, **kwargs) File "sage/misc/cachefunc.pyx", line 1909, in sage.misc.cachefunc.CachedMethodCaller.__call__ (/home/buildslavesage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:9941) w = self._instance_call(*args, **kwds) File "sage/misc/cachefunc.pyx", line 1785, in sage.misc.cachefunc.CachedMethodCaller._instance_call (/home/buildslavesage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:9408) return self.f(self._instance, *args, **kwds) File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/coding/linear_code.py", line 1739, in decoder D = decClass(self, **kwargs) File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/coding/linear_code.py", line 4173, in __init__ self._lookup_table = self._build_lookup_table() File "sage/misc/cachefunc.pyx", line 2235, in sage.misc.cachefunc.CachedMethodCallerNoArgs.__call__ (/home/buildslavesage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:12406) self.cache = f(self._instance) File "/home/buildslavesage/slave/sage_git/build/local/lib/python2.7/sitepackages/sage/coding/linear_code.py", line 4298, in _build_lookup_table lookup[s] = copy(e) MemoryError **********************************************************************
comment:46 Changed 6 years ago by
 Commit changed from 642dca41fd2aa2621d0756e17282de62fcfb0620 to 2bef528d821af2b822fbca47d4f4928f2b0131ad
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
cb65314  Update to latest beta

2057fee  Small fixes and improvements

d9bcb28  Update to 7.1beta0

70e5211  Added ..linkall keyword and wrote a new paragraph on decoders and introspections

4035292  Moved and fixed list of decoder types

b443718  Update to 7.1beta1

61d49cc  Rewrote some sentences and fixed typos

12bf512  Removed table of types

564dcdd  Merge branch 'introductory_thematic_tutorial' into generic_decoders

2bef528  Picked a smaller code in thematic tutorial to fix examples

comment:47 Changed 6 years ago by
 Status changed from needs_work to needs_review
I identified the problem, it was because of a [12, 4] GRS code over GF(13)
in my examples.
Decoding over this code using the new syndrome decoder required to build a huge syndrome lookup table, hence the memory error.
I replaced it by a [6, 3] code over GF(7)
, so it should be fine now.
Reopening for review...
David
comment:48 Changed 6 years ago by
 Status changed from needs_review to positive_review
Hah, that's pretty funny: before, the syndrome decoder was actually a nearest neighbor decoder, and therefore ran in q^k
. Now, it's really a syndrome decoder and runs in q^(nk)
, which is too big for the previous example in the tutorial.
Perhaps, a linear code should default to using syndrome or nearest neighbor depending on which of q^(nk)
or q^k
is biggest?
For this ticket: green again.
comment:49 Changed 6 years ago by
 Branch changed from u/dlucas/generic_decoders to 2bef528d821af2b822fbca47d4f4928f2b0131ad
 Resolution set to fixed
 Status changed from positive_review to closed
I replaced the old "syndrome" decoder by a new implementation which creates a lookup table for syndromes.
This ticket is now open for review.
New commits:
WIP
Fixed bug in static error rate channel
Merge branch 'fix_in_static_error_rate_channel' into generic_decoders
Working syndrome decoder
Updated to latest beta
Cleaned code and fixed broken doctests