Opened 6 years ago

Closed 6 years ago

#19623 closed enhancement (fixed)

Syndrome decoder is not a syndrome decoder

Reported by: dlucas Owned by:
Priority: major Milestone: sage-7.1
Component: coding theory Keywords:
Cc: jsrn, cpernet, jlavauzelle Merged in:
Authors: David Lucas Reviewers: Julien Lavauzelle, Johan Sebastian Rosenkilde Nielsen
Report Upstream: N/A Work issues:
Branch: 2bef528 (Commits, GitHub, GitLab) Commit: 2bef528d821af2b822fbca47d4f4928f2b0131ad
Dependencies: Stopgaps:


Description

The current implementation of syndrome decoder for linear codes actually uses a nearest neighbor decoding algorithm.

It should be rewritten.

Change History (49)

comment:1 Changed 6 years ago by dlucas

  • Branch set to u/dlucas/generic_decoders

comment:2 Changed 6 years ago by dlucas

  • Commit set to c834e3c48bf99a86762f61434190ec8aba202db4
  • Milestone changed from sage-6.10 to sage-7.0
  • Status changed from new to needs_review

I replaced the old "syndrome" decoder by a new implementation which creates a lookup table for syndromes.

This ticket is now open for review.
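For concreteness, the lookup-table approach can be sketched in plain Python over GF(2) (an illustrative toy only, not the Sage implementation; all names here are made up):

```python
from itertools import combinations

def syndrome(H, v):
    # Syndrome of v w.r.t. the parity-check matrix H (a list of rows), over GF(2).
    return tuple(sum(h[i] * v[i] for i in range(len(v))) % 2 for h in H)

def build_lookup_table(H, n, t):
    # Map each syndrome to a minimal-weight error pattern of weight <= t.
    # Enumerating weights in increasing order guarantees minimality.
    table = {syndrome(H, [0] * n): tuple([0] * n)}
    for weight in range(1, t + 1):
        for support in combinations(range(n), weight):
            e = [0] * n
            for p in support:
                e[p] = 1
            s = syndrome(H, e)
            if s not in table:  # the first pattern seen has minimal weight
                table[s] = tuple(e)
    return table

def decode_to_code(H, table, r):
    # Correct r by adding (= subtracting, over GF(2)) the stored error pattern.
    e = table[syndrome(H, r)]
    return tuple((ri + ei) % 2 for ri, ei in zip(r, e))
```

For example, with the [7, 4] Hamming code's parity-check matrix, a single flipped bit in a codeword is corrected by one table lookup.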


New commits:

2e5785c  WIP
634b762  Fixed bug in static error rate channel
945a333  Merge branch 'fix_in_static_error_rate_channel' into generic_decoders
84bf28f  Working syndrome decoder
5584abe  Updated to latest beta
c834e3c  Cleaned code and fixed broken doctests

comment:3 Changed 6 years ago by git

  • Commit changed from c834e3c48bf99a86762f61434190ec8aba202db4 to ac985d8e5f09b21465e3b805f406e8b4014a5a5e

Branch pushed to git repo; I updated commit sha1. New commits:

ac985d8  Update to 7.0rc1

comment:4 Changed 6 years ago by dlucas

  • Authors set to David Lucas

I updated this ticket to the latest beta. This is still open for review.

comment:5 Changed 6 years ago by git

  • Commit changed from ac985d8e5f09b21465e3b805f406e8b4014a5a5e to fc79446d6948143d32a9065410453379c79ef2b2

Branch pushed to git repo; I updated commit sha1. New commits:

ed46ce8  Update to 7.0
b7f0688  Fixed check related to lexicographic order
25bb241  Added early stop condition
fc79446  Added checks over the input

comment:6 Changed 6 years ago by dlucas

I fixed a few bugs:

  • the check related to lexicographic order in _build_lookup_table was wrong, as it ignored the weight of the vectors. Now the vector with the smallest weight is kept, and if both vectors have the same weight, the smallest in lexicographic order is kept.
  • I added an early termination condition on the loop to fill the table, which triggers if no vector was added to the table during a full iteration.
  • I also added an early termination to decode_to_code: if the received vector's syndrome is zero, the vector is already a codeword and can be returned as is.
  • Finally, I added checks over the input; in particular, it is no longer possible to set number_errors > code.length().
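The tie-breaking rule in the first bullet can be captured in a small predicate (an illustrative sketch; the name is hypothetical):

```python
def should_replace(e_new, e_cur):
    # Keep the lowest-weight error pattern; break ties lexicographically.
    w_new = sum(1 for x in e_new if x != 0)
    w_cur = sum(1 for x in e_cur if x != 0)
    return w_new < w_cur or (w_new == w_cur and e_new < e_cur)
```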

This is still open for review,

David

comment:7 Changed 6 years ago by jsrn

Quick remark

  • Since the covering radius of a linear code is always at most n-k, does it make any sense to let the table search beyond that?
  • Did you prove that if there are no coset leaders of weight t, then there are no coset leaders of t+1? (i.e. that your early termination is sound)

comment:8 Changed 6 years ago by dlucas

Since the covering radius of a linear code is always at most n-k, does it make any sense to let the table search beyond that?

Not really. I'll change this.

Did you prove that if there are no coset leaders of weight t, then there are no coset leaders of t+1? (i.e. that your early termination is sound)

It comes from theorem 1.12.6, prop. 5 (page 51) in Huffman and Pless

Theorem 1.12.6 Let C be an [n, k] code over F_q. Let C* be a code obtained from C by puncturing on some coordinate. The following hold:

(v) Assume that x is a coset leader of C. If x' ∈ (F_q)^n is a vector all of whose nonzero components agree with the same components of x, then x' is also a coset leader of C. In particular, if there is a coset leader of weight s, there is also a coset leader of every weight less than s.

Last edited 6 years ago by dlucas (previous) (diff)

comment:9 follow-ups: Changed 6 years ago by jlavauzelle

  • Cc jlavauzelle added
  • Reviewers set to Julien Lavauzelle
  • Status changed from needs_review to needs_work

I add some new remarks:

  • I think you could remove your _list_all_error_values method. For each step of the main loop (i.e. for a given error weight i), you only need Permutations(l, i), with l equal to i copies of the list of non-zero elements of GF(q). Thus you can save a lot of memory by using both the iterators on Permutations and on Subsets.
  • I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
    if (e.hamming_weight() == e_cur.hamming_weight()
            and e < e_cur):
        stop = False
        lookup[s] = copy(e)
    elif e.hamming_weight() < e_cur.hamming_weight():
        stop = False
        lookup[s] = copy(e)
    
  • In decode_to_code(), you talk about decoding into the message space, but I think you meant into the code instead. In the same function, you may handle the case s.is_zero() before building the lookup table.
  • I don't really understand the difference between number_errors() and decoding_radius(). Is it a tight bound on the maximum number of errors the decoder can decode ? Or just a way to remember what the user has set in the number_errors parameter of the constructor ? Maybe you could make them more explicit, like error_max_weight.
  • I also think that your description of the decoder ("correcting up to number_errors errors") could fool the user, as number_errors may be set up to n whereas the decoder will never decode an error of weight n.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by jsrn

Replying to jlavauzelle:

  • I also think that your description of the decoder ("correcting up to number_errors errors") could fool the user, as number_errors may be set up to n whereas the decoder will never decode an error of weight n.

Indeed, we seem to have an issue with what decoding_radius should mean. Should it here be min(number_errors, half-min-dist), or should it just be number_errors, with the understanding that it chooses the nearest neighbour?

This should also be reflected in the decoder_type. (btw. is this even set for this decoder?!).

comment:11 in reply to: ↑ 9 ; follow-up: Changed 6 years ago by dlucas

Replying to jlavauzelle:

I add some new remarks:

  • I think you could remove your _list_all_error_values method. For each step of the main loop (i.e. for a given error weight i), you only need Permutations(l, i), with l equal to i copies of the list of non-zero elements of GF(q). Thus you can save a lot of memory by using both the iterators on Permutations and on Subsets.

True enough. I'll make the changes.

  • I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
    if (e.hamming_weight() == e_cur.hamming_weight()
            and e < e_cur):
        stop = False
        lookup[s] = copy(e)
    elif e.hamming_weight() < e_cur.hamming_weight():
        stop = False
        lookup[s] = copy(e)
    

I have the same feeling, I'll perform some tests to verify this (empirically), and I will remove these checks if they prove themselves useless.

  • In decode_to_code(), you talk about decoding into the message space, but I think you meant into the code instead. In the same function, you may handle the case s.is_zero() before building the lookup table.

Indeed, this is a copy-paste mistake. Regarding the handling of is_zero, I'm not sure that's the best behaviour. I somehow feel the user still wants the table to be built since, after all, the user will use this decoder to actually correct errors. So, even if the table is not used when decoding a vector with a zero syndrome, I still feel it should be built on the first decoding attempt, whatever the syndrome of the word to decode. I'm not strongly for this solution though, so if you think it's a bad idea, I'm ok to drop it.

  • I don't really understand the difference between number_errors() and decoding_radius(). Is it a tight bound on the maximum number of errors the decoder can decode ? Or just a way to remember what the user has set in the number_errors parameter of the constructor ? Maybe you could make them more explicit, like error_max_weight.

Well, it's more a design choice than a question of bounds here: number_errors is a proper getter method for the number of errors chosen at construction time, while decoding_radius is a method we have for every decoder (inherited from the abstract class Decoder) meant to let the user retrieve the maximal number of errors the decoder can decode. So the user should not use number_errors; it's more an internal getter to avoid ugly direct calls like self._number_errors. But you're right, it's somewhat confusing... What about changing number_errors into a private method?

  • I also think that your description of the decoder ("correcting up to number_errors errors") could fool the user, as number_errors may be set up to n whereas the decoder will never decode an error of weight n.

Indeed. That's why I added a note regarding this issue in the class' documentation. Rereading it, I should make it more explicit though, as it does not state clearly that if you go beyond a certain number of errors t, you probably won't get back the original codeword to which t errors were added.

Regarding the descriptive message, well, I just wanted it to recall all the input parameters the user set. I'll try to find a better way to say it.

David

comment:12 in reply to: ↑ 10 ; follow-up: Changed 6 years ago by dlucas

Replying to jsrn:

Replying to jlavauzelle:

  • I also think that your description of the decoder ("correcting up to number_errors errors") could fool the user, as number_errors may be set up to n whereas the decoder will never decode an error of weight n.

Indeed, we seem to have an issue with what decoding_radius should mean. Should it here be min(number_errors, half-min-dist), or should it just be number_errors, with the understanding that it chooses the nearest neighbour?

Forget my previous answer, I vote for your solution: number_errors returns _number_errors as it was set at construction time, and decoding_radius returns min(number_errors, half-min-dist).

This should also be reflected in the decoder_type. (btw. is this even set for this decoder?!).

Yes it is, for now it's:

LinearCodeSyndromeDecoder._decoder_type = {"hard-decision", "unique", "always-succeed", "complete"}

It's not shown in the diff because it was set like this before, and I did not change it.

comment:13 in reply to: ↑ 11 Changed 6 years ago by jsrn

Replying to dlucas:

  • I'm not completely sure of what I'm saying now, but you can notice that the ordering induced by the "composition" of these iterators respects the one you want for updating the table (i.e. first hamming weight, then lexicographic order). So in theory, you might never enter these blocks:
    if (e.hamming_weight() == e_cur.hamming_weight()
            and e < e_cur):
        stop = False
        lookup[s] = copy(e)
    elif e.hamming_weight() < e_cur.hamming_weight():
        stop = False
        lookup[s] = copy(e)
    

I have the same feeling, I'll perform some tests to verify this (empirically), and I will remove these checks if they prove themselves useless.

It's true, e_cur will always be less than e in the lexicographical order (including having lower weight).

I have a question: why is that loop written in this bizarre way? I mean, why not do a for nerrors in range(1, t+1): and then call break within the loop if you need to jump out early?

I agree with Julien that _list_all_error_values is memory-wise inefficient. But furthermore, I don't understand your use of Permutation: what you need is cartesian_product.
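In plain-Python terms, the cartesian_product idea amounts to combining a choice of support with a product of non-zero values (an illustrative sketch, not Sage code; the name is made up):

```python
from itertools import combinations, product

def error_vectors_of_weight(n, w, nonzero_values):
    # Lazily yield all length-n vectors with exactly w non-zero entries:
    # pick the support (positions), then all assignments of non-zero values.
    for support in combinations(range(n), w):
        for values in product(nonzero_values, repeat=w):
            e = [0] * n
            for pos, val in zip(support, values):
                e[pos] = val
            yield tuple(e)
```

Over GF(3), with non-zero values {1, 2}, there are C(4, 2) * 2^2 = 24 such vectors of length 4 and weight 2.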

I think your decode_to_code *modifies* the input vector without copying it. This is very bad!

Best, Johan

comment:14 in reply to: ↑ 12 Changed 6 years ago by jsrn

Replying to dlucas:

Replying to jsrn:

Replying to jlavauzelle:

  • I also think that your description of the decoder ("correcting up to number_errors errors") could fool the user, as number_errors may be set up to n whereas the decoder will never decode an error of weight n.

Indeed, we seem to have an issue with what decoding_radius should mean. Should it here be min(number_errors, half-min-dist), or should it just be number_errors, with the understanding that it chooses the nearest neighbour?

Forget my previous answer, I vote for your solution : number errors returns _number_errors as it was set at construction time, and decoding_radius returns min(number_errors, half-min-dist)

Hmm, I'm not sure this is my solution ;-) There are decoding methods which decode up to, say, t > d/2 errors, but which might fail with low probability (interleaved codes, power decoding, among others). Should decoding_radius really return d/2 for all these? Perhaps it's the definition of decoding_radius which should change. In this case it's "the maximal number of errors a received word can have and for which the decoder will always return a most likely codeword".

Yes it is, for now it's:

LinearCodeSyndromeDecoder._decoder_type = {"hard-decision", "unique", "always-succeed", "complete"}

It's not shown in the diff because it was set like this before, and I did not change it.

OK, I missed it when browsing the local file then. "complete" should only be added if one sets number_errors at least the covering radius of the code.

comment:15 Changed 6 years ago by git

  • Commit changed from fc79446d6948143d32a9065410453379c79ef2b2 to b2b9e8270a95e7478760532827be04cecb906510

Branch pushed to git repo; I updated commit sha1. New commits:

f1b61af  Update to 7.1beta0
026aa4d  Changed methods and docstrings related to errors
0f2ed54  Rewrote _list_all_error_values
23c06f6  Rewrote _build_lookup_table
4e13857  input is no longer modified in place in decode_to_code
b2b9e82  Fixed broken doctests and did a few small tweaks

comment:16 follow-up: Changed 6 years ago by dlucas

Hello,

I changed the following in my code:

  • I renamed number_errors as maximum_error_weight. I changed accordingly the description given by representation methods _latex_ and _repr_. I also renamed its getter (from number_errors to maximum_error_weight), and changed its docstring.
  • covering_radius now returns min(maximum_error_weight, half-min-dist). I changed its docstring and added a note to explain it performs a minimum distance computation, which can be long...
  • I rewrote _list_all_error_values, which now takes an input parameter weight and builds only the exhaustive list of error values of weight weight. It now uses cartesian_product instead of Permutation.
  • I rewrote _build_lookup_table, which now uses a for loop. I removed the useless lexicographic order check. It still terminates early if no new pattern is added during a loop.
  • The input of decode_to_code is now copied before modification.
  • Finally, I fixed a few broken doctests, and removed the keyword complete in decoder_type as computing the covering radius can only be done by a call to a method forcing the use of optional library Guava. Ticket #19913 proposes a new covering_radius method which has a Python alternative to Guava. Once this ticket (or #19913) gets reviewed, I'll do the appropriate updates wrt. covering radius in the remaining one.

Sorry about this poorly written code, it should be better now.

David

comment:17 in reply to: ↑ 16 Changed 6 years ago by jsrn

Hi

  • covering_radius now returns min(maximum_error_weight, half-min-dist). I changed its docstring and added a note to explain it performs a minimum distance computation, which can be long...

You mean decoding_radius I'm sure :-) But you completely misunderstood what I wrote before, I'm afraid. By your updated docstring, then clearly decoding_radius should return maximum_error_weight.

I just had a pretty cool idea: When building the lookup-table you're actually *computing the minimum distance*! If t is the lowest weight at which you detect a collision, i.e. see a syndrome that has already occurred before, then the minimum distance is 2t or 2t-1 (you can determine which it was by inspecting the weight of the errors that collided). In any case, "half the minimum distance decoding" will be t errors.

If maximum_error_weight < t then _build_lookup_table will never see a collision, and so it can *know* that the minimum distance is greater. Hence, it can know it is a bounded distance decoder. If maximum_error_weight = t, then a collision will be seen at this weight, and so the decoder can know it is a minimum distance decoder. If maximum_error_weight > t, then it knows that it will sometimes fail, but it will mostly decode up to min(maximum_error_weight, covering_radius) (see below). There should be some appropriate keywords in type distinguishing these cases.
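The minimum-distance observation can be sketched in plain Python over GF(2) (illustrative only, not the Sage implementation; names are hypothetical): fill the table weight by weight, and at the first weight where collisions appear, the lightest sum of a colliding pair is a minimum-weight codeword, since two error patterns with equal syndromes differ by a codeword.

```python
from itertools import combinations

def min_distance_from_collisions(H, n):
    # H: parity-check matrix as a list of rows over GF(2).
    def syndrome(v):
        return tuple(sum(h[i] * v[i] for i in range(n)) % 2 for h in H)
    table = {}
    best = None
    for w in range(0, n + 1):
        for support in combinations(range(n), w):
            e = [0] * n
            for p in support:
                e[p] = 1
            s = syndrome(e)
            if s in table:
                # e + stored pattern is a non-zero codeword.
                cw = sum((a + b) % 2 for a, b in zip(e, table[s]))
                best = cw if best is None else min(best, cw)
            else:
                table[s] = tuple(e)
        if best is not None:
            return best  # scan the whole weight level before reporting
    return None
```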

Once the infrastructure we talked about previously is in place, the Code could be informed of this minimum distance.

In the same vein, you're *also* computing the covering radius. For if w+1 is the first weight at which *every* error pattern of that weight collides with a lower-weight error, then w is the covering radius.

Now, if maximum_error_weight < w, then _build_lookup_table will never see this event, and so it clearly doesn't determine the covering radius. In that case, maximum_error_weight is truly the decoding radius (except that if maximum_error_weight >= d/2 then it might sometimes return another minimum-weight error, and not the one that was added).

If maximum_error_weight > w, then we *know* that incorrect decoding will be performed for *all* error patterns of weight > w. So should maximum_error_weight(), the string representations, and everything else really report the user-supplied maximum_error_weight and not w? It's not terribly important to me, except that it should be possible for the user to retrieve w (possibly by informing Code of this covering radius, so the user can then call Code.covering_radius). I would, though, posit that in case maximum_error_weight is used instead of w, then maximum_error_weight should be capped at n-k at construction time. Higher than that clearly makes no sense.

  • I rewrote _list_all_error_values, which now takes an input parameter weight and builds only the exhaustive list of error values of weight weight. It now uses cartesian_product instead of Permutation.

There's no need to return a list. Just return the cartesian_product object. And in the inner for-loop of _build_lookup_table, instead of looping over j, do a for error in error_values_of_weight[t] or something.

In Python, a list is concretely instantiated and takes space in memory. A generator expression takes next to no memory. When you can, use a generator expression.

We're down to aesthetics now, but wouldn't it be nicer to inline _list_all_error_values so you don't need to construct l time and again? You could e.g. build the list of iterators up front:

   error_values_of_weight = [ cartesian_product([l]*weight) for weight in range(1, maximum_error_weight + 1) ]
  • Finally, I fixed a few broken doctests, and removed the keyword complete in decoder_type as computing the covering radius can only be done by a call to a method forcing the use of optional library Guava. Ticket #19913 proposes a new covering_radius method which has a Python alternative to Guava. Once this ticket (or #19913) gets reviewed, I'll do the appropriate updates wrt. covering radius in the remaining one.

By my arguments above, you can add complete whenever the covering radius was determined during _build_lookup_table.

Also, I just noticed that __eq__ is wrong: it doesn't compare maximum_error_weight.

Best, Johan

comment:18 Changed 6 years ago by dlucas

Thank you for your comments.

But now another problem arises, namely how to design a generic mechanism to inform codes about some of their "properties".

Consider you have a code and you perform syndrome decoding.

If you already computed its minimum distance, you can fill decoder_type at construction time. If you did not, you will fill it while building the table of syndromes and you have to inform the code that it now knows its minimum distance. This can be done for instances of LinearCode as they have a class argument minimum_distance but how to do that properly for other codes (e.g. GeneralizedReedSolomonCode) that do not possess this class parameter?

The same issue arises with covering radius, but for any code this time, as there is no class with a _covering_radius parameter.

I might again be completely missing the point here, but it does not appear to me to be a simple mechanism to implement, especially if we want something generic. I'm working on it, so for now I did not change maximum_error_weight nor decoding_radius.

All the other requested changes have been made. Leaving this in needs_work for now.

Last edited 6 years ago by dlucas (previous) (diff)

comment:19 Changed 6 years ago by git

  • Commit changed from b2b9e8270a95e7478760532827be04cecb906510 to 54c02c3af22e896dd03759408b989f493415f26a

Branch pushed to git repo; I updated commit sha1. New commits:

d892ad4  Fixed __eq__
54c02c3  Killed _list_all_error_values and replaced it by an inline expression

comment:20 Changed 6 years ago by jsrn

Hi David,

I agree, David, that the "informing the code"-feature is advanced and shouldn't be done in this ticket. Sorry for not making that clear.

So, firstly, I want to say that my comments actually make sense even as a completely internal mechanism of SyndromeDecoder. That is, harvesting the true minimum distance and covering radius while the lookup table is being built enables the decoder to give useful feedback to the user, such as its decoding radius and a more precise decoder type.

Secondly, I hope you realise that this is "just" yet another example of the usefulness of a general mechanism for specifying upper and lower bounds on the minimum distance of a code, and it shows, as could have been expected, that minimum distance is *not* special in this regard: there are other parameters for which we might want such a mechanism (in this case, covering radius).

To be more precise, say we design a general system for informing a LinearCode that new knowledge has been obtained on a property it has. It could be something like (and now I'm totally making this up on the spot):

   LinearCode._update_property_bound(property_name, update_type, value)

With such a mechanism, SyndromeDecoder would call

   self._code._update_property_bound("minimum_distance", "upper", t)
   self._code._update_property_bound("minimum_distance", "lower", t)

or simply

   self._code._update_property_bound("minimum_distance", "exact", t)

Then the function minimum_distance should be changed accordingly, as should the various functions for computing or bounding the minimum distance.
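Purely for illustration (these names are invented in the comment above and are not real Sage API), such a bookkeeping method might look like:

```python
class PropertyBounds:
    # Illustrative holder for lower/upper bounds on code parameters
    # (minimum distance, covering radius, dimension, ...).
    def __init__(self):
        self._bounds = {}  # name -> (lower, upper)

    def _update_property_bound(self, name, update_type, value):
        lo, hi = self._bounds.get(name, (0, float("inf")))
        if update_type == "lower":
            lo = max(lo, value)      # only ever tighten the bound
        elif update_type == "upper":
            hi = min(hi, value)
        elif update_type == "exact":
            lo = hi = value
        else:
            raise ValueError("unknown update type: %s" % update_type)
        self._bounds[name] = (lo, hi)

    def bounds(self, name):
        return self._bounds.get(name, (0, float("inf")))
```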

As we have discussed before, we should be very careful not to completely over-engineer this. I think it might be useful to start compiling a list of examples where such bound computation is possible and could be useful.

comment:21 follow-up: Changed 6 years ago by dlucas

So, firstly, I want to say that my comments actually make sense even as a completely internal mechanism of SyndromeDecoder. That is, harvesting the true minimum distance and covering radius while the lookup table is being built enables the decoder to give useful feedback to the user, such as its decoding radius and a more precise decoder type.

Of course they do, I'm perfectly aware of that! I was not criticising the usefulness of it, I was just saying it seems quite difficult and a bit out of scope for this ticket.

Secondly, I hope you realise that this is "just" yet another example of the usefulness of a general mechanism for specifying upper and lower bounds on the minimum distance of a code, and it shows, as could have been expected, that minimum distance is *not* special in this regard: there are other parameters for which we might want such a mechanism (in this case, covering radius).

Definitely. dimension should be added to this list of lower- and upper-bounds... I realized that while working on shortened codes (and subfield subcodes as well).

For the mechanism, why not work on a set of decorators? Or maybe a generic decorator with arguments, like bound, which can be set to lower, upper or exact, and an argument parameter, which can be set to minimum_distance, dimension or covering_radius? Once properly implemented, it should be easy to use, and very flexible (adding new parameters would be very simple).

Note it's just an idea in the wind, I'm definitely no expert in decorators, I just read the documentation in the bus 20 min ago ;) But it seems feasible. Anyway, I opened a follow-up ticket #19959.

Version 0, edited 6 years ago by dlucas (next)

comment:22 in reply to: ↑ 21 Changed 6 years ago by jsrn

Of course they do, I'm perfectly aware of that! I was not criticising the usefulness of it, I was just saying it seems quite difficult and a bit out of scope for this ticket.

You were saying a lot about how the code-parameter mechanism had issues and was difficult, and possibly implied that it was out of place. You didn't comment on the fact that my suggestions still made sense for SyndromeDecoder in complete isolation.

But I do realise that this ended up being a larger kettle of fish than we originally intended. In case you're against implementing the decoding radius and covering radius thing *internally in SyndromeDecoder* for now, I would say the only sensible semantics/documentation are:

  • decoding_radius returns maximum_error_weight. Its main docstring remains as it is, while the Note is, of course, removed. Possibly add a Note saying something like "When correcting beyond half the minimum distance, correct unique decoding is not always possible. This decoder returns a codeword of minimum distance to the received word. In particular, if maximum_error_weight is beyond the covering radius of the code, decoding of this many errors is guaranteed to be incorrect.".
  • maximum_error_weight is capped at n-k in the constructor.

On an unrelated note, I also noticed: the docstring of SyndromeDecoder should explain in 2-3 lines what the decoder does. There's an "a" missing before "syndrome" in the first line.

Definitely. dimension should be added to this list of lower- and upper-bounds...

Yes, I remember now.

For the mechanism, why not work on a set of decorators? Or maybe a generic decorator with arguments, like bound, which can be set to lower, upper or exact, and an argument parameter, which can be set to minimum_distance, dimension or covering_radius? Once properly implemented, it should be easy to use, and very flexible (adding new parameters would be very simple).

Note it's just an idea in the wind, I'm definitely no expert in decorators, I just read the documentation in the bus 20 min ago ;) But it seems feasible. Anyway, I opened a follow-up ticket #19959.

Hmm, yes it's an interesting idea. I could imagine the decorator being backed by some functions that should be callable directly. What we discussed on this ticket would likely use those functions and not a decorator (since SyndromeDecoder is another object than the code itself, and since _build_lookup_table has the potential to discover 0, 1 or 2 parameters of the Code).

Best, Johan

comment:23 Changed 6 years ago by dlucas

Sorry if I wrote my answer in a confusing manner. I'm not against implementing the mechanism to discover the minimum distance and the covering radius and dynamically update the decoder type; I was against implementing the generic mechanism here. Your suggestions definitely make sense here, I just jumped directly into the bigger picture instead of focusing on this instance of the problem. My bad! I think we actually agree on this. I'll do that tomorrow.

Hmm, yes it's an interesting idea. I could imagine the decorator being backed by some functions that should be callable directly.

That's what I have in mind. I read the documentation more carefully; it seems interesting to do, and definitely possible. The reason I thought about decorators in the first place is that we, as usual, want a mechanism with the best genericity/simplicity ratio. And with a decorator, if one wants to implement one's own methods, one only has to add a decorator over some specific methods, which is definitely simple! Well, if you (or Julien, or anyone) can see a better mechanism, please tell me!

David

comment:24 Changed 6 years ago by dlucas

Another question:

For now, _build_lookup_table is called on the first decoding attempt, and not at construction time. I made this choice because, as a user, I prefer objects I use to be constructed quickly, and then to spend some time when calling methods on these objects, rather than waiting at construction time. It's a matter of taste, I guess. Also, I might be working in a very strange way in which I need to construct decoders without using their decoding methods. From this point of view, the faster the better.

Anyway, here we have the following behaviour:

  • I build my decoder.
  • If I call maximum_error_weight(), decoding_radius() or the representation methods directly, they return results based on either the value I chose for maximum_error_weight or n-k, whichever is the lowest.
  • Then I perform some decoding, call the methods listed above again, and their output might have changed... The same occurs with decoder_type().

From the user's point of view, this can be quite confusing (imho). So, wouldn't it be better to build the lookup table at construction time, so that the "internal" minimum_distance and covering_radius can be found (if possible) and the results updated accordingly?

In that configuration, whenever I call representations methods, decoder_type and so on, their output will remain the same.

What do you think?

comment:25 Changed 6 years ago by git

  • Commit changed from 54c02c3af22e896dd03759408b989f493415f26a to 4d70e2c87e7c2e50171ee24ca7c6200ff29d0250

Branch pushed to git repo; I updated commit sha1. New commits:

4d70e2c  Implemented internal discovery of code parameters and auto-update of decoder types

comment:26 follow-up: Changed 6 years ago by dlucas

  • Milestone changed from sage-7.0 to sage-7.1
  • Status changed from needs_work to needs_review

Hello,

I did the required changes:

  • covering_radius and minimum_distance for the code are now internally saved whenever found,
  • decoder_type is updated accordingly, decoding_radius and maximum_error_weight likewise.
  • The lookup table is now computed at construction time to solve the problem presented in comment 24.
  • I also added "dynamic" to the list of types for this decoder.

I made some naming choices for the new decoder types here, which I'd like you to discuss, especially "might_create_decoding_error", which is awful. I did not have a better idea, sorry...

I commented my code as well as possible, because the chain of if statements at the end of _build_lookup_table is pretty ugly. I chose not to write getter methods for the internal minimum_distance and covering_radius, because depending on how far the computation went, these fields may or may not be filled. I preferred these to be properly filled when #19959 is ready instead of implementing something clumsy. But I think this is somewhat bizarre, as we might get useful information and not pass it to the user. This choice is of course open to discussion.

By the way, this question is more or less related, does anyone remember what a complete decoder is? I cannot remember this... Which once again proves I need to write a proper list of these types in the introductory thematic tutorial.

Best,

David

comment:27 in reply to: ↑ 26 Changed 6 years ago by jsrn

Hi,

Sounds cool! I'll take a closer look later on.

I made some naming choices for the new decoder types here, which I like you to discuss, especially "might_create_decoding_error" which is awful. I did not have a better idea, sorry...

By the way, this question is more or less related, does anyone remember what a complete decoder is? I cannot remember this... Which once again proves I need to write a proper list of these types in the introductory thematic tutorial.

Yeah, might_create_decoding_error is pretty bad :-) complete refers to the fact that the decoder decodes every word in the ambient space. It's more or less a synonym for one of the meanings of "maximum-likelihood", I guess, whenever the decoder is also unique. I.e. the syndrome decoder is complete when the decoding radius equals the covering radius.
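For a tiny code this can be checked by brute force. A sketch (assuming the code is given as an explicit list of codewords over GF(2)) of the covering radius, i.e. the largest distance from any word of the ambient space to the code:

```python
from itertools import product

def hamming(u, v):
    # Hamming distance between two equal-length tuples.
    return sum(a != b for a, b in zip(u, v))

def covering_radius(codewords, n):
    # Maximum, over all words of GF(2)^n, of the distance to the
    # nearest codeword. A decoder whose decoding radius reaches this
    # value decodes every vector of the ambient space ("complete").
    return max(
        min(hamming(w, c) for c in codewords)
        for w in product([0, 1], repeat=n)
    )

# Binary repetition code of length 3: {000, 111}
C = [(0, 0, 0), (1, 1, 1)]
print(covering_radius(C, 3))  # 1: every length-3 word is within distance 1
```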

There is a type-discussion on this Issue on our bitbucket repo: https://bitbucket.org/lucasdavid/sage_coding_project/issues/40/decoder-introspection

The decoder you're doing now is similar to Power decoding of RS codes (which I gave types hard-decision, unique, might-fail): both decoders give a unique answer and return a closest codeword. However, Power decoding "might fail" in the sense that it doesn't give *any answer at all*, while the syndrome decoder "might fail" in the sense that it gives you *another* close codeword than what you sent. As opposed to this, Guruswami-Sudan "always succeeds" in the sense that *your* codeword is guaranteed to be on the list of outputs (up to the decoding radius).

We therefore seem to have an issue with what we mean by "might-fail" in case of decoding beyond d/2. Perhaps "might-error" is a good choice, to mirror the "decoding failure/decoding error" terminology of coding theory.

Yes, a list should be made of the "common" types (all we have used so far) of decoders, and an explanation. In the doc-string of decoder_type, there should be a clear explanation on how to look up what they mean. (another ticket of course). Perhaps even in the doc-string for decoder_type, if the explanations can be made short enough, since the function is always the original Decoder.decoder_type.

Best, Johan

comment:28 follow-up: Changed 6 years ago by dlucas

Okay, thanks!

Yes, a list should be made of the "common" types (all we have used so far) of decoders, and an explanation. In the doc-string of decoder_type, there should be a clear explanation on how to look up what they mean. (another ticket of course). Perhaps even in the doc-string for decoder_type, if the explanations can be made short enough, since the function is always the original Decoder.decoder_type.

Mmhh, I'm not sure i'll be able to keep it short, especially as it might grow when we will implement new decoders. I'll give it a try, and if it's too long, I will add an extra paragraph dedicated to decoders and types in #19897.

comment:29 in reply to: ↑ 28 Changed 6 years ago by jsrn

Mmhh, I'm not sure i'll be able to keep it short, especially as it might grow when we will implement new decoders. I'll give it a try, and if it's too long, I will add an extra paragraph dedicated to decoders and types in #19897.

Sure, the *list* will not be short. But each individual description might be. Something like: complete: decodes any vector in the ambient space. dynamic: the decoder's type will depend on its input parameters. unique: returns a single, closest codeword. Nicely formatted in a table :-P

comment:30 follow-up: Changed 6 years ago by dlucas

Well, I just updated #19897 (thematic tutorial) to put the list of types in there and saw your answer just after that...

If you think I should move it to decoder_type, I can!

David

comment:31 in reply to: ↑ 30 Changed 6 years ago by jsrn

Replying to dlucas:

Well, I just updated #19897 (thematic tutorial) to put the list of types in there and saw your answer just after that...

If you think I should move it to decoder_type, I can!

David

Yeah, I think that's better. Don't you? It's reference information, not tutorial information.

comment:32 Changed 6 years ago by git

  • Commit changed from 4d70e2c87e7c2e50171ee24ca7c6200ff29d0250 to 590d18ccf6fdecd4d86aeb7e7d95c9d7a2502b4f

Branch pushed to git repo; I updated commit sha1. New commits:

590d18cChanged decoder types

comment:33 Changed 6 years ago by dlucas

It's changed in #19897

I changed the terrible type name to might-error as suggested. I also removed always-succeed from the default set of types, since when might-error applies the decoder cannot always succeed... It's now dynamically added in the relevant cases. And I replaced all underscores with hyphens in type names for consistency.

David


New commits:

590d18cChanged decoder types
Last edited 6 years ago by dlucas (previous) (diff)

comment:34 Changed 6 years ago by jsrn

  • Branch changed from u/dlucas/generic_decoders to u/jsrn/generic_decoders

comment:35 Changed 6 years ago by jsrn

  • Commit changed from 590d18ccf6fdecd4d86aeb7e7d95c9d7a2502b4f to 8a52ca1f60a52176fd588f76e4658ad549c7f249
  • Status changed from needs_review to needs_work

OK, so I looked through the new version of the ticket, and I took the liberty of patching various things directly.

  • I think the semantics where decoding_radius returns d/2 when you decode way beyond d/2 make no sense. I've changed decoding_radius to have the same semantics as maximum_error_weight, and I've removed self._decoding_radius.
  • I changed a number of docs and elaborated on the description of the decoder. I added a note that it's exponential time. I didn't check typesetting of the compiled docs, sorry.
  • I renamed self.lookup_table to self.syndrome_table. Hope you agree with me.
  • I fixed how the types are dynamically assigned. It was done weirdly before.
  • I fixed several bugs in your by-now quite overworked _build_lookup_table. There were bugs related to both the minimum distance and the covering radius, leading to catastrophic behaviour in the final decoder. Firstly, since you hadn't added the zero syndrome to your table, both early termination and determination of the minimum distance were buggy. I added this to the table. Secondly, your early termination didn't actually break out of the loop. Consequently, you never terminated early and you usually did not correctly determine the covering radius. This was blatantly obvious in the doctest you used everywhere, since the covering radius for that code is actually 1. This leaves all the doctests currently failing. I didn't fix them properly since I realised that a better example is probably called for, and I didn't have time to do that myself now.
  • Probably the above calls for some doctests that actually tests the minimum-distance determining and covering-radius determining behaviour. I leave it to you to be the judge of that.
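For reference, a toy GF(2) sketch (not the Sage code) of a lookup-table build that incorporates both fixes: the zero syndrome is present from the start, and the early-termination test actually breaks out of the loop once all q^(n-k) syndromes have coset leaders.

```python
from itertools import combinations

def syndrome(H, e):
    # H: list of rows of the parity-check matrix; arithmetic mod 2.
    return tuple(sum(h * x for h, x in zip(row, e)) % 2 for row in H)

def build_lookup_table(H, n):
    r = len(H)                                # r = n - k
    # Fix 1: the zero syndrome maps to the zero error from the start.
    table = {tuple([0] * r): tuple([0] * n)}
    for weight in range(1, n + 1):
        for support in combinations(range(n), weight):
            e = [0] * n
            for i in support:
                e[i] = 1
            # Keep the first (hence lowest-weight) coset leader found.
            table.setdefault(syndrome(H, e), tuple(e))
        # Fix 2: early termination that genuinely breaks the loop once
        # every syndrome of GF(2)^r has a coset leader.
        if len(table) == 2 ** r:
            break
    return table

# [3, 1] binary repetition code, parity-check matrix H.
H = [[1, 1, 0], [1, 0, 1]]
table = build_lookup_table(H, 3)
print(len(table))  # 4 == 2^(n-k): the table covers the whole syndrome space
```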

Please try to test your code and sanity-check the output before you put it up for review. This is getting bothersome...


New commits:

510cf23Merge branch 'u/dlucas/generic_decoders' of git://trac.sagemath.org/sage into 19623_syndrome_decoder
f2b8714Rm if that will never happen
3158728Modifications to doc strings
7e8473cFix decoder type handling and corner case wrt. all-zero-syndrome
88f1824Fix the decoding radius in examples
8a52ca1Actually early terminate in the early termination

comment:36 Changed 6 years ago by dlucas

  • Branch changed from u/jsrn/generic_decoders to u/dlucas/generic_decoders

comment:37 Changed 6 years ago by dlucas

  • Commit changed from 8a52ca1f60a52176fd588f76e4658ad549c7f249 to 60ee58d759de7d680415e896e54088ce3b4baf6c

I changed the doctests to something better. I also added a whole paragraph to explain how the dynamic types work, illustrating the creation of three syndrome decoders over the same code: one with maximum_error_weight less than both the covering radius and half the minimum distance, one with maximum_error_weight between the two values, and one with maximum_error_weight bigger than both.
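The dynamic typing can be illustrated with a small sketch (the cut-offs mirror the discussion above; the exact rules in the branch may differ): given the requested maximum error weight t, the minimum distance d and the covering radius rho, derive the decoder's type set.

```python
def decoder_types(t, d, rho):
    # Base types every syndrome decoder gets, plus "dynamic" since the
    # final set depends on the input parameters.
    types = {"hard-decision", "unique", "dynamic"}
    if t <= (d - 1) // 2:
        types.add("minimum-distance")  # within the unique-decoding radius
        types.add("always-succeed")    # the sent codeword is returned
    else:
        types.add("might-error")       # may return a different codeword
    if t >= rho:
        types.add("complete")          # decodes the whole ambient space
    return types

# Three decoders over one code with d = 3 and covering radius rho = 2:
print(sorted(decoder_types(1, 3, 2)))  # within d/2, not complete
print(sorted(decoder_types(2, 3, 2)))  # beyond d/2 and reaching rho
```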

Finally, I also changed union to add when adding a type to _decoder_type: union returns a new set instead of modifying the set in place, so it appeared to do nothing.
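This is standard Python set behaviour: set.union returns a new set and leaves the receiver untouched, whereas set.add (or |=) mutates in place.

```python
types = {"hard-decision", "unique"}

# set.union returns a NEW set; the original is left unchanged,
# which is why the type never appeared to be added.
types.union({"complete"})
print("complete" in types)   # False

# set.add mutates in place, which is what was wanted here.
types.add("complete")
print("complete" in types)   # True
```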


New commits:

60ee58dBetter doctests, changed union to add when adding types to _decoder_type

comment:38 Changed 6 years ago by jsrn

  • Branch changed from u/dlucas/generic_decoders to u/jsrn/generic_decoders

comment:39 Changed 6 years ago by git

  • Commit changed from 60ee58d759de7d680415e896e54088ce3b4baf6c to 1e738f3bb1fce15e840015d4f1909cef41909e8a

Branch pushed to git repo; I updated commit sha1. New commits:

1e738f3Some docstring typography

comment:40 Changed 6 years ago by jsrn

  • Reviewers changed from Julien Lavauzelle to Julien Lavauzelle, Johan Sebastian Rosenkilde Nielsen
  • Status changed from needs_work to needs_review

OK, looks good. I fixed some phrasings and some typos in the docstring. If you can accept them, set the ticket to positive review :-)

comment:41 Changed 6 years ago by dlucas

  • Status changed from needs_review to positive_review

I accept your changes, positive reviewed!

David

comment:42 Changed 6 years ago by jlavauzelle

Before you close the ticket (it is not long enough ;-)), just a minor remark: your documentation of decode_to_code talks about decoding into the "message space" instead of the code.

Otherwise, really good job :)

comment:43 Changed 6 years ago by dlucas

  • Branch changed from u/jsrn/generic_decoders to u/dlucas/generic_decoders

comment:44 Changed 6 years ago by dlucas

  • Commit changed from 1e738f3bb1fce15e840015d4f1909cef41909e8a to 642dca41fd2aa2621d0756e17282de62fcfb0620

Dammit, you're right, I thought I fixed this long ago... Well, nice catch, it's fixed now ;)


New commits:

642dca4Fixed typo in decode_to_code

comment:45 Changed 6 years ago by vbraun

  • Status changed from positive_review to needs_work

Something is broken in combination with #19897

sage -t --long src/doc/en/thematic_tutorials/coding_theory.rst
**********************************************************************
File "src/doc/en/thematic_tutorials/coding_theory.rst", line 253, in doc.en.thematic_tutorials.coding_theory
Failed example:
    c_dec = C.decode_to_code(r)
Exception raised:
    Traceback (most recent call last):
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest doc.en.thematic_tutorials.coding_theory[29]>", line 1, in <module>
        c_dec = C.decode_to_code(r)
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/coding/linear_code.py", line 1656, in decode_to_code
        D = self.decoder(decoder_name, **kwargs)
      File "sage/misc/cachefunc.pyx", line 1909, in sage.misc.cachefunc.CachedMethodCaller.__call__ (/home/buildslave-sage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:9941)
        w = self._instance_call(*args, **kwds)
      File "sage/misc/cachefunc.pyx", line 1785, in sage.misc.cachefunc.CachedMethodCaller._instance_call (/home/buildslave-sage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:9408)
        return self.f(self._instance, *args, **kwds)
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/coding/linear_code.py", line 1739, in decoder
        D = decClass(self, **kwargs)
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/coding/linear_code.py", line 4173, in __init__
        self._lookup_table = self._build_lookup_table()
      File "sage/misc/cachefunc.pyx", line 2235, in sage.misc.cachefunc.CachedMethodCallerNoArgs.__call__ (/home/buildslave-sage/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:12406)
        self.cache = f(self._instance)
      File "/home/buildslave-sage/slave/sage_git/build/local/lib/python2.7/site-packages/sage/coding/linear_code.py", line 4298, in _build_lookup_table
        lookup[s] = copy(e)
    MemoryError
**********************************************************************

comment:46 Changed 6 years ago by git

  • Commit changed from 642dca41fd2aa2621d0756e17282de62fcfb0620 to 2bef528d821af2b822fbca47d4f4928f2b0131ad

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

cb65314Update to latest beta
2057feeSmall fixes and improvements
d9bcb28Update to 7.1beta0
70e5211Added ..linkall keyword and wrote a new paragraph on decoders and introspections
4035292Moved and fixed list of decoder types
b443718Update to 7.1beta1
61d49ccRewrote some sentences and fixed typos
12bf512Removed table of types
564dcddMerge branch 'introductory_thematic_tutorial' into generic_decoders
2bef528Picked a smaller code in thematic tutorial to fix examples

comment:47 Changed 6 years ago by dlucas

  • Status changed from needs_work to needs_review

I identified the problem: it was caused by a [12, 4] GRS code over GF(13) in my examples. Decoding over this code with the new syndrome decoder required building a huge syndrome lookup table, hence the memory error.

I replaced it with a [6, 3] code over GF(7), so it should be fine now.
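The arithmetic behind the fix: the syndrome lookup table has one entry per syndrome, i.e. q^(n-k) of them.

```python
def table_size(q, n, k):
    # Number of syndromes, hence lookup-table entries, for a [n, k]
    # linear code over GF(q).
    return q ** (n - k)

print(table_size(13, 12, 4))  # 815730721 entries for the [12, 4] code over GF(13)
print(table_size(7, 6, 3))    # 343 entries for the [6, 3] code over GF(7)
```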

Reopening for review...

David

comment:48 Changed 6 years ago by jsrn

  • Status changed from needs_review to positive_review

Hah, that's pretty funny: before, the syndrome decoder was actually a nearest neighbor decoder, and therefore ran in time q^k. Now, it's really a syndrome decoder and runs in time q^(n-k), which is too big for the previous example in the tutorial.

Perhaps a linear code should default to using syndrome or nearest neighbor decoding depending on which of q^(n-k) or q^k is smaller?
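That suggestion could look like the following sketch (hypothetical dispatch, not Sage's actual selection logic): enumerate the q^(n-k) syndromes or the q^k codewords, whichever set is smaller.

```python
def default_decoder(q, n, k):
    # Syndrome decoding enumerates ~ q^(n-k) syndromes; nearest-neighbor
    # decoding enumerates ~ q^k codewords. Pick the cheaper of the two.
    return "Syndrome" if q ** (n - k) <= q ** k else "NearestNeighbor"

print(default_decoder(13, 12, 4))  # n-k > k: enumerate the q^k codewords
print(default_decoder(7, 6, 3))    # n-k == k: a tie, either works
```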

For this ticket: green again.

comment:49 Changed 6 years ago by vbraun

  • Branch changed from u/dlucas/generic_decoders to 2bef528d821af2b822fbca47d4f4928f2b0131ad
  • Resolution set to fixed
  • Status changed from positive_review to closed