Opened 3 years ago

Closed 21 months ago

#20138 closed enhancement (fixed)

Information set decoding for linear codes

Reported by: dlucas Owned by:
Priority: major Milestone: sage-8.0
Component: coding theory Keywords:
Cc: jlavauzelle Merged in:
Authors: David Lucas, Johan Rosenkilde, Yann Laigle-Chapuy Reviewers: Yann Laigle-Chapuy
Report Upstream: N/A Work issues:
Branch: eb2efb3 (Commits) Commit: eb2efb3d03f135e4f163bf3b2554c2a8d47054ff
Dependencies: #19653 Stopgaps:

Description

This ticket proposes an implementation of Lee-Brickell algorithm refinement for the information set decoding.

It depends on #19653 for the reimplementation of C.random_element.

Change History (74)

comment:1 Changed 3 years ago by dlucas

  • Branch set to u/dlucas/information_set_decoding

comment:2 Changed 3 years ago by dlucas

  • Commit set to 645e329a716a57f16e64c972f46779e4e0c1905f

I pushed the first version of the code for this ticket.


New commits:

884d8abNew decoder, InformationSetDecoder
351b30cSet this new decoder in other code families and in the catalog
645e329Fixed broken doctests

comment:3 Changed 3 years ago by git

  • Commit changed from 645e329a716a57f16e64c972f46779e4e0c1905f to 0e1391d358018382db6cc0a59014ef934239987e

Branch pushed to git repo; I updated commit sha1. New commits:

6292df9window_size and number_errors are now class arguments
e72a5abRewrote _repr_ and _latex_ methods, fixed a typo
f2c074cdecode_to_code now iterates over the whole range of provided errors
0e1391dAdded *args to Decoder/Encoder-related methods in AbstractLinearCode

comment:4 Changed 3 years ago by dlucas

  • Authors set to David Lucas
  • Status changed from new to needs_review

I made some changes:

  • window_size and number_errors (previously p and w) are no longer parameters of decode_to_code but class arguments to provide at construction time. number_errors can be provided as a tuple/list, and in that case, the decoder will iterate over it when trying to decode a word.
  • I completely reworked the documentation, explained how the algorithm works and added a lot of doctests.
  • I completed input sanitization with some extra checks.
  • I fixed a stupid bug: only **kwargs was considered on Decoder/Encoder?-related methods from AbstractLinearCode, like decoder(). So, C.decoder('InformationSet', 2, 2) was not working...

This is now open for review.

comment:5 Changed 3 years ago by git

  • Commit changed from 0e1391d358018382db6cc0a59014ef934239987e to 81927695b58d5e4543d4a59b9d1b2c4d730ddd35

Branch pushed to git repo; I updated commit sha1. New commits:

8192769Updated to 7.1beta6 and fixed conflicts

comment:6 Changed 3 years ago by git

  • Commit changed from 81927695b58d5e4543d4a59b9d1b2c4d730ddd35 to 924be79fc189c2a9c563b9bfbe240ad854c16101

Branch pushed to git repo; I updated commit sha1. New commits:

465cd56Merge branch 'develop' into information_set_decoding
59de997Added information set decoder to Hamming codes
924be79Minor changes to the information set decoder

comment:7 Changed 3 years ago by dlucas

  • Milestone changed from sage-7.1 to sage-7.2

I updated this ticket to latest beta, fixed broken doctests, and did minor changes to the decoder (I refined its types and rewrote a line in the documentation). It's still open for review.

comment:8 Changed 3 years ago by git

  • Commit changed from 924be79fc189c2a9c563b9bfbe240ad854c16101 to 93066b139f579636a913f934e51046e7753cba8a

Branch pushed to git repo; I updated commit sha1. New commits:

62468b9Updated to latest beta and fixed conflict
93066b1Fixed silly copy-paste mistake in ISD's decoder_type

comment:9 Changed 3 years ago by dlucas

  • Milestone changed from sage-7.2 to sage-7.3

I updated this ticket to 7.3b2, and fixed a bug. Still open for review.

Best,

David

comment:10 Changed 3 years ago by ylchapuy

Here are my two cents:

  • you should use itertools instead of Subsets
  • you shouldn't enumerate the information_set but take them at random (otherwise you might get stuck a long time with bad choices)
  • why use while True loops instead of just for?

all in all, here is how I would do it (notations from the article you cite):

import itertools
def LeeBrickell(r, C, w, p):
    G = C.generator_matrix()
    n = C.length()
    k = C.dimension()
    Fstar = list(C.base_ring())[1:]
    while True:
        # step 1.
        I = sample(range(n), k)
        Gi = G.matrix_from_columns(I)
        try:
            Gi_inv = Gi.inverse()
        except ZeroDivisionError:
            continue
        G = Gi_inv * G
        #step 2.
        y = r - vector([r[i] for i in I]) * G
        g = G.rows()
        #step 3.
        for A in itertools.combinations(range(k), p):
            for m in itertools.product(Fstar, repeat=p):
                e = y - sum(m[i]*g[A[i]] for i in range(p))
                if e.hamming_weight() == w:
                    return r - e

comment:11 Changed 3 years ago by ylchapuy

  • Reviewers set to Yann Laigle-Chapuy
  • Status changed from needs_review to needs_work

You should also check if the resulting word is in the code (it might not be the case if you try to decode beyond the minimal distance).

comment:12 Changed 3 years ago by ylchapuy

In case you want to optimize this a bit, you can replace the itertools enumerations with sage.combinat.gray_codes stuff (and if the field is GF(2) get rid of the itertools.product).

This is probably better kept for another ticket.

comment:13 Changed 23 months ago by jsrn

  • Branch changed from u/dlucas/information_set_decoding to u/jsrn/information_set_decoding

comment:14 follow-up: Changed 23 months ago by jsrn

  • Authors changed from David Lucas to David Lucas, Johan Rosenkilde
  • Cc jlavauzelle added
  • Commit changed from 93066b139f579636a913f934e51046e7753cba8a to 25b2b9329104db756a02ae5ab52685e4ac853048

I've picked up this patch again: I've rebased it and polished it off but so far haven't changed the algorithm. This is not up for review yet!

Thanks, Yann, for your excellent suggestions which is the next thing I'll look into. I'm sorry you didn't get any response for so long.

I've changed the following things so far:

  • Improved some of the documentation
  • Change the doc-tests to use a specific code (Golay) instead of random ones.
  • Changed the calling convention so window_size is optional (I have yet to compute a sensible default value, though).
  • Improved the error message from AbstractLinearCode.encode/decode when not satisfying the decoder's constructor. This came up since the information set decoder still has 1 mandatory argument.
  • Fixed the new decoder's types (according to the now better documented types #20001) and add some appropriate new ones.
  • Fixed various errors in sage/coding that came up: some bad printing in Golay codes, removed old deprecated method LinearCode.decode.

My next step is to look at Yann's suggestion for improving the algorithm and it's efficiency.


New commits:

3b41dd5Rebase on 8.0.beta8. Roll back mods outside linear_code.py and thematic tutorial
0f8bdb3Went over documentation and simplified calling convention
57cf6b6Fixed Golay code Finite Field printing
cf556abFix doctests
d021e1bRemove `AbstractLinearCode.decode` which has been deprecated 18 months
43c229cHelpful error messages for encoder/decoder construction with args
25b2b93decoder types: for new decoder, remove "unique", and fix some descriptions

comment:15 follow-up: Changed 23 months ago by ylchapuy

No problem for the delay, I'm far from following this closely :)

A side remark though: _registered_decoders is a dict and decoders_available returns its keys, so the ordering is not reliable for doctests. I would advocate always returning a sorted list as it's not a time critical method.

comment:16 in reply to: ↑ 15 Changed 23 months ago by jsrn

Replying to ylchapuy:

No problem for the delay, I'm far from following this closely :)

A side remark though: _registered_decoders is a dict and decoders_available returns its keys, so the ordering is not reliable for doctests. I would advocate always returning a sorted list as it's not a time critical method.

Good point - I'll fix this while we're at it.

comment:17 Changed 23 months ago by git

  • Commit changed from 25b2b9329104db756a02ae5ab52685e4ac853048 to 7551c478b8d37882dd2a12e93dc2bc55d5ff89c5

Branch pushed to git repo; I updated commit sha1. New commits:

7551c47Sort all doc-tests for decoders/encoders_available

comment:18 follow-up: Changed 23 months ago by jlavauzelle

Hi Johan,

I had a quick look at this ticket. For the moment, I didn't review the algorithm but I have a few remarks on the rest of the code.

While running doctests, I obtained 3 errors:

  • in golay_code.py, line 169, a bracket misses in the doctest
  • in decoder.py, lines 105 and 113. That's because the type unique has been removed from the decoders' types of LinearCodeSyndromeDecoder and LinearCodeNearestNeighborDecoder (see in linear_code.py) but not from the doctest.

In fact, you seem to have removed the type unique from every decoder_type; is this because you consider that a decoder is a unique decoder as long as it isn't claimed not to be like that?

Besides, I have a more stuctural remak. Starting from Prange algorithm, there are many variants of the information set decoding algorithm (Lee-Brickell, birthday decoding, Stern-Dumer, MMT, BJMM), the last ones being quite recent. The name InformationSetDecoder implicitly assumes we're planning to implement only one of them (Lee-Brickell). The question is: if someone wants to implement another variant, what should he do? I think there are two solutions:

  1. the class InformationSetDecoder will contain all the implemented variants of Prange algorithm (that is, for the moment only Lee-Brickell). If another algorithm apears in Sage, then it will be callable through an optional argument of the constructor, e.g. __init__(algo="LeeBrickell", **kwargs).
  1. We define a class per algorithm, and InformationSetDecoder is a pointer to LeeBrickellISD.

In my humble opinion the second one is better (it is what we did for the various decoders of GRS codes), but I do not have time to rework the code during the following weeks.

Julien

comment:19 in reply to: ↑ 14 Changed 23 months ago by ylchapuy

Replying to jsrn:

  • Change the doc-tests to use a specific code (Golay) instead of random ones.

It would be good to continue to test also non-binary codes (because we can!)

  • Fixed various errors in sage/coding that came up: some bad printing in Golay codes, removed old deprecated method LinearCode.decode.

I understand it's easier, but isn't it best practice to have self contained tickets?

comment:20 in reply to: ↑ 18 Changed 23 months ago by jsrn

Hi Julien,

Thanks for the feedback.

Replying to jlavauzelle:

While running doctests, I obtained 3 errors:

Thanks, stupid mistakes. I'll fix it presently.

In fact, you seem to have removed the type unique from every decoder_type; is this because you consider that a decoder is a unique decoder as long as it isn't claimed not to be like that?

Yes. This finally became clear in #20001 when the types were documented better, though it was forgotten in #20001 to actually go over all decoders again and fix their types.

Besides, I have a more stuctural remak.

That's a good point. I would leave it to posterity to decide this. My own opinion is that a single class with an algorithm flag is sufficient, since all the decoders have exactly the same overall behaviour (same types) and only differ in speed, in some sense. Several of the decoders for GRS codes are of very different type.

If someone comes along and implements, say Stern-Dumer, the choice can easily be made at that point.

Replying to ylchapuy:

Change the doc-tests to use a specific code (Golay) instead of random ones.

It would be good to continue to test also non-binary codes (because we can!)

Good point!

Fixed various errors in sage/coding that came up: some bad printing in Golay codes, removed old deprecated method LinearCode?.decode.

I understand it's easier, but isn't it best practice to have self contained tickets?

Yes it's best practice. My experience with the trac ticket-and-review system, however, is that if best practice is followed completely, then trivial mistakes will never get fixed. I'm tired of coming across small mistakes here and there while editing codes and not being able to fix them, and then deal with ticket dependencies etc. My impression is that many of the very active developers on SageMath do the same.

Removing the deprecated function is perhaps debatable, though.

comment:21 Changed 23 months ago by ylchapuy

Regarding the algorithm, it *really* should be corrected. Try e.g.:

sage: C = codes.random_linear_code(GF(3), 24, 10)
sage: c = C.random_element()
sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 4)
sage: y = Chan(c)
sage: %time LeeBrickell(y, C, 4, 2) == c
CPU times: user 16 ms, sys: 4 ms, total: 20 ms
Wall time: 20.5 ms
True
sage: D = C.decoder("InformationSet", 4)
sage: %time D.decode_to_code(y)
<... still waiting ... >

comment:22 Changed 23 months ago by git

  • Commit changed from 7551c478b8d37882dd2a12e93dc2bc55d5ff89c5 to 46b0907b4a8a196dcc1dc8d0cdbe7a26b51bac37

Branch pushed to git repo; I updated commit sha1. New commits:

0bb6785Fix doc-test failures
46b0907Doc-test for non-binary information set decoding (currently fails)

comment:23 Changed 23 months ago by jsrn

Yep, the algorithm is broken which I've now seen trying it on a GF(3) Golay code. I've pushed the example, but this is obviously still needs work.

As I mentioned earlier, going over the algorithm was next order of business anyway.

comment:24 Changed 23 months ago by git

  • Commit changed from 46b0907b4a8a196dcc1dc8d0cdbe7a26b51bac37 to a81785a709f1303b381535abd2208e73f2e97118

Branch pushed to git repo; I updated commit sha1. New commits:

a81785aUse ylchapuy's implementation of Lee-Brickell

comment:25 Changed 23 months ago by jsrn

I've replaced the algorithm with Yann's implementation from Comment 10, with only trivial modifications. This one works on the GF(3) doctest. I still want to do more testing and timing, so this is not up for review yet.

comment:26 Changed 23 months ago by ylchapuy

Note that I opened #23244 because I plan to implement Stern-Dumer.

comment:27 follow-up: Changed 23 months ago by jsrn

That's awesome!

I'm currently working on calibrating the window parameter - I find this an integral part of the Lee--Brickell algorithm because it has a huge effect on the average running time (and a user would have no idea how to set it). Unfortunately, it's a bit hit and miss when I have time to work on this, but I'll try to get it wrapped up.

comment:28 in reply to: ↑ 27 ; follow-up: Changed 22 months ago by ylchapuy

Replying to jsrn:

That's awesome!

I'm currently working on calibrating the window parameter - I find this an integral part of the Lee--Brickell algorithm because it has a huge effect on the average running time (and a user would have no idea how to set it). Unfortunately, it's a bit hit and miss when I have time to work on this, but I'll try to get it wrapped up.

First, I don't understand why you changed p to win. There is no notion of window here to me. Every paper I saw call this parameter p and I think it would be better to stick to it, as w is used for the weight of the expected error.

I'll use the notation n,k for the length and dimension, w for the weight of the error and p for the parameter of the algorithm.

Regarding the choice of this p parameter, first the analysis is usually made for constant error weight correction (when tau is a singleton) because it's easier. I'll do that too. Moreover analysis is usually done with a fixed p (I mean no loop as in you line 5303), see e.g. http://fms2.cerimes.fr/vod/media/canalu/documents/fuscia/3.5.lee.and.brickell.algorithm_32885/c015im.w3.s5.v3.pdf for the binary case. But I think it's worth doing the loop.

The expected cost C is C_I / P_I where C_I is the cost of one iteration (from line 5289) and P_I the probability of success of this iteration (the iterations are independant because we choose I at random, so P_I is a constant). ` C_I is G + x * V with G the cost of step 1 and 2 (mostly a Gaussian elimination), x the number of error vector e we compute and V the cost to compute each of them (this depends on the associated p in the current implementation but could be made constant if we later use e.g. gray codes to compute the sum).

Each vector e we compute increases P_I by a small amount E which depends on the current p, (the bigger the p, the smaller the gain).

Overall the cost function is piecewise defined:

C(x) = (G + x*V) / (P(p) + x*E(p)) with P(p) the success probability if we stop before p [i.e P(p) = sum_{i=0}^{p-1} E(i) * binomial(k,i)*(q-1)**i]

The sign of the derivative depends only on G*E(p) - V*P(p). G and V are implementation dependant but you can estimate them, E and P can be computed.

You could do this computation when you construct the decoder to choose p (it will be fast because p will never get bigger than 3 I think).

I did my best but I'm not certain my explanations are crystal clear! Does this makes sense to you?

Last edited 22 months ago by ylchapuy (previous) (diff)

comment:29 in reply to: ↑ 28 ; follow-ups: Changed 22 months ago by jsrn

Replying to ylchapuy:

First, I don't understand why you changed p to win. There is no notion of window here to me. Every paper I saw call this parameter p and I think it would be better to stick to it, as w is used for the weight of the expected error.

I think that a name is more evocative than a letter, in spite of tradition in mathematics text. "Window" was the word David used in the original implementation, and I couldn't come up with a substantially better word. It also has a nice, clear 3-letter abbreviation. Using w for error weight is not more traditional than e.g. e, t or tau in the greater decoding literature. I changed it since it visually collides with win, so I used tau in my implementation. I believe it crops up elsewhere in sage/coding since I commonly use this this.

I'll use the notation n,k for the length and dimension, w for the weight of the error and p for the parameter of the algorithm. ...

Thanks. I already implemented all of this a few days ago, exactly using the equations you write. Where I left it then was at the estimates of E and P which have to be devilishly precise: I've got E down well, but the code simplification I did when estimating P turned out to be too coarse (I believed it depended mostly only on the cost of scaling and summing vectors, but apparently the hamming_weight and loop structures add something). Testing this takes time because I need to run the decoder many times on a non-trivial code to be sure things work. I'll post my test code here as well, which might be useful for ou in #23244.

I'll look at it again now and push what I have in any case.

comment:30 in reply to: ↑ 29 ; follow-up: Changed 22 months ago by ylchapuy

Replying to jsrn:

Replying to ylchapuy:

First, I don't understand why you changed p to win. There is no notion of window here to me. Every paper I saw call this parameter p and I think it would be better to stick to it, as w is used for the weight of the expected error.

I think that a name is more evocative than a letter, in spite of tradition in mathematics text.

as a parameter yes

"Window" was the word David used in the original implementation, and I couldn't come up with a substantially better word.

Can you explain me what window this would be? I really don't get it.

I propose partial_weight, this is the number of error positions we expect in our information set (and is abbreviated as p).

It also has a nice, clear 3-letter abbreviation. Using w for error weight is not more traditional than e.g. e, t or tau in the greater decoding literature. I changed it since it visually collides with win, so I used tau in my implementation. I believe it crops up elsewhere in sage/coding since I commonly use this this.

You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper, it makes life easier for anyone trying to understand what your code does.

But in the end I guess it's your call, I won't fight about this.

comment:31 in reply to: ↑ 29 Changed 22 months ago by ylchapuy

Replying to jsrn:

Replying to ylchapuy:

I'll use the notation n,k for the length and dimension, w for the weight of the error and p for the parameter of the algorithm. ...

Thanks. I already implemented all of this a few days ago, exactly using the equations you write. Where I left it then was at the estimates of E and P which have to be devilishly precise: I've got E down well, but the code simplification I did when estimating P turned out to be too coarse (I believed it depended mostly only on the cost of scaling and summing vectors, but apparently the hamming_weight and loop structures add something).

E(p) should be binomial(n-k, w-p) / (binomial(n,w) * (q-1)**p) and I wrote the formula for P(p).

I guess you mean estimating G and V. They don't need to be that precise. If they're a bit wrong, the final cost estimate will only be a bit wrong, nothing catastrophic should happen.

Testing this takes time because I need to run the decoder many times on a non-trivial code to be sure things work. I'll post my test code here as well, which might be useful for ou in #23244.

I don't want to push you, I only like to write down my thoughts, you may ignore them. I know this takes time and we all have other things to do!

comment:32 in reply to: ↑ 30 ; follow-up: Changed 22 months ago by jsrn

Replying to ylchapuy:

I propose partial_weight, this is the number of error positions we expect in our information set (and is abbreviated as p).

You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper, it makes life easier for anyone trying to understand what your code does.

That's a valid reason. The cited paper is not the original contribution and the notation looked ad hoc to me, so I didn't heed it in this instance. But I'm not well-versed in ISD literature, and if you say the parameter is often p that's a stronger argument. Did you come up with the partial_weight or did you see that somewhere? It sounds bland to me, but it does share the p.

comment:33 in reply to: ↑ 32 Changed 22 months ago by ylchapuy

Replying to jsrn:

Replying to ylchapuy:

I propose partial_weight, this is the number of error positions we expect in our information set (and is abbreviated as p).

You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper, it makes life easier for anyone trying to understand what your code does.

That's a valid reason. The cited paper is not the original contribution and the notation looked ad hoc to me, so I didn't heed it in this instance. But I'm not well-versed in ISD literature, and if you say the parameter is often p that's a stronger argument.

It's not used in the original publication (there it's j) but it's used at least since Stern's paper in 1989, in all the papers proposing improvements (and so is w for the expected weight of the error).

Did you come up with the partial_weight or did you see that somewhere? It sounds bland to me, but it does share the p.

I just came up with this, but I agree it's not particularly good. I don't know any wordy definition of this parameter.

comment:34 Changed 22 months ago by git

  • Commit changed from a81785a709f1303b381535abd2208e73f2e97118 to 476369c26dde394949e074270181b4fa7a652bd8

Branch pushed to git repo; I updated commit sha1. New commits:

d6f69b8Rename number_errors() method to decoding_interval() and decoding_radius() method
a074919Calibrate window parameter, take 1
7ab7277Calibrate window parameter, take 2
476369cRename window to search_size, win -> p, w -> p or pi

comment:35 Changed 22 months ago by jsrn

I've pushed an attempt at writing the calibration of the search size parameter (which I call it now, abbreviated p). I monkey'd around for a very long while thinking it didn't work, but I must have confused myself terribly, because when I tried to clean everything up in order to write this report on my progress, it suddenly doesn't seem too bad. I believe one gotcha that confounded me was the inclusion of a rho factor: the fraction of size-k-positions that are information sets.

Anyway, here are some timings:

[24, 12, 8] Extended Golay code over GF(2)
		Estimated		Actual
	p=0	0.00294093602195062	0.00370841026306
	p=1	0.000733640004873450	0.00105549812317
	p=2	0.000779351222057502	0.000777833461761
	p=3	0.00202210000281411	0.00085108757019

[12, 6, 6] Extended Golay code over GF(3)
		Estimated		Actual
	p=0	0.00107591199443414	0.00152409553528
	p=1	0.000475508233346987	0.000670518875122
	p=2	0.000837919998575671	0.000634233951569

[255, 231] BCH Code over GF(2) with designed distance 7
		Estimated		Actual
	p=0	8.27393653065493	10.0903596711
	p=1	0.303782262811033	0.38420484066
	p=2	0.613038331151469	0.627587425709
	p=3	13.5825719377659	6.28117632627

[255, 139] BCH Code over GF(2) with designed distance 31
		Estimated		Actual

[255, 47] BCH Code over GF(2) with designed distance 81
		Estimated		Actual
	p=0	7.08824734720053	6.80084561825
	p=1	0.730343868369564	0.871155004501
	p=2	0.776726300346984	0.963895459175
	p=3	3.16273106617629	4.46240173101
	p=4	14.6264398618229	16.1132382083

[80, 56] BCH Code over GF(3) with designed distance 9
		Estimated		Actual
	p=0	0.652945537606786	0.400583426952
	p=1	0.104725869935881	0.0847134900093
	p=2	0.651413542048857	0.519779400826
	p=3	11.2044949381990	5.3165213418

[80, 15] BCH Code over GF(3) with designed distance 41
		Estimated		Actual
	p=0	0.0524764488533255	0.0787803530693
	p=1	0.0235392606723455	0.0252663660049
	p=2	0.0786084509854239	0.0758757901192
	p=3	0.340890763500404	0.248177850246
	p=4	1.46166863051819	0.734001851082
	p=5	5.69338143224234	1.27127592802
	p=6	19.6762218536446	2.85639382839

[124, 94] BCH Code over GF(5) with designed distance 13
		Estimated		Actual
	p=0	51.8959970988144	(25.518362093 last time, skipped here)
	p=1	8.16363820890485	13.5431237435

Here follows some of my test code:

def test_decoder(D, reps=100):
    r"""Return the time it takes to run the decoder on average, with maximal
    number of errors."""
    C = D.code()
    tau = D.decoding_radius()
    Chan = channels.StaticErrorRateChannel(C.ambient_space(), tau)
    tsum = 0
    for i in range(reps):
        c = C.random_element()
        r = Chan(c)
        before = time.clock()
        cout = D.decode_to_code(r)
        after = time.clock()
        assert c == cout , "decoding error"
        tsum += after - before
    return tsum/reps

from sage.coding.linear_code import LinearCodeInformationSetDecoder

test_codes = [
    ( 3, codes.GolayCode(GF(2))),
    ( 2, codes.GolayCode(GF(3))),
    ( 3, codes.BCHCode(GF(2), 2^8-1, 7)),
    (15, codes.BCHCode(GF(2), 2^8-1, 31)),
    (40, codes.BCHCode(GF(2), 2^8-1, 81)),
    ( 4, codes.BCHCode(GF(3), 3^4-1, 9)),
    (20, codes.BCHCode(GF(3), 3^4-1, 41)),
    ( 6, codes.BCHCode(GF(5), 5^3-1, 13))
    ]

for (tau, C) in test_codes:
    print C
    D = LinearCodeInformationSetDecoder(C, tau)
    estimates = D._calibrate_search_size()
    print "\t\tEstimated\t\tActual"
    for p in range(0,tau+1):
        if estimates[p] < 30:
            D._search_size = p
            avg = test_decoder(D, reps=5)
        else:
            avg = "skipped"
        print "\tp=%s\t%s\t%s" % (p, estimates[p], avg)
    print

def test_iters(D):
    r"""Return the avg number of iterations taken by the decoder, including
    those where a non-information set is picked.
    This requires debug code inserted to count `self._iters` in the decoder.
    """
    N = 101
    iters = []
    for i in range(N):
        test_decoder(D, reps=1)
        iters.append(D._iters)
    print median(iters) , 1. * sum(iters) / N

def infoset_density(C, reps=1000):
    r"""Estimate density of information sets"""
    s = 0
    G = C.generator_matrix()
    n, k = C.length(), C.dimension()
    for N in range(reps):
        I = sample(range(n), k)
        Gi = G.matrix_from_columns(I)
        try:
            Gi_inv = Gi.inverse()
            s += 1
        except ZeroDivisionError:
            pass
    return 1. * s / (N+1)

for _,C in test_codes:
    print "%s\t%s" % (C, infoset_density(C))


New commits:

d6f69b8Rename number_errors() method to decoding_interval() and decoding_radius() method
a074919Calibrate window parameter, take 1
7ab7277Calibrate window parameter, take 2
476369cRename window to search_size, win -> p, w -> p or pi
Last edited 22 months ago by jsrn (previous) (diff)

comment:36 follow-up: Changed 22 months ago by ylchapuy

Great, I'll take a closer loop later. Here are some preliminary remarks:

  • you shouldn't worry about rho, consider as an iteration only those with a valid information set. It's already what you measure with time_information_set_steps, though I would take the mean and not the median here.
  • beware, _search_size is initialized to None which leads to trouble. You should just leave it uninitialized.

comment:37 in reply to: ↑ 36 ; follow-up: Changed 22 months ago by jsrn

I updated the timings after a longer run with more iterations.

Replying to ylchapuy:

Great, I'll take a closer loop later. Here are some preliminary remarks:

  • you shouldn't worry about rho, consider as an iteration only those with a valid information set. It's already what you measure with time_information_set_steps, though I would take the mean and not the median here.

I disagree: it changes the balance between the inversion step (done always) and the search step (done a rho fraction of the time). Perhaps it's not paramount, but I've already implemented it, and it makes debugging nice since the number of iterations matches extremely well.

  • beware, _search_size is initialized to None which leads to trouble. You should just leave it uninitialized.

Thanks, that's a bug: I put the call to calibration in search_size() which is only called if _search_size is unset.

comment:38 Changed 22 months ago by git

  • Commit changed from 476369c26dde394949e074270181b4fa7a652bd8 to fa7240261271df724b145d6a67433cb007c143e3

Branch pushed to git repo; I updated commit sha1. New commits:

0e6ea11Let calibration return time estimates for easier debugging
fa72402fixed search_size init bug

comment:39 in reply to: ↑ 37 ; follow-up: Changed 22 months ago by ylchapuy

Replying to jsrn:

I updated the timings after a longer run with more iterations.

Replying to ylchapuy:

Great, I'll take a closer loop later. Here are some preliminary remarks:

  • you shouldn't worry about rho, consider as an iteration only those with a valid information set. It's already what you measure with time_information_set_steps, though I would take the mean and not the median here.

I disagree: it changes the balance between the inversion step (done always) and the search step (done a rho fraction of the time). Perhaps it's not paramount, but I've already implemented it, and it makes debugging nice since the number of iterations matches extremely well.

I disagree as well! What time_information_set_steps measures is the time to get a valid information set. The while True is included into it. As is your value T is off by a factor rho.

comment:40 in reply to: ↑ 39 Changed 22 months ago by jsrn

Replying to ylchapuy:

I disagree as well! What time_information_set_steps measures is the time to get a valid information set. The while True is included into it. As is your value T is off by a factor rho.

Oh yes, now I see, you're right. And changing the medians to means then it will probably be precise enough. The alternative is to let time_information_set_steps measure only the time once an information set is found. If e.g. q=7 then the probability of an information set (on a random code) is roughly 5/6 so there will be a fair chance that each of the 5 runs of time_information_set_steps hit an information set on the first guess - and then we will have underestimated the average time by a factor 6/5. Which likely doesn't matter at all.

comment:41 Changed 22 months ago by git

  • Commit changed from fa7240261271df724b145d6a67433cb007c143e3 to ea5c36c00c1043ee72ca2837da96e404bf4c8907

Branch pushed to git repo; I updated commit sha1. New commits:

ea5c36crho is already included. Mean instead of median

comment:42 Changed 22 months ago by jsrn

I removed the rho factor and updated the documentation.

comment:43 Changed 22 months ago by ylchapuy

This looks nicer. Another thing I that bothers me is the search_size method.

This parameter is (sort of) specific to the Lee-Brickell algorithm, not to ISD. In preparation for other variants available (e.g. Stern-Dumer), I would prefer to hide this from the user.

I would either hide it behind an "underscored" method, or provide a more general one, e.g.

    def parameters(self):
        r"""
         Return the algo and parameters used
        """
        return ("LeeBrickell", self._search_size)

or

    def parameters(self):
        r"""
         Return the parameters selected for each available algorithms
        """
        return { "LeeBrickell": self._search_size }

or anything better you can think of.

comment:44 follow-up: Changed 22 months ago by jsrn

This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ. I had proposed to defer this question till when it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all in the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

Your suggestion would allow for the necessary flexibility, but is essentially an indirection put in place to avoid subclassing. I don't find it particularly elegant, and it would be surprising to users I think. Granted, a typical user wouldn't care about the parameters.

Consider instead having a class for each, taking and naming their parameters as natural to each, and all subclassing a AbstractInformationSetDecoder. Then we make a function:

    def LinearCodeInformationSetDecoder(code, number_errors, algorithm=None, *args, **kwargs):
        if algorithm == None:
            algorithm = "LeeBrickell" if code.dimension() < 10 else "SternDumer"
        if algorithm == "LeeBrickell":
            return LinearCodeISLeeBrickellDecoder(code, number_errors, *args, **kwargs)
        ...

Preferably improved with a try ... except which gives a nicer error message in case args, kwargs is not proper. Probably, supplying args or kwargs should also be disallowed if algorithm is not explicitly set, in order to avoid user code silently breaking in case we sometime decide to change the automatic selection of algorithm.

comment:45 in reply to: ↑ 44 Changed 22 months ago by ylchapuy

Replying to jsrn:

This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ. I had proposed to defer this question till when it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all in the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

I agree it's not elegant. My aim was to make this ticket as easy to close as possible. I don't want to make everything perfect, but just don't want to provide user functions we would have to deprecate in the future. As long as we have only underscored methods it's not a problem to me and we can decide later what is the best way to deal with this.

If you want to decide this now, I also think some general factory LinearCodeInformationSetDecoder choosing the best implementation at hand for those parameters, and specialized classes for users who know what they are doing seems the way to go for me.

comment:46 Changed 22 months ago by git

  • Commit changed from ea5c36c00c1043ee72ca2837da96e404bf4c8907 to 5463b1d86c8b813f3c608c69d2af9674eb7c392f

Branch pushed to git repo; I updated commit sha1. New commits:

87bd34dRestructure ISD with class hierarchy and factory method
59319bdDocs for new classes and full doctest coverage
e0a6677Move all ISD code to new file
5463b1dFix code and doctests after code migration

comment:47 follow-up: Changed 22 months ago by jsrn

I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to get lengthy, I've moved it to a new file.

Doc-tests pass, but I'm having trouble with documentation. I haven't inspected the compiled html doc yet because I got stuck in attempting to make a proper hyperlink for sage.coding.isd_decoding.LinearCodeInformationSetDecoder in the module doc of sage.coding.decoder_catalog. I'll look at it again soon.

comment:48 in reply to: ↑ 47 ; follow-up: Changed 22 months ago by ylchapuy

Replying to jsrn:

I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to get lengthy, I've moved it to a new file.

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

I look forward to review this when it' s ready, thanks for working on this!

comment:49 in reply to: ↑ 48 ; follow-up: Changed 22 months ago by jsrn

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

True. But isd is a very bland abbreviation and I suspect it will be hard for most people to look at the file name and have an idea of its contents. And is_decoding is weird. Do you have a suggestion?

Similarly, I had great difficulty coming up with a proper name for the class LinearCodeISD_LeeBrickell. All other decoder classes end with Decoder. But LinearCodeInformationSetLeeBrickellDecoder is crazy long. And LinearCodeISLeeBrickellDecoder reads weirdly since IS doesn't remind of ISD. And I wanted all ISD-decoders to tab-complete similarly, so LinearCodeLeeBrickellISDecoder was out.

comment:50 in reply to: ↑ 49 Changed 22 months ago by ylchapuy

Replying to jsrn:

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

True. But isd is a very bland abbreviation and I suspect it will be hard for most people to look at the file name and have an idea of its contents. And is_decoding is weird. Do you have a suggestion?

why not information_set_decoding we have tab completion.

Similarly, I had great difficulty coming up with a proper name for the class LinearCodeISD_LeeBrickell. All other decoder classes end with Decoder. But LinearCodeInformationSetLeeBrickellDecoder is crazy long. And LinearCodeISLeeBrickellDecoder reads weirdly since IS doesn't remind of ISD. And I wanted all ISD-decoders to tab-complete similarly, so LinearCodeLeeBrickellISDecoder was out.

Hum, I see. What would you think about only one full fledge decoder LinearCodeInformationSetDecoder, and lightweight classes for the algorithms, e.g. (just sketching a solution):

class DecodingAlgorithm:
    """ Just providing an interface.
    I can use this bland name because it would in fact be
    sage.coding.information_set_decoding.DecodingAlgorithm
    """
    @staticmethod
    def decode(G, y, **kargs):
        """ kargs depends on the actual algorithm """
        raise NotImplementedError
    @staticmethod
    def cost_estimate(C, **kargs):
        """ Return ETA in seconds for the given code and parameters
        """
        raise NotImplementedError
    @staticmethod
    def calibrate_for(C):
        """ Return the best parameters found for this code, as a dict
        You can thus do:
        A.decode(C.generator_matrix(), y, **A.calibrate_for(C))
        """
        raise NotImplementedError

class AlgorithmLeeBrickell(DecodingAlgorithm):
    # Todo

class AlgorithmSternDummer(DecodingAlgorithm):
    # Todo

# etc

Last edited 22 months ago by ylchapuy (previous) (diff)

comment:51 Changed 22 months ago by git

  • Commit changed from 5463b1d86c8b813f3c608c69d2af9674eb7c392f to f68ea694a888638a1a839bd429a84ace1bb3a12e

Branch pushed to git repo; I updated commit sha1. New commits:

ed70a79Rename
fa06073Fixes after rename
f8b9c5dSwitched to ISD Algorithm classes and only one real decoder
f68ea69Fixes to docstrings

comment:52 Changed 22 months ago by jsrn

Finally got back to it again. I had originally considered light-weight algorithm classes but then rejected it since the surrounding ISD-class would be almost trivial. However, reconsidering I think it actually makes for a slightly nicer design, user-wise: we avoid a decoder factory (which won't behave as a decoder class which might surprise the user), it sidesteps the clumsy naming, and it becomes more transparent for a user how to make new ISD algorithms.

So I've implemented your suggestion. I went for real instantiated Algorithm-objects. This makes the algorithms more accessible to the user, and making things static would also pose problems with calibrate and time_estimate, which either both would have to actually run the calibration or write to static class variables or something.

As usual this took 30 minutes, and then 1,5 hours to redo the doc-strings.

comment:53 Changed 22 months ago by jsrn

  • Authors changed from David Lucas, Johan Rosenkilde to David Lucas, Johan Rosenkilde, Yann Laigle-Chapuy
  • Status changed from needs_work to needs_review

Needs review. I've added Yann as coauthor - he definitely deserves that!

Last edited 22 months ago by jsrn (previous) (diff)

comment:54 follow-up: Changed 22 months ago by ylchapuy

First review pass.

To begin with:

  • why give the copyright to William Stein? the format is also a bit off (see http://doc.sagemath.org/html/en/developer/coding_basics.html#headings-of-sage-library-code-files)
  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element) it's more future proof if things ever move around.
  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")
  • if you ask for time_estimate with user provided parameters, it currently do the calibration and change the parameters. Is this expected? In my mind, calibrate calls time_estimate with different parameters to choose the best one.

For the algorithm itself:

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p
  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)
  • you could use (but this doesn't change much)
    for gA in itertools.combinations(g, pi):
      for m in itertools.product(Fstar, repeat=pi):
        e = y - sum(mi*gAi for mi,gAi in zip(m,gA))
    
  • prefer ... is not None to not ... is None

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them
  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm)
  • in known_algorithms I would make d a class variable _known_algorithms

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and weight 6 error, or whatever you like but I find this better if this is the only available option (you can't reasonably compute the coset leaders for example).

As a long doctest, we could check that on such a code the time estimate is not too far from reality (lets say within a factor 2).

comment:55 Changed 22 months ago by git

  • Commit changed from f68ea694a888638a1a839bd429a84ace1bb3a12e to c6ddf4318951e5619d5f0dfa1b91ddfb6e8a9f3e

Branch pushed to git repo; I updated commit sha1. New commits:

ee4d586Merge branch 'u/jsrn/information_set_decoding' of git://trac.sagemath.org/sage into 20138_information_set_decoding
ed4bfd7copyright and import statements
a4e7f49time_estimate() doesn't poop on selected parameters
339c383Fixed bugs and added doctests pointed out by reviewer.
0ea73d6Make _known_algorithms a class field
08a9e50Added a big example/doctest. Polished some existing ones.
c6ddf43Added a test for the precision of calibration

comment:56 in reply to: ↑ 54 ; follow-up: Changed 22 months ago by jsrn

Thanks for the speedy review. Here is my less speedy reply.

Replying to ylchapuy:

I hadn't looked up those rules to be honest. I added our three names. I don't know your email address though.

  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element) it's more future proof if things ever move around.

Didn't know that was possible. It's not used many places in Sage either. I don't see any drawbacks though, so I've made the change.

  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.

There was a discussion on sage-devel on this recently. There was no clear consensus, but many seemed favoured the "of this code"-style.

I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

  • if you ask for time_estimate with user provided parameters, it currently

do the calibration and change the parameters. Is this expected? In my mind, calibrate calls time_estimate with different parameters to choose the best one.

Good catch: time_estimate of course shouldn't overwrite selected parameters. I've fixed that.

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p

No, we let pi go from 0 to p, both inclusive. What differs is that I, for brevity, just say we let the coefficients roam through F^p, while we actually go through F*^pi, and let pi go from 0 to p. This is to avoid repeating the elements with multiple 0s in the first version.

  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

This is done in the iteration pi = 0.

  • you could use (but this doesn't change much)
    for gA in itertools.combinations(g, pi):
      for m in itertools.product(Fstar, repeat=pi):
        e = y - sum(mi*gAi for mi,gAi in zip(m,gA))
    

Yes I could, but I don't consider it more elegant. If I was writing in OCaml I'd do it in the above way, obviously, but not in Python.

  • prefer ... is not None to not ... is None

OK

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them

I think all __init__s and class docs now have the appropriate doctests.

  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm)

Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that InformationSetAlgorithm didn't hash, which meant that AbstractLinearCode.encoder crashed since it tries to hash any arguments. This is potentially a problem with AbstractLinearCode.encoder which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for InformationSetAlgorithm.

The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply number_errors to InformationSetDecoder, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply number_errors and algorithm and number_errors must match algorithm.decoding_interval(). My argument is that this use case is very rare and for special developer-users; but allowing the shortcut of only supplying algorithm would confuse the documentation and mess up the constructor.

  • in known_algorithms I would make d a class variable _known_algorithms

No problem.

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and weight 6 error, or whatever you like but I find this better if this is the only available option (you can't reasonably compute the coset leaders for example).

I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the [59, 30, 17] QR code over GF(3), for which syndrome decoding or nearest neighbor decoding is infeasible. The current information-set decoding is extremely fast on the other hand. I've flagged it as long, though it takes less than 0.3 s.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (lets say within a factor 2).

A random doc-test should only fail with extremely low probability, otherwise it will cause headaches for other Sage devs and the build bot. The decoding time for information-set decoding is roughly exponentially distributed and has a huge variance. Thus we must average a large number of decoding trials and test this against the calibrated value.

I've implemented such a test for a medium-sized code, where we do 100 iterations, and I test that the elapsed time is within a factor 5 of calibrated. I did a 1-hour trial on my own computer with only light background processes, and factor 2 off calibrated time occurred after 400 trials while factor 3 off calibrated time occurred after 5000 trials:

from sage.coding.information_set_decoder import LeeBrickellISDAlgorithm
C = codes.QuadraticResidueCode(37, GF(3))
Chan = channels.StaticErrorRateChannel(C.ambient_space(), 4)
A = LeeBrickellISDAlgorithm(C, (0,4))
A.calibrate()
def time_100():
    A.calibrate()
    import time
    zero = C.ambient_space().zero()
    before = time.clock()
    for i in range(100): A.decode(Chan(zero)); None
    return time.clock() - before

smallest = 1
biggest = 1
i = 0
while True:
    i += 1
    elapsed = time_100()
    frac = elapsed/A.time_estimate()/100
    if frac < smallest:
        smallest = frac
    if frac > biggest:
        biggest = frac
    print "%d: %s, %s" % (i, smallest , biggest)

My theoretical musings on these probabilities indicated that - if the calibration is sound - this should occur far less frequently, however.

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

Any thoughts?

(The ticket now needs review again)

comment:57 in reply to: ↑ 56 ; follow-up: Changed 22 months ago by ylchapuy

Replying to jsrn:

Thanks for the speedy review. Here is my less speedy reply.

  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element) it's more future proof if things ever move around.

Didn't know that was possible. It's not used many places in Sage either. I don't see any drawbacks though, so I've made the change.

I thought at some point it was the way to go, but it's not used much as you say. I'm not so sure now but I like it that way :)

  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.

There was a discussion on sage-devel on this recently. There was no clear consensus, but many seemed favoured the "of this code"-style.

I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

No keep it like that, I realized afterwards that there was no consensus on this.

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p

No, we let pi go from 0 to p, both inclusive. What differs is that I, for brevity, just say we let the coefficients roam through F^p, while we actually go through F*^pi, and let pi go from 0 to p. This is to avoid repeating the elements with multiple 0s in the first version.

OK, that's the problem with speedy review, I read too fast sorry.

  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

This is done in the iteration pi = 0.

agreed

  • you could use (but this doesn't change much)
    for gA in itertools.combinations(g, pi):
      for m in itertools.product(Fstar, repeat=pi):
        e = y - sum(mi*gAi for mi,gAi in zip(m,gA))
    

Yes I could, but I don't consider it more elegant. If I was writing in OCaml I'd do it in the above way, obviously, but not in Python.

It seems also a bit faster in my tests, but you can keep it that way if you don't like this.

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them

I think all __init__s and class docs now have the appropriate doctests.

Not all, e.g. raise ValueError("All elements of number_errors have to be" ... and raise ValueError("The provided number of errors should be at"... are missing I think.

  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm)

Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that InformationSetAlgorithm didn't hash, which meant that AbstractLinearCode.encoder crashed since it tries to hash any arguments. This is potentially a problem with AbstractLinearCode.encoder which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for InformationSetAlgorithm.

The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply number_errors to InformationSetDecoder, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply number_errors and algorithm and number_errors must match algorithm.decoding_interval(). My argument is that this use case is very rare and for special developer-users; but allowing the shortcut of only supplying algorithm would confuse the documentation and mess up the constructor.

That's one of the reason my Algorithm classes were more light-weight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.

I think it's a better design, but I won't fight for this.

  • in known_algorithms I would make d a class variable _known_algorithms

No problem.

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and weight 6 error, or whatever you like but I find this better if this is the only available option (you can't reasonably compute the coset leaders for example).

I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the [59, 30, 17] QR code over GF(3), for which syndrome decoding or nearest neighbor decoding is infeasible. The current information-set decoding is extremely fast on the other hand. I've flagged it as long, though it takes less than 0.3 s.

You could choose a random seed such that everything goes well. But I'm fine with your QR code.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (lets say within a factor 2).

[...]

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

If It's not clear how to make it right, maybe it's better left for another ticket?

For an improvement in the calibration, I think I can improve things both in precision and clarity! As we want to empirically get the timing, we don't need to separate the time_information_set_steps and time_search_loop. You could just add some options:

def decode(self, r, max_iter=None, _calibrate=False):
    ...
    i = 0
    while i != max_iter:
        i += 1
        ...
                if not _calibrate: return r - e   # when we calibrate, never return

This would end up with less code duplication and a more precise estimate. Once again, feel free to leave this to another ticket if you want, it wouldn't break anything to change this later.

Any thoughts?

If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword. I would add we don't even guarantee to return a codeword!

(The ticket now needs review again)

I still have to check the rendering of the documentation, but this get's close to a positive review from me.

comment:58 Changed 22 months ago by git

  • Commit changed from c6ddf4318951e5619d5f0dfa1b91ddfb6e8a9f3e to 5924a39cd0c5a4bb8d44d47f48deb2add1ac330b

Branch pushed to git repo; I updated commit sha1. New commits:

1557ce9*args should pass to decoder in `AbstractLinearCode.decode_to_code/message`
395ce38Tests for all raised exceptions
5924a39Improved calibration doc test

comment:59 in reply to: ↑ 57 Changed 22 months ago by jsrn

Replying to ylchapuy:

I think all __init__s and class docs now have the appropriate doctests.

Not all, e.g. raise ValueError("All elements of number_errors have to be" ... and raise ValueError("The provided number of errors should be at"... are missing I think.

You're right, sorry. I went over the entire file more carefully now.

That's one of the reason my Algorithm classes were more light-weight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.

I think it's a better design, but I won't fight for this.

OK, let's not debate this further. If you feel strongly about it when you implement Stern-Dumer, you can implement the change there.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (lets say within a factor 2).

[...]

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

If It's not clear how to make it right, maybe it's better left for another ticket?

Possibly. I generally don't see how tests of calibration and threshold selection algorithms fit into Sage's doctesting ecology.

I slightly improved my solution from before (run multiple calibrations). I feel pretty confident that the test there now is better than nothing and should not really cause false negatives. If you don't feel comfortable with it we can also remove it again.

For an improvement in the calibration, I think I can improve things both in precision and clarity!

I see your idea. I originally wanted to avoid this for two reasons: 1) it pollutes the code of the real algorithm; and 2) it adds unnecessary logic to a busy inner-loop. Probably 2) is totally negligible though, and 1) is true but the alternative is code duplication.

Let's wait with this for another ticket. I'm waiting to see how Stern-Dumer performs in practice before spending lots of time optimising Lee-Brickell (I have other improvements in mind as well).

Keep in mind for your solution though, that you also need some mechanism to avoid large values of p where a single iteration would take very long.

If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword. I would add we don't even guarantee to return a codeword!

What do you mean? The algorithm might not return, but if it does it surely returns a codeword. That it might not return should be covered by the red warning box below.

I still have to check the rendering of the documentation, but this get's close to a positive review from me.

Great.


New commits:

1557ce9*args should pass to decoder in `AbstractLinearCode.decode_to_code/message`
395ce38Tests for all raised exceptions
5924a39Improved calibration doc test

comment:60 Changed 22 months ago by ylchapuy

two low hanging fruits, the bot complains about

  • a doctest failure in src/doc/en/constructions/linear_codes.rst
  • a missing doctest:

Missing doctests in coding/information_set_decoder.py: 23 / 24 = 95%

it' s the __hash__.

comment:61 Changed 22 months ago by ylchapuy

  • Status changed from needs_review to needs_work

comment:62 Changed 22 months ago by git

  • Commit changed from 5924a39cd0c5a4bb8d44d47f48deb2add1ac330b to 50018f67b8c6c214f8156d21c68a504b493e772c

Branch pushed to git repo; I updated commit sha1. New commits:

50018f6Fix doctest failure in doc. Add docstring and test to __hash__

comment:63 Changed 22 months ago by jsrn

  • Status changed from needs_work to needs_review

Indeed. These omissions are fixed now.

comment:64 Changed 21 months ago by ylchapuy

  • Milestone changed from sage-7.3 to sage-8.0
  • Status changed from needs_review to positive_review

Ok, I finally took the time to build the doc and check it out. Build without a fuss and LGTM.

Your long doctest with codes.QuadraticResidueCode(59, GF(3)) takes around 150ms on my machine, I don't know if I would call this long but it's not a problem.

The bot is also green, so let's move forward: positive review :)

comment:65 Changed 21 months ago by jsrn

Great! Thanks for the cooperation - I think this has become a really nice addition to Sage. I'm looking forward to seeing your implementation in #23244.

comment:66 follow-up: Changed 21 months ago by vbraun

  • Status changed from positive_review to needs_work
sage -t --long src/sage/coding/information_set_decoder.py
**********************************************************************
File "src/sage/coding/information_set_decoder.py", line 553, in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
Failed example:
    A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
Expected:
    True
Got:
    False
**********************************************************************
1 item had failures:
   1 of  15 in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
    [170 tests, 1 failure, 5.71 s]

comment:67 in reply to: ↑ 66 Changed 21 months ago by fbissey

Replying to vbraun:

sage -t --long src/sage/coding/information_set_decoder.py
**********************************************************************
File "src/sage/coding/information_set_decoder.py", line 553, in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
Failed example:
    A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
Expected:
    True
Got:
    False
**********************************************************************
1 item had failures:
   1 of  15 in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
    [170 tests, 1 failure, 5.71 s]

I couldn't get that on my setup while repeating the test a 1000 time.

comment:68 Changed 21 months ago by jsrn

Hmm, we were also in doubt about that test, as you can see from the discussion above. I had believed that it was fairly stable now, but evidently timing behaviour on the build bots are not very dependable.

So perhaps we should simply remove the test - what do you think? Any alternative suggestions for how we could test the calibration code?

comment:69 follow-up: Changed 21 months ago by vbraun

  • Don't test that the timings have particular values
  • Do test that timings can be collected
  • Do test that your code makes the right decisions when you specify timings

comment:70 Changed 21 months ago by git

  • Commit changed from 50018f67b8c6c214f8156d21c68a504b493e772c to eb2efb3d03f135e4f163bf3b2554c2a8d47054ff

Branch pushed to git repo; I updated commit sha1. New commits:

eb2efb3Replace brittle calibration test with a test for proper selection

comment:71 in reply to: ↑ 69 Changed 21 months ago by jsrn

Replying to vbraun:

  • Don't test that the timings have particular values
  • Do test that timings can be collected
  • Do test that your code makes the right decisions when you specify timings

OK, I've pushed a patch removing the offending test and instead making a small refactor such that we can test the last item on your list.

It seems a pity, however, that there is no good way to test that the calibrated timings have anything to do with reality.

comment:72 Changed 21 months ago by jsrn

  • Status changed from needs_work to needs_review

comment:73 Changed 21 months ago by ylchapuy

  • Status changed from needs_review to positive_review

OK, let's do it that way (and the failure in the docbuild is unrelated)

LGTM

comment:74 Changed 21 months ago by vbraun

  • Branch changed from u/jsrn/information_set_decoding to eb2efb3d03f135e4f163bf3b2554c2a8d47054ff
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.