Opened 5 years ago
Closed 4 years ago
#20138 closed enhancement (fixed)
Information set decoding for linear codes
Reported by: | dlucas | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-8.0 |
Component: | coding theory | Keywords: | |
Cc: | jlavauzelle | Merged in: | |
Authors: | David Lucas, Johan Rosenkilde, Yann Laigle-Chapuy | Reviewers: | Yann Laigle-Chapuy |
Report Upstream: | N/A | Work issues: | |
Branch: | eb2efb3 (Commits, GitHub, GitLab) | Commit: | eb2efb3d03f135e4f163bf3b2554c2a8d47054ff |
Dependencies: | #19653 | Stopgaps: | |
Description
This ticket proposes an implementation of the Lee-Brickell refinement of information set decoding for linear codes.
It depends on #19653 for the reimplementation of `C.random_element`.
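For background, here is a minimal sketch of plain information set decoding (Prange's algorithm), which Lee-Brickell refines; this is my illustration in Sage-style code, not the ticket's implementation:

    def prange_decode(r, C):
        # Repeatedly guess k positions, hope they are error-free, and
        # re-encode the received symbols at those positions.
        G = C.generator_matrix()
        n, k = C.length(), C.dimension()
        t = (C.minimum_distance() - 1) // 2
        while True:
            I = sample(range(n), k)
            Gi = G.matrix_from_columns(I)
            if not Gi.is_invertible():
                continue  # not an information set
            c = vector([r[i] for i in I]) * (Gi.inverse() * G)
            if (r - c).hamming_weight() <= t:
                return c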
Change History (74)
comment:1 Changed 5 years ago by
- Branch set to u/dlucas/information_set_decoding
comment:2 Changed 5 years ago by
- Commit set to 645e329a716a57f16e64c972f46779e4e0c1905f
comment:3 Changed 5 years ago by
- Commit changed from 645e329a716a57f16e64c972f46779e4e0c1905f to 0e1391d358018382db6cc0a59014ef934239987e
Branch pushed to git repo; I updated commit sha1. New commits:
6292df9 | window_size and number_errors are now class arguments
e72a5ab | Rewrote _repr_ and _latex_ methods, fixed a typo
f2c074c | decode_to_code now iterates over the whole range of provided errors
0e1391d | Added *args to Decoder/Encoder-related methods in AbstractLinearCode
comment:4 Changed 5 years ago by
- Status changed from new to needs_review
I made some changes:

- `window_size` and `number_errors` (previously `p` and `w`) are no longer parameters of `decode_to_code` but class arguments to provide at construction time. `number_errors` can be provided as a tuple/list, and in that case, the decoder will iterate over it when trying to decode a word (see the example below).
- I completely reworked the documentation, explained how the algorithm works and added a lot of doctests.
- I completed input sanitization with some extra checks.
- I fixed a stupid bug: only `**kwargs` was considered in the Decoder/Encoder-related methods from `AbstractLinearCode`, like `decoder()`. So, `C.decoder('InformationSet', 2, 2)` was not working...
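For illustration, a hypothetical session matching the description above (based on the `C.decoder('InformationSet', 2, 2)` call just mentioned; the exact signature is the one under review):

    sage: C = codes.GolayCode(GF(2))
    sage: D = C.decoder('InformationSet', 2, 2)          # number_errors=2, window_size=2
    sage: D = C.decoder('InformationSet', [1, 2, 3], 2)  # try error weights 1, 2, then 3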
This is now open for review.
comment:5 Changed 5 years ago by
- Commit changed from 0e1391d358018382db6cc0a59014ef934239987e to 81927695b58d5e4543d4a59b9d1b2c4d730ddd35
Branch pushed to git repo; I updated commit sha1. New commits:
8192769 | Updated to 7.1beta6 and fixed conflicts
comment:6 Changed 5 years ago by
- Commit changed from 81927695b58d5e4543d4a59b9d1b2c4d730ddd35 to 924be79fc189c2a9c563b9bfbe240ad854c16101
comment:7 Changed 5 years ago by
- Milestone changed from sage-7.1 to sage-7.2
I updated this ticket to the latest beta, fixed broken doctests, and made minor changes to the decoder (I refined its types and rewrote a line in the documentation). It's still open for review.
comment:8 Changed 5 years ago by
- Commit changed from 924be79fc189c2a9c563b9bfbe240ad854c16101 to 93066b139f579636a913f934e51046e7753cba8a
comment:9 Changed 5 years ago by
- Milestone changed from sage-7.2 to sage-7.3
I updated this ticket to 7.3b2, and fixed a bug. Still open for review.
Best,
David
comment:10 Changed 5 years ago by
Here are my two cents:

- you should use `itertools` instead of `Subsets`
- you shouldn't enumerate the `information_set` but take them at random (otherwise you might get stuck a long time with bad choices)
- why use `while True` loops instead of just `for`?

All in all, here is how I would do it (notations from the article you cite):

    import itertools
    def LeeBrickell(r, C, w, p):
        G = C.generator_matrix()
        n = C.length()
        k = C.dimension()
        Fstar = list(C.base_ring())[1:]
        while True:
            # step 1.
            I = sample(range(n), k)
            Gi = G.matrix_from_columns(I)
            try:
                Gi_inv = Gi.inverse()
            except ZeroDivisionError:
                continue
            G = Gi_inv * G
            # step 2.
            y = r - vector([r[i] for i in I]) * G
            g = G.rows()
            # step 3.
            for A in itertools.combinations(range(k), p):
                for m in itertools.product(Fstar, repeat=p):
                    e = y - sum(m[i]*g[A[i]] for i in range(p))
                    if e.hamming_weight() == w:
                        return r - e
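A quick sanity check of the sketch above (my session, not from the ticket; the weight-3 error sits on the first three positions, and the `[24, 12, 8]` extended Golay code corrects up to 3 errors, so the answer is unique):

    sage: C = codes.GolayCode(GF(2))
    sage: c = C.random_element()
    sage: e = vector(GF(2), [1]*3 + [0]*(C.length() - 3))
    sage: LeeBrickell(c + e, C, 3, 2) == c  # loops until an information set hits exactly 2 error positions
    True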
comment:11 Changed 5 years ago by
- Reviewers set to Yann Laigle-Chapuy
- Status changed from needs_review to needs_work
You should also check if the resulting word is in the code (it might not be the case if you try to decode beyond the minimum distance).
comment:12 Changed 5 years ago by
In case you want to optimize this a bit, you can replace the `itertools` enumerations with `sage.combinat.gray_codes` stuff (and if the field is `GF(2)`, get rid of the `itertools.product`).
This is probably better kept for another ticket.
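For concreteness, a minimal sketch of that idea, assuming Sage's `sage.combinat.gray_codes.combinations` iterator (which starts from the subset `{0, ..., p-1}` and yields one `(removed, added)` swap per step), so that consecutive weight-`p` supports differ in a single position:

    from sage.combinat.gray_codes import combinations as gray_combinations

    def weight_p_supports(k, p):
        # Enumerate all size-p subsets of range(k) in revolving-door order,
        # changing exactly one element per step.
        S = set(range(p))
        yield sorted(S)
        for out, new in gray_combinations(k, p):
            S.remove(out)
            S.add(new)
            yield sorted(S)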
comment:13 Changed 4 years ago by
- Branch changed from u/dlucas/information_set_decoding to u/jsrn/information_set_decoding
comment:14 follow-up: ↓ 19 Changed 4 years ago by
- Cc jlavauzelle added
- Commit changed from 93066b139f579636a913f934e51046e7753cba8a to 25b2b9329104db756a02ae5ab52685e4ac853048
I've picked up this patch again: I've rebased it and polished it off but so far haven't changed the algorithm. This is not up for review yet!
Thanks, Yann, for your excellent suggestions, which are the next thing I'll look into. I'm sorry you didn't get any response for so long.
I've changed the following things so far:

- Improved some of the documentation.
- Changed the doctests to use a specific code (Golay) instead of random ones.
- Changed the calling convention so `window_size` is optional (I have yet to compute a sensible default value, though).
- Improved the error message from `AbstractLinearCode.encode/decode` when not satisfying the decoder's constructor. This came up since the information set decoder still has 1 mandatory argument.
- Fixed the new decoder's types (according to the now better documented types of #20001) and added some appropriate new ones.
- Fixed various errors in `sage/coding` that came up: some bad printing in Golay codes, removed the old deprecated method `LinearCode.decode`.

My next step is to look at Yann's suggestions for improving the algorithm and its efficiency.
New commits:
3b41dd5 | Rebase on 8.0.beta8. Roll back mods outside linear_code.py and thematic tutorial
0f8bdb3 | Went over documentation and simplified calling convention
57cf6b6 | Fixed Golay code Finite Field printing
cf556ab | Fix doctests
d021e1b | Remove `AbstractLinearCode.decode` which has been deprecated 18 months
43c229c | Helpful error messages for encoder/decoder construction with args
25b2b93 | decoder types: for new decoder, remove "unique", and fix some descriptions
comment:15 follow-up: ↓ 16 Changed 4 years ago by
No problem for the delay, I'm far from following this closely :)
A side remark though: `_registered_decoders` is a `dict` and `decoders_available` returns its keys, so the ordering is not reliable for doctests. I would advocate always returning a sorted list, as it's not a time-critical method.
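A minimal sketch of the suggested change (my illustration; the real method body may differ):

    def decoders_available(self):
        r"""Return a sorted list of the names of the decoders available for this code."""
        return sorted(self._registered_decoders)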
comment:16 in reply to: ↑ 15 Changed 4 years ago by
Replying to ylchapuy:

> No problem for the delay, I'm far from following this closely :)
> A side remark though: `_registered_decoders` is a `dict` and `decoders_available` returns its keys, so the ordering is not reliable for doctests. I would advocate always returning a sorted list, as it's not a time-critical method.
Good point - I'll fix this while we're at it.
comment:17 Changed 4 years ago by
- Commit changed from 25b2b9329104db756a02ae5ab52685e4ac853048 to 7551c478b8d37882dd2a12e93dc2bc55d5ff89c5
Branch pushed to git repo; I updated commit sha1. New commits:
7551c47 | Sort all doc-tests for decoders/encoders_available
comment:18 follow-up: ↓ 20 Changed 4 years ago by
Hi Johan,
I had a quick look at this ticket. For the moment, I didn't review the algorithm but I have a few remarks on the rest of the code.
While running doctests, I obtained 3 errors:

- in golay_code.py, line 169, a bracket is missing in the doctest
- in decoder.py, lines 105 and 113. That's because the type `unique` has been removed from the decoders' types of `LinearCodeSyndromeDecoder` and `LinearCodeNearestNeighborDecoder` (see in linear_code.py) but not from the doctest.

In fact, you seem to have removed the type `unique` from every `decoder_type`; is this because you consider that a decoder is a unique decoder as long as it isn't claimed not to be?

Besides, I have a more structural remark. Starting from Prange's algorithm, there are many variants of the information set decoding algorithm (Lee-Brickell, birthday decoding, Stern-Dumer, MMT, BJMM), the last ones being quite recent. The name `InformationSetDecoder` implicitly assumes we're planning to implement only one of them (Lee-Brickell). The question is: if someone wants to implement another variant, what should he do? I think there are two solutions:

- the class `InformationSetDecoder` will contain all the implemented variants of Prange's algorithm (that is, for the moment, only Lee-Brickell). If another algorithm appears in Sage, then it will be callable through an optional argument of the constructor, e.g. `__init__(algo="LeeBrickell", **kwargs)`.
- We define a class per algorithm, and `InformationSetDecoder` is a pointer to `LeeBrickellISD`.

In my humble opinion the second one is better (it is what we did for the various decoders of GRS codes), but I do not have time to rework the code during the following weeks.

Julien
comment:19 in reply to: ↑ 14 Changed 4 years ago by
Replying to jsrn:

> - Changed the doctests to use a specific code (Golay) instead of random ones.

It would be good to continue to test also non-binary codes (because we can!)

> - Fixed various errors in `sage/coding` that came up: some bad printing in Golay codes, removed the old deprecated method `LinearCode.decode`.

I understand it's easier, but isn't it best practice to have self-contained tickets?
comment:20 in reply to: ↑ 18 Changed 4 years ago by
Hi Julien,
Thanks for the feedback.
Replying to jlavauzelle:

> While running doctests, I obtained 3 errors:

Thanks, stupid mistakes. I'll fix it presently.

> In fact, you seem to have removed the type `unique` from every `decoder_type`; is this because you consider that a decoder is a unique decoder as long as it isn't claimed not to be?

Yes. This finally became clear in #20001 when the types were documented better, though it was forgotten in #20001 to actually go over all decoders again and fix their types.

> Besides, I have a more structural remark.

That's a good point. I would leave it to posterity to decide this. My own opinion is that a single class with an `algorithm` flag is sufficient, since all the decoders have exactly the same overall behaviour (same types) and only differ in speed, in some sense. Several of the decoders for GRS codes are of very different types.
If someone comes along and implements, say, Stern-Dumer, the choice can easily be made at that point.

Replying to ylchapuy:

> > Changed the doctests to use a specific code (Golay) instead of random ones.
> It would be good to continue to test also non-binary codes (because we can!)

Good point!

> > Fixed various errors in sage/coding that came up: some bad printing in Golay codes, removed the old deprecated method LinearCode.decode.
> I understand it's easier, but isn't it best practice to have self-contained tickets?

Yes, it's best practice. My experience with the trac ticket-and-review system, however, is that if best practice is followed completely, then trivial mistakes will never get fixed. I'm tired of coming across small mistakes here and there while editing code and not being able to fix them, and then dealing with ticket dependencies etc. My impression is that many of the very active developers on SageMath do the same.
Removing the deprecated function is perhaps debatable, though.
comment:21 Changed 4 years ago by
Regarding the algorithm, it *really* should be corrected. Try e.g.:
    sage: C = codes.random_linear_code(GF(3), 24, 10)
    sage: c = C.random_element()
    sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 4)
    sage: y = Chan(c)
    sage: %time LeeBrickell(y, C, 4, 2) == c
    CPU times: user 16 ms, sys: 4 ms, total: 20 ms
    Wall time: 20.5 ms
    True
    sage: D = C.decoder("InformationSet", 4)
    sage: %time D.decode_to_code(y)
    <... still waiting ... >
comment:22 Changed 4 years ago by
- Commit changed from 7551c478b8d37882dd2a12e93dc2bc55d5ff89c5 to 46b0907b4a8a196dcc1dc8d0cdbe7a26b51bac37
comment:23 Changed 4 years ago by
Yep, the algorithm is broken, which I've now seen trying it on a `GF(3)` Golay code. I've pushed the example, but this obviously still needs work.
As I mentioned earlier, going over the algorithm was the next order of business anyway.
comment:24 Changed 4 years ago by
- Commit changed from 46b0907b4a8a196dcc1dc8d0cdbe7a26b51bac37 to a81785a709f1303b381535abd2208e73f2e97118
Branch pushed to git repo; I updated commit sha1. New commits:
a81785a | Use ylchapuy's implementation of Lee-Brickell
comment:25 Changed 4 years ago by
I've replaced the algorithm with Yann's implementation from comment 10, with only trivial modifications. This one works on the `GF(3)` doctest. I still want to do more testing and timing, so this is not up for review yet.
comment:26 Changed 4 years ago by
Note that I opened #23244 because I plan to implement Stern-Dumer.
comment:27 follow-up: ↓ 28 Changed 4 years ago by
That's awesome!
I'm currently working on calibrating the window parameter - I find this an integral part of the Lee--Brickell algorithm because it has a huge effect on the average running time (and a user would have no idea how to set it). Unfortunately, it's a bit hit and miss when I have time to work on this, but I'll try to get it wrapped up.
comment:28 in reply to: ↑ 27 ; follow-up: ↓ 29 Changed 4 years ago by
Replying to jsrn:

> That's awesome!
> I'm currently working on calibrating the window parameter - I find this an integral part of the Lee-Brickell algorithm because it has a huge effect on the average running time (and a user would have no idea how to set it). Unfortunately, it's a bit hit and miss when I have time to work on this, but I'll try to get it wrapped up.

First, I don't understand why you changed `p` to `win`. There is no notion of window here to me. Every paper I saw calls this parameter `p`, and I think it would be better to stick to it, as `w` is used for the weight of the expected error.

I'll use the notation `n`, `k` for the length and dimension, `w` for the weight of the error and `p` for the parameter of the algorithm.

Regarding the choice of this `p` parameter: first, the analysis is usually made for constant error weight correction (when `tau` is a singleton) because it's easier. I'll do that too. Moreover, the analysis is usually done with a fixed `p` (I mean no loop as in your line 5303), see e.g.
http://fms2.cerimes.fr/vod/media/canalu/documents/fuscia/3.5.lee.and.brickell.algorithm_32885/c015im.w3.s5.v3.pdf for the binary case. But I think it's worth doing the loop.
The expected cost `C` is `C_I / P_I`, where `C_I` is the cost of one iteration (from line 5289) and `P_I` the probability of success of that iteration (the iterations are independent because we choose `I` at random, so `P_I` is a constant).

`C_I` is `G + x * V`, with `G` the cost of steps 1 and 2 (mostly a Gaussian elimination), `x` the number of error vectors `e` we compute, and `V` the cost to compute each of them (this depends on the associated `p` in the current implementation, but could be made constant if we later use e.g. Gray codes to compute the sum).

Each vector `e` we compute increases `P_I` by a small amount `E` which depends on the current `p` (the bigger the `p`, the smaller the gain).

Overall, the cost function is piecewise defined:

    C(x) = (G + x*V) / (P(p) + x*E(p))

with `P(p)` the success probability if we stop before `p`, i.e. `P(p) = sum_{i=0}^{p-1} E(i) * binomial(k,i) * (q-1)**i`.

The sign of the derivative depends only on `G*E(p) - V*P(p)`.

`G` and `V` are implementation dependent, but you can estimate them; `E` and `P` can be computed. You could do this computation when you construct the decoder to choose `p` (it will be fast because `p` will never get bigger than 3, I think).

I did my best but I'm not certain my explanations are crystal clear! Does this make sense to you?
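For concreteness, here is a minimal sketch of that parameter selection, assuming a single error weight `w`, the formula for `E(p)` given in comment 31 below, and externally estimated per-iteration costs `G` and `V` (all names follow the comment above; this is my illustration, not the ticket's code):

    from sage.all import binomial

    def E(n, k, q, w, p):
        # Probability that a single candidate vector at level p is the error.
        return binomial(n - k, w - p) / (binomial(n, w) * (q - 1)**p)

    def choose_p(n, k, q, w, G, V, p_max=5):
        best, best_cost = None, None
        P, work = 0, G  # success probability and cost of one full iteration
        for p in range(min(p_max, w) + 1):
            x = binomial(k, p) * (q - 1)**p  # vectors enumerated at level p
            P += x * E(n, k, q, w, p)
            work += x * V
            cost = work / P                  # expected total time C_I / P_I
            if best_cost is None or cost < best_cost:
                best, best_cost = p, cost
        return best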
comment:29 in reply to: ↑ 28 ; follow-ups: ↓ 30 ↓ 31 Changed 4 years ago by
Replying to ylchapuy:

> First, I don't understand why you changed `p` to `win`. There is no notion of window here to me. Every paper I saw calls this parameter `p`, and I think it would be better to stick to it, as `w` is used for the weight of the expected error.

I think that a name is more evocative than a letter, in spite of the tradition in mathematical texts. "Window" was the word David used in the original implementation, and I couldn't come up with a substantially better word. It also has a nice, clear 3-letter abbreviation. Using `w` for error weight is not more traditional than e.g. `e`, `t` or `tau` in the greater decoding literature. I changed it since it visually collides with `win`, so I used `tau` in my implementation. I believe it crops up elsewhere in `sage/coding` since I commonly use this.
> I'll use the notation `n`, `k` for the length and dimension, `w` for the weight of the error and `p` for the parameter of the algorithm. ...

Thanks. I already implemented all of this a few days ago, exactly using the equations you write. Where I left it then was at the estimates of `E` and `P`, which have to be devilishly precise: I've got `E` down well, but the code simplification I did when estimating `P` turned out to be too coarse (I believed it depended mostly only on the cost of scaling and summing vectors, but apparently the `hamming_weight` and loop structures add something). Testing this takes time because I need to run the decoder many times on a non-trivial code to be sure things work. I'll post my test code here as well, which might be useful for you in #23244.

I'll look at it again now and push what I have in any case.
comment:30 in reply to: ↑ 29 ; follow-up: ↓ 32 Changed 4 years ago by
Replying to jsrn:

> Replying to ylchapuy:
> > First, I don't understand why you changed `p` to `win`. There is no notion of window here to me. Every paper I saw calls this parameter `p`, and I think it would be better to stick to it, as `w` is used for the weight of the expected error.
> I think that a name is more evocative than a letter, in spite of the tradition in mathematical texts.

As a parameter, yes.

> "Window" was the word David used in the original implementation, and I couldn't come up with a substantially better word.

Can you explain to me what window this would be? I really don't get it.
I propose `partial_weight`; this is the number of error positions we expect in our information set (and is abbreviated as `p`).

> It also has a nice, clear 3-letter abbreviation. Using `w` for error weight is not more traditional than e.g. `e`, `t` or `tau` in the greater decoding literature. I changed it since it visually collides with `win`, so I used `tau` in my implementation. I believe it crops up elsewhere in `sage/coding` since I commonly use this.

You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper; it makes life easier for anyone trying to understand what your code does.
But in the end I guess it's your call, I won't fight about this.
comment:31 in reply to: ↑ 29 Changed 4 years ago by
Replying to jsrn:

> Replying to ylchapuy:
> > I'll use the notation `n`, `k` for the length and dimension, `w` for the weight of the error and `p` for the parameter of the algorithm. ...
> Thanks. I already implemented all of this a few days ago, exactly using the equations you write. Where I left it then was at the estimates of `E` and `P`, which have to be devilishly precise: I've got `E` down well, but the code simplification I did when estimating `P` turned out to be too coarse (I believed it depended mostly only on the cost of scaling and summing vectors, but apparently the `hamming_weight` and loop structures add something).

`E(p)` should be `binomial(n-k, w-p) / (binomial(n,w) * (q-1)**p)`, and I wrote the formula for `P(p)`.
I guess you mean estimating `G` and `V`. They don't need to be that precise. If they're a bit wrong, the final cost estimate will only be a bit wrong; nothing catastrophic should happen.
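As a quick sanity check of these formulas (my addition, not from the discussion): summing the per-vector success probability over all levels must give exactly 1, since the error always places some number `p` of its positions inside the information set; this is just Vandermonde's identity:

    from sage.all import binomial

    def check_E_sums_to_one(n, k, q, w):
        # (#vectors at level p) * E(p) summed over p = 0..w equals
        # sum_p binomial(k,p)*binomial(n-k,w-p) / binomial(n,w) = 1.
        total = sum(binomial(k, p) * (q - 1)**p
                    * binomial(n - k, w - p) / (binomial(n, w) * (q - 1)**p)
                    for p in range(w + 1))
        assert total == 1

    check_E_sums_to_one(24, 12, 3, 4)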
> Testing this takes time because I need to run the decoder many times on a non-trivial code to be sure things work. I'll post my test code here as well, which might be useful for you in #23244.

I don't want to push you, I only like to write down my thoughts; you may ignore them. I know this takes time and we all have other things to do!
comment:32 in reply to: ↑ 30 ; follow-up: ↓ 33 Changed 4 years ago by
Replying to ylchapuy:

> I propose `partial_weight`; this is the number of error positions we expect in our information set (and is abbreviated as `p`).
> You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper; it makes life easier for anyone trying to understand what your code does.

That's a valid reason. The cited paper is not the original contribution and the notation looked ad hoc to me, so I didn't heed it in this instance. But I'm not well-versed in ISD literature, and if you say the parameter is often `p`, that's a stronger argument. Did you come up with `partial_weight` or did you see that somewhere? It sounds bland to me, but it does share the `p`.
comment:33 in reply to: ↑ 32 Changed 4 years ago by
Replying to jsrn:

> Replying to ylchapuy:
> > I propose `partial_weight`; this is the number of error positions we expect in our information set (and is abbreviated as `p`).
> > You cite a paper in the doctest. I always find it nicer to use the same notations in the implementation as in the paper; it makes life easier for anyone trying to understand what your code does.
> That's a valid reason. The cited paper is not the original contribution and the notation looked ad hoc to me, so I didn't heed it in this instance. But I'm not well-versed in ISD literature, and if you say the parameter is often `p`, that's a stronger argument.

It's not used in the original publication (there it's `j`), but it's been used at least since Stern's paper in 1989, in all the papers proposing improvements (and so is `w` for the expected weight of the error).

> Did you come up with `partial_weight` or did you see that somewhere? It sounds bland to me, but it does share the `p`.

I just came up with this, but I agree it's not particularly good. I don't know any wordy definition of this parameter.
comment:34 Changed 4 years ago by
- Commit changed from a81785a709f1303b381535abd2208e73f2e97118 to 476369c26dde394949e074270181b4fa7a652bd8
comment:35 Changed 4 years ago by
I've pushed an attempt at writing the calibration of the search size parameter (which is what I call it now, abbreviated `p`). I monkeyed around for a very long while thinking it didn't work, but I must have confused myself terribly, because when I tried to clean everything up in order to write this report on my progress, it suddenly doesn't seem too bad. I believe one gotcha that confounded me was the inclusion of a `rho` factor: the fraction of size-`k` position sets that are information sets.
Anyway, here are some timings:
    [24, 12, 8] Extended Golay code over GF(2)
            Estimated               Actual
    p=0     0.00294093602195062     0.00370841026306
    p=1     0.000733640004873450    0.00105549812317
    p=2     0.000779351222057502    0.000777833461761
    p=3     0.00202210000281411     0.00085108757019

    [12, 6, 6] Extended Golay code over GF(3)
            Estimated               Actual
    p=0     0.00107591199443414     0.00152409553528
    p=1     0.000475508233346987    0.000670518875122
    p=2     0.000837919998575671    0.000634233951569

    [255, 231] BCH Code over GF(2) with designed distance 7
            Estimated               Actual
    p=0     8.27393653065493        10.0903596711
    p=1     0.303782262811033       0.38420484066
    p=2     0.613038331151469       0.627587425709
    p=3     13.5825719377659        6.28117632627

    [255, 139] BCH Code over GF(2) with designed distance 31
            Estimated               Actual

    [255, 47] BCH Code over GF(2) with designed distance 81
            Estimated               Actual
    p=0     7.08824734720053        6.80084561825
    p=1     0.730343868369564       0.871155004501
    p=2     0.776726300346984       0.963895459175
    p=3     3.16273106617629        4.46240173101
    p=4     14.6264398618229        16.1132382083

    [80, 56] BCH Code over GF(3) with designed distance 9
            Estimated               Actual
    p=0     0.652945537606786       0.400583426952
    p=1     0.104725869935881       0.0847134900093
    p=2     0.651413542048857       0.519779400826
    p=3     11.2044949381990        5.3165213418

    [80, 15] BCH Code over GF(3) with designed distance 41
            Estimated               Actual
    p=0     0.0524764488533255      0.0787803530693
    p=1     0.0235392606723455      0.0252663660049
    p=2     0.0786084509854239      0.0758757901192
    p=3     0.340890763500404       0.248177850246
    p=4     1.46166863051819        0.734001851082
    p=5     5.69338143224234        1.27127592802
    p=6     19.6762218536446        2.85639382839

    [124, 94] BCH Code over GF(5) with designed distance 13
            Estimated               Actual
    p=0     51.8959970988144        (25.518362093 last time, skipped here)
    p=1     8.16363820890485        13.5431237435
Here follows some of my test code:
    def test_decoder(D, reps=100):
        r"""Return the time it takes to run the decoder on average, with maximal number of errors."""
        C = D.code()
        tau = D.decoding_radius()
        Chan = channels.StaticErrorRateChannel(C.ambient_space(), tau)
        tsum = 0
        for i in range(reps):
            c = C.random_element()
            r = Chan(c)
            before = time.clock()
            cout = D.decode_to_code(r)
            after = time.clock()
            assert c == cout, "decoding error"
            tsum += after - before
        return tsum/reps

    from sage.coding.linear_code import LinearCodeInformationSetDecoder
    test_codes = [
        ( 3, codes.GolayCode(GF(2))),
        ( 2, codes.GolayCode(GF(3))),
        ( 3, codes.BCHCode(GF(2), 2^8-1, 7)),
        (15, codes.BCHCode(GF(2), 2^8-1, 31)),
        (40, codes.BCHCode(GF(2), 2^8-1, 81)),
        ( 4, codes.BCHCode(GF(3), 3^4-1, 9)),
        (20, codes.BCHCode(GF(3), 3^4-1, 41)),
        ( 6, codes.BCHCode(GF(5), 5^3-1, 13))
    ]
    for (tau, C) in test_codes:
        print C
        D = LinearCodeInformationSetDecoder(C, tau)
        estimates = D._calibrate_search_size()
        print "\t\tEstimated\t\tActual"
        for p in range(0, tau+1):
            if estimates[p] < 30:
                D._search_size = p
                avg = test_decoder(D, reps=5)
            else:
                avg = "skipped"
            print "\tp=%s\t%s\t%s" % (p, estimates[p], avg)
        print

    def test_iters(D):
        r"""Return the avg number of iterations taken by the decoder, including
        those where a non-information set is picked. This requires debug code
        inserted to count `self._iters` in the decoder.
        """
        N = 101
        iters = []
        for i in range(N):
            test_decoder(D, reps=1)
            iters.append(D._iters)
        print median(iters), 1. * sum(iters) / N

    def infoset_density(C, reps=1000):
        r"""Estimate density of information sets"""
        s = 0
        G = C.generator_matrix()
        n, k = C.length(), C.dimension()
        for N in range(reps):
            I = sample(range(n), k)
            Gi = G.matrix_from_columns(I)
            try:
                Gi_inv = Gi.inverse()
                s += 1
            except ZeroDivisionError:
                pass
        return 1. * s / (N+1)

    for _, C in test_codes:
        print "%s\t%s" % (C, infoset_density(C))
New commits:
d6f69b8 | Rename number_errors() method to decoding_interval() and decoding_radius() method
a074919 | Calibrate window parameter, take 1
7ab7277 | Calibrate window parameter, take 2
476369c | Rename window to search_size, win -> p, w -> p or pi
comment:36 follow-up: ↓ 37 Changed 4 years ago by
Great, I'll take a closer look later. Here are some preliminary remarks:

- you shouldn't worry about `rho`; consider as an iteration only those with a valid information set. It's already what you measure with `time_information_set_steps`, though I would take the mean and not the median here.
- beware, `_search_size` is initialized to `None`, which leads to trouble. You should just leave it uninitialized.
comment:37 in reply to: ↑ 36 ; follow-up: ↓ 39 Changed 4 years ago by
I updated the timings after a longer run with more iterations.

Replying to ylchapuy:

> Great, I'll take a closer look later. Here are some preliminary remarks:
> - you shouldn't worry about `rho`; consider as an iteration only those with a valid information set. It's already what you measure with `time_information_set_steps`, though I would take the mean and not the median here.

I disagree: it changes the balance between the inversion step (done always) and the search step (done a `rho` fraction of the time). Perhaps it's not paramount, but I've already implemented it, and it makes debugging nice since the number of iterations matches extremely well.

> - beware, `_search_size` is initialized to `None`, which leads to trouble. You should just leave it uninitialized.

Thanks, that's a bug: I put the call to calibration in `search_size()`, which is only called if `_search_size` is unset.
comment:38 Changed 4 years ago by
- Commit changed from 476369c26dde394949e074270181b4fa7a652bd8 to fa7240261271df724b145d6a67433cb007c143e3
comment:39 in reply to: ↑ 37 ; follow-up: ↓ 40 Changed 4 years ago by
Replying to jsrn:
> I updated the timings after a longer run with more iterations.
> Replying to ylchapuy:
> > Great, I'll take a closer look later. Here are some preliminary remarks:
> > - you shouldn't worry about `rho`; consider as an iteration only those with a valid information set. It's already what you measure with `time_information_set_steps`, though I would take the mean and not the median here.
> I disagree: it changes the balance between the inversion step (done always) and the search step (done a `rho` fraction of the time). Perhaps it's not paramount, but I've already implemented it, and it makes debugging nice since the number of iterations matches extremely well.

I disagree as well! What `time_information_set_steps` measures is the time to get a valid information set. The `while True` is included in it.
As is, your value `T` is off by a factor `rho`.
comment:40 in reply to: ↑ 39 Changed 4 years ago by
Replying to ylchapuy:

> I disagree as well! What `time_information_set_steps` measures is the time to get a valid information set. The `while True` is included in it. As is, your value `T` is off by a factor `rho`.

Oh yes, now I see, you're right. And changing the medians to means, it will probably be precise enough. The alternative is to let `time_information_set_steps` measure only the time once an information set is found. If e.g. `q = 7`, then the probability of hitting an information set (on a random code) is roughly `5/6`, so there will be a fair chance that each of the 5 runs of `time_information_set_steps` hits an information set on the first guess - and then we will have underestimated the average time by a factor `6/5`. Which likely doesn't matter at all.
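For intuition (my numbers, not from the ticket): the probability that a uniformly random `k x k` matrix over `GF(q)` is invertible - which, for a random code, is essentially the chance that a random size-`k` position set is an information set - can be computed directly:

    from sage.all import prod, RR

    def invertible_fraction(q, k):
        # prod_{i=1}^{k} (1 - q^(-i)); for q = 7 this is about 0.836
        # already at moderate k, close to the 5/6 quoted above.
        return prod(1 - RR(q)**(-i) for i in range(1, k + 1))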
comment:41 Changed 4 years ago by
- Commit changed from fa7240261271df724b145d6a67433cb007c143e3 to ea5c36c00c1043ee72ca2837da96e404bf4c8907
Branch pushed to git repo; I updated commit sha1. New commits:
ea5c36c | rho is already included. Mean instead of median
comment:42 Changed 4 years ago by
I removed the `rho` factor and updated the documentation.
comment:43 Changed 4 years ago by
This looks nicer. Another thing that bothers me is the `search_size` method.
This parameter is (sort of) specific to the Lee-Brickell algorithm, not to ISD in general. In preparation for other variants becoming available (e.g. Stern-Dumer), I would prefer to hide this from the user.
I would either hide it behind an "underscored" method, or provide a more general one, e.g.

    def parameters(self):
        r"""
        Return the algorithm and parameters used.
        """
        return ("LeeBrickell", self._search_size)

or

    def parameters(self):
        r"""
        Return the parameters selected for each available algorithm.
        """
        return { "LeeBrickell": self._search_size }

or anything better you can think of.
comment:44 follow-up: ↓ 45 Changed 4 years ago by
This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ? I had proposed to defer this question till it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all in the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

Your suggestion would allow for the necessary flexibility, but is essentially an indirection put in place to avoid subclassing. I don't find it particularly elegant, and it would be surprising to users, I think. Granted, a typical user wouldn't care about the parameters.

Consider instead having a class for each, taking and naming their parameters as natural to each, and all subclassing an `AbstractInformationSetDecoder`. Then we make a function:

    def LinearCodeInformationSetDecoder(code, number_errors, algorithm=None, *args, **kwargs):
        if algorithm == None:
            algorithm = "LeeBrickell" if code.dimension() < 10 else "SternDumer"
        if algorithm == "LeeBrickell":
            return LinearCodeISLeeBrickellDecoder(code, number_errors, *args, **kwargs)
        ...

Preferably improved with a `try ... except` which gives a nicer error message in case `args, kwargs` are not proper. Probably, supplying `args` or `kwargs` should also be disallowed if `algorithm` is not explicitly set, in order to avoid user code silently breaking in case we sometime decide to change the automatic selection of `algorithm`.
comment:45 in reply to: ↑ 44 Changed 4 years ago by
Replying to jsrn:

> This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ? I had proposed to defer this question till it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all in the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

I agree it's not elegant. My aim was to make this ticket as easy to close as possible. I don't want to make everything perfect, I just don't want to provide user functions we would have to deprecate in the future. As long as we have only underscored methods, it's not a problem to me, and we can decide later what the best way to deal with this is.

If you want to decide this now, I also think some general factory `LinearCodeInformationSetDecoder` choosing the best implementation at hand for those parameters, plus specialized classes for users who know what they are doing, seems the way to go for me.
comment:46 Changed 4 years ago by
- Commit changed from ea5c36c00c1043ee72ca2837da96e404bf4c8907 to 5463b1d86c8b813f3c608c69d2af9674eb7c392f
comment:47 follow-up: ↓ 48 Changed 4 years ago by
I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to be getting lengthy, I've moved it to a new file.
Doctests pass, but I'm having trouble with the documentation. I haven't inspected the compiled HTML doc yet because I got stuck attempting to make a proper hyperlink for `sage.coding.isd_decoding.LinearCodeInformationSetDecoder` in the module doc of `sage.coding.decoder_catalog`. I'll look at it again soon.
comment:48 in reply to: ↑ 47 ; follow-up: ↓ 49 Changed 4 years ago by
Replying to jsrn:

> I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to be getting lengthy, I've moved it to a new file.

Great, I agree it's a nice thing to put it in a separate file. Though the `d` in `isd` is for "decoding", so `isd_decoding` is a bit strange.
I look forward to reviewing this when it's ready; thanks for working on this!
comment:49 in reply to: ↑ 48 ; follow-up: ↓ 50 Changed 4 years ago by
> Great, I agree it's a nice thing to put it in a separate file. Though the `d` in `isd` is for "decoding", so `isd_decoding` is a bit strange.

True. But `isd` is a very bland abbreviation, and I suspect it will be hard for most people to look at the file name and have an idea of its contents. And `is_decoding` is weird. Do you have a suggestion?

Similarly, I had great difficulty coming up with a proper name for the class `LinearCodeISD_LeeBrickell`. All other decoder classes end with `Decoder`. But `LinearCodeInformationSetLeeBrickellDecoder` is crazy long. And `LinearCodeISLeeBrickellDecoder` reads weirdly since `IS` doesn't remind one of `ISD`. And I wanted all ISD decoders to tab-complete similarly, so `LinearCodeLeeBrickellISDecoder` was out.
comment:50 in reply to: ↑ 49 Changed 4 years ago by
Replying to jsrn:

> > Great, I agree it's a nice thing to put it in a separate file. Though the `d` in `isd` is for "decoding", so `isd_decoding` is a bit strange.
> True. But `isd` is a very bland abbreviation, and I suspect it will be hard for most people to look at the file name and have an idea of its contents. And `is_decoding` is weird. Do you have a suggestion?

Why not `information_set_decoding`? We have tab completion.

> Similarly, I had great difficulty coming up with a proper name for the class `LinearCodeISD_LeeBrickell`. All other decoder classes end with `Decoder`. But `LinearCodeInformationSetLeeBrickellDecoder` is crazy long. And `LinearCodeISLeeBrickellDecoder` reads weirdly since `IS` doesn't remind one of `ISD`. And I wanted all ISD decoders to tab-complete similarly, so `LinearCodeLeeBrickellISDecoder` was out.

Hmm, I see. What would you think about having only one full-fledged decoder, `LinearCodeInformationSetDecoder`, and lightweight classes for the algorithms? E.g. (just sketching a solution):

    class DecodingAlgorithm:
        """
        Just providing an interface. I can use this bland name because it
        would in fact be sage.coding.information_set_decoding.DecodingAlgorithm
        """
        @staticmethod
        def decode(G, y, **kargs):
            """
            kargs depends on the actual algorithm
            """
            raise NotImplementedError

        @staticmethod
        def cost_estimate(C, **kargs):
            """
            Return ETA in seconds for the given code and parameters
            """
            raise NotImplementedError

        @staticmethod
        def calibrate_for(C):
            """
            Return the best parameters found for this code, as a dict.
            You can thus do:
                A.decode(C.generator_matrix(), y, **A.calibrate_for(C))
            """
            raise NotImplementedError

    class AlgorithmLeeBrickell(DecodingAlgorithm):
        pass  # Todo

    class AlgorithmSternDumer(DecodingAlgorithm):
        pass  # Todo

    # etc.
comment:51 Changed 4 years ago by
- Commit changed from 5463b1d86c8b813f3c608c69d2af9674eb7c392f to f68ea694a888638a1a839bd429a84ace1bb3a12e
comment:52 Changed 4 years ago by
Finally got back to it again. I had originally considered lightweight algorithm classes but rejected the idea, since the surrounding ISD class would be almost trivial. However, reconsidering, I think it actually makes for a slightly nicer design, user-wise: we avoid a decoder factory (which won't behave like a decoder class, which might surprise the user), it sidesteps the clumsy naming, and it becomes more transparent for a user how to make new ISD algorithms.

So I've implemented your suggestion. I went for real, instantiated Algorithm objects. This makes the algorithms more accessible to the user, and making things static would also pose problems with `calibrate` and `time_estimate`, which would either both have to actually run the calibration, or write to static class variables, or something.

As usual this took 30 minutes, and then 1.5 hours to redo the docstrings.
comment:53 Changed 4 years ago by
- Status changed from needs_work to needs_review
Needs review. I've added Yann as coauthor - he definitely deserves that!
comment:54 follow-up: ↓ 56 Changed 4 years ago by
First review pass.

To begin with:

- why give the copyright to William Stein? The format is also a bit off (see http://doc.sagemath.org/html/en/developer/coding_basics.html#headings-of-sage-library-code-files)
- you should import as much as possible from `sage.all` instead of hardcoding the whole path (e.g. `sage.modules.free_module_element`); it's more future-proof if things ever move around.
- I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")
- if you ask for `time_estimate` with user-provided parameters, it currently does the calibration and changes the parameters. Is this expected? In my mind, `calibrate` calls `time_estimate` with different parameters to choose the best one.

For the algorithm itself:

- the description of Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than `p`
- in step 2, you should check whether `y` has low weight. It actually has a good chance to be! (and you take it into account in your probability computation)
- you could use (but this doesn't change much)

      for gA in itertools.combinations(g, pi):
          for m in itertools.product(Fstar, repeat=pi):
              e = y - sum(mi*gAi for mi, gAi in zip(m, gA))

- prefer `... is not None` to `not ... is None`

For the decoder:

- in `__init__` you doctest only some exceptions and not all of them
- you also don't doctest a user-defined algorithm (as allowed by your test `isinstance(algorithm, InformationSetAlgorithm)`)
- in `known_algorithms` I would make `d` a class variable `_known_algorithms`

And finally, I would test this on bigger examples, e.g. `codes.random_linear_code(GF(9), 40, 20)` and a weight-6 error, or whatever you like; I find this better since this is the only available option for such codes (you can't reasonably compute the coset leaders, for example).

As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor 2).
comment:55 Changed 4 years ago by
- Commit changed from f68ea694a888638a1a839bd429a84ace1bb3a12e to c6ddf4318951e5619d5f0dfa1b91ddfb6e8a9f3e
Branch pushed to git repo; I updated commit sha1. New commits:
ee4d586 | Merge branch 'u/jsrn/information_set_decoding' of git://trac.sagemath.org/sage into 20138_information_set_decoding
ed4bfd7 | copyright and import statements
a4e7f49 | time_estimate() doesn't poop on selected parameters
339c383 | Fixed bugs and added doctests pointed out by reviewer.
0ea73d6 | Make _known_algorithms a class field
08a9e50 | Added a big example/doctest. Polished some existing ones.
c6ddf43 | Added a test for the precision of calibration
comment:56 in reply to: ↑ 54 ; follow-up: ↓ 57 Changed 4 years ago by
Thanks for the speedy review. Here is my less speedy reply.

Replying to ylchapuy:

> - why give the copyright to William Stein? The format is also a bit off (see http://doc.sagemath.org/html/en/developer/coding_basics.html#headings-of-sage-library-code-files)

I hadn't looked up those rules, to be honest. I added our three names. I don't know your email address, though.

> - you should import as much as possible from `sage.all` instead of hardcoding the whole path (e.g. `sage.modules.free_module_element`); it's more future-proof if things ever move around.

Didn't know that was possible. It's not used in many places in Sage either. I don't see any drawbacks though, so I've made the change.

> - I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.
There was a discussion on sage-devel on this recently. There was no clear consensus, but many seemed to favour the "of this code" style.
I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

> - if you ask for `time_estimate` with user-provided parameters, it currently does the calibration and changes the parameters. Is this expected? In my mind, `calibrate` calls `time_estimate` with different parameters to choose the best one.

Good catch: `time_estimate` of course shouldn't overwrite selected parameters. I've fixed that.

> - the description of Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than `p`

No, we let `pi` go from `0` to `p`, both inclusive. What differs is that I, for brevity, just say we let the coefficients roam through `F^p`, while we actually go through `F*^pi` and let `pi` go from 0 to `p`. This is to avoid repeating the elements with multiple 0s in the first version.

> - in step 2, you should check whether `y` has low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

This is done in the iteration `pi = 0`.

> - you could use (but this doesn't change much)
>
>       for gA in itertools.combinations(g, pi):
>           for m in itertools.product(Fstar, repeat=pi):
>               e = y - sum(mi*gAi for mi, gAi in zip(m, gA))

Yes I could, but I don't consider it more elegant. If I were writing in OCaml I'd do it in the above way, obviously, but not in Python.

> - prefer `... is not None` to `not ... is None`

OK

> For the decoder:
> - in `__init__` you doctest only some exceptions and not all of them

I think all `__init__`s and class docs now have the appropriate doctests.

> - you also don't doctest a user-defined algorithm (as allowed by your test `isinstance(algorithm, InformationSetAlgorithm)`)

Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that `InformationSetAlgorithm` didn't hash, which meant that `AbstractLinearCode.encoder` crashed since it tries to hash any arguments. This is potentially a problem with `AbstractLinearCode.encoder` which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for `InformationSetAlgorithm` (a sketch follows below).
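A minimal sketch of such a hash (my illustration; the attribute accessors are assumptions, not taken from the patch):

    class InformationSetAlgorithm(object):
        # ... rest of the class ...
        def __hash__(self):
            # Hash on the defining data, so equal algorithms hash equally and
            # the algorithm can be passed through AbstractLinearCode.encoder.
            return hash((self.code(), self.decoding_interval()))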
The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply `number_errors` to `InformationSetDecoder`, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply both `number_errors` and `algorithm`, and `number_errors` must match `algorithm.decoding_interval()`. My argument is that this use case is very rare and for special developer-users; allowing the shortcut of only supplying `algorithm` would confuse the documentation and mess up the constructor.

> - in `known_algorithms` I would make `d` a class variable `_known_algorithms`

No problem.

> And finally, I would test this on bigger examples, e.g. `codes.random_linear_code(GF(9), 40, 20)` and a weight-6 error, or whatever you like; I find this better since this is the only available option for such codes (you can't reasonably compute the coset leaders, for example).

I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the `[59, 30, 17]` QR code over `GF(3)`, for which syndrome decoding or nearest-neighbor decoding is infeasible. The current information set decoding, on the other hand, is extremely fast on it. I've flagged the test as long, though it takes less than 0.3 s.
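A hypothetical session along those lines (illustrative only; the actual doctest in the patch may differ):

    sage: C = codes.QuadraticResidueCode(59, GF(3))   # a [59, 30, 17] code
    sage: D = C.decoder('InformationSet', 8)          # up to 8 errors, within half the minimum distance
    sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 8)
    sage: c = C.random_element()
    sage: D.decode_to_code(Chan(c)) == c
    True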
> As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor 2).

A random doctest should only fail with extremely low probability, otherwise it will cause headaches for other Sage devs and the buildbot. The decoding time for information set decoding is roughly exponentially distributed and has a huge variance. Thus we must average a large number of decoding trials and test this against the calibrated value.

I've implemented such a test for a medium-sized code, where we do 100 iterations, and I test that the elapsed time is within a factor 5 of the calibrated one. I did a 1-hour trial on my own computer with only light background processes: a factor 2 off the calibrated time occurred after 400 trials, while a factor 3 off occurred after 5000 trials:
    from sage.coding.information_set_decoder import LeeBrickellISDAlgorithm
    C = codes.QuadraticResidueCode(37, GF(3))
    Chan = channels.StaticErrorRateChannel(C.ambient_space(), 4)
    A = LeeBrickellISDAlgorithm(C, (0,4))
    A.calibrate()

    def time_100():
        A.calibrate()
        import time
        zero = C.ambient_space().zero()
        before = time.clock()
        for i in range(100):
            A.decode(Chan(zero)); None
        return time.clock() - before

    smallest = 1
    biggest = 1
    i = 0
    while True:
        i += 1
        elapsed = time_100()
        frac = elapsed/A.time_estimate()/100
        if frac < smallest:
            smallest = frac
        if frac > biggest:
            biggest = frac
        print "%d: %s, %s" % (i, smallest, biggest)
My theoretical musings on these probabilities indicated that - if the calibration is sound - this should occur far less frequently, however.
Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.
Any thoughts?
(The ticket now needs review again)
comment:57 in reply to: ↑ 56 ; follow-up: ↓ 59 Changed 4 years ago by
Replying to jsrn:

> Thanks for the speedy review. Here is my less speedy reply.
> > - you should import as much as possible from `sage.all` instead of hardcoding the whole path (e.g. `sage.modules.free_module_element`); it's more future-proof if things ever move around.
> Didn't know that was possible. It's not used in many places in Sage either. I don't see any drawbacks though, so I've made the change.

I thought at some point it was the way to go, but it's not used much, as you say. I'm not so sure now, but I like it that way :)

> > - I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")
> That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.
> There was a discussion on sage-devel on this recently. There was no clear consensus, but many seemed to favour the "of this code" style.
> I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

No, keep it like that; I realized afterwards that there was no consensus on this.

> > - the description of Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than `p`
> No, we let `pi` go from `0` to `p`, both inclusive. What differs is that I, for brevity, just say we let the coefficients roam through `F^p`, while we actually go through `F*^pi` and let `pi` go from 0 to `p`. This is to avoid repeating the elements with multiple 0s in the first version.

OK, that's the problem with a speedy review; I read too fast, sorry.

> > - in step 2, you should check whether `y` has low weight. It actually has a good chance to be! (and you take it into account in your probability computation)
> This is done in the iteration `pi = 0`.

Agreed.

> > - you could use (but this doesn't change much)
> >
> >       for gA in itertools.combinations(g, pi):
> >           for m in itertools.product(Fstar, repeat=pi):
> >               e = y - sum(mi*gAi for mi, gAi in zip(m, gA))
>
> Yes I could, but I don't consider it more elegant. If I were writing in OCaml I'd do it in the above way, obviously, but not in Python.

It seems also a bit faster in my tests, but you can keep it that way if you don't like this.
> > For the decoder:
> > - in `__init__` you doctest only some exceptions and not all of them
> I think all `__init__`s and class docs now have the appropriate doctests.

Not all, e.g. `raise ValueError("All elements of number_errors have to be" ...` and `raise ValueError("The provided number of errors should be at"...` are missing, I think.

> > - you also don't doctest a user-defined algorithm (as allowed by your test `isinstance(algorithm, InformationSetAlgorithm)`)
> Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that `InformationSetAlgorithm` didn't hash, which meant that `AbstractLinearCode.encoder` crashed since it tries to hash any arguments. This is potentially a problem with `AbstractLinearCode.encoder` which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for `InformationSetAlgorithm`.
> The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply `number_errors` to `InformationSetDecoder`, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply both `number_errors` and `algorithm`, and `number_errors` must match `algorithm.decoding_interval()`. My argument is that this use case is very rare and for special developer-users; allowing the shortcut of only supplying `algorithm` would confuse the documentation and mess up the constructor.

That's one of the reasons my Algorithm classes were more lightweight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.
I think it's a better design, but I won't fight for this.
> > - in `known_algorithms` I would make `d` a class variable `_known_algorithms`
> No problem.

> > And finally, I would test this on bigger examples, e.g. `codes.random_linear_code(GF(9), 40, 20)` and a weight-6 error, or whatever you like; I find this better since this is the only available option for such codes (you can't reasonably compute the coset leaders, for example).
> I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the `[59, 30, 17]` QR code over `GF(3)`, for which syndrome decoding or nearest-neighbor decoding is infeasible. The current information set decoding, on the other hand, is extremely fast on it. I've flagged the test as long, though it takes less than 0.3 s.

You could choose a random seed such that everything goes well. But I'm fine with your QR code.

> > As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor 2).
> [...]
> Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

If it's not clear how to make it right, maybe it's better left for another ticket?

For an improvement in the calibration, I think I can improve things both in precision and clarity! As we want to get the timing empirically, we don't need to separate `time_information_set_steps` and `time_search_loop`. You could just add some options:

    def decode(self, r, max_iter=None, _calibrate=False):
        ...
        i = 0
        while i != max_iter:
            i += 1
            ...
            if not _calibrate:
                return r - e  # when we calibrate, never return

This would end up with less code duplication and a more precise estimate. Once again, feel free to leave this to another ticket if you want; it wouldn't break anything to change this later.

> Any thoughts?

> > If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword.

I would add we don't even guarantee to return a codeword!

> (The ticket now needs review again)

I still have to check the rendering of the documentation, but this gets close to a positive review from me.
comment:58 Changed 4 years ago by
- Commit changed from c6ddf4318951e5619d5f0dfa1b91ddfb6e8a9f3e to 5924a39cd0c5a4bb8d44d47f48deb2add1ac330b
comment:59 in reply to: ↑ 57 Changed 4 years ago by
Replying to ylchapuy:

> > I think all `__init__`s and class docs now have the appropriate doctests.
> Not all, e.g. `raise ValueError("All elements of number_errors have to be" ...` and `raise ValueError("The provided number of errors should be at"...` are missing, I think.

You're right, sorry. I went over the entire file more carefully now.

> That's one of the reasons my Algorithm classes were more lightweight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.
> I think it's a better design, but I won't fight for this.

OK, let's not debate this further. If you feel strongly about it when you implement Stern-Dumer, you can implement the change there.

> > > As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor 2).
> > [...]
> > Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should at least use factor 5 in the field. This is also considering that unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.
> If it's not clear how to make it right, maybe it's better left for another ticket?

Possibly. I generally don't see how tests of calibration and threshold selection algorithms fit into Sage's doctesting ecology.
I slightly improved my solution from before (run multiple calibrations). I feel pretty confident that the test there now is better than nothing and should not really cause false negatives. If you don't feel comfortable with it, we can also remove it again.

> For an improvement in the calibration, I think I can improve things both in precision and clarity!

I see your idea. I originally wanted to avoid this for two reasons: 1) it pollutes the code of the real algorithm; and 2) it adds unnecessary logic to a busy inner loop. Probably 2) is totally negligible, though, and 1) is true, but the alternative is code duplication.
Let's wait with this for another ticket. I'm waiting to see how Stern-Dumer performs in practice before spending lots of time optimising Lee-Brickell (I have other improvements in mind as well).
Keep in mind for your solution, though, that you also need some mechanism to avoid large values of `p` where a single iteration would take very long.

> > If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword.
> I would add we don't even guarantee to return a codeword!

What do you mean? The algorithm might not return, but if it does, it surely returns a codeword. That it might not return should be covered by the red warning box below.

> I still have to check the rendering of the documentation, but this gets close to a positive review from me.

Great.
New commits:
1557ce9 | *args should pass to decoder in `AbstractLinearCode.decode_to_code/message`
395ce38 | Tests for all raised exceptions
5924a39 | Improved calibration doc test
comment:60 Changed 4 years ago by
Two low-hanging fruits; the bot complains about:

- a doctest failure in `src/doc/en/constructions/linear_codes.rst`
- a missing doctest: "Missing doctests in coding/information_set_decoder.py: 23 / 24 = 95%"; it's the `__hash__`.
comment:61 Changed 4 years ago by
- Status changed from needs_review to needs_work
comment:62 Changed 4 years ago by
- Commit changed from 5924a39cd0c5a4bb8d44d47f48deb2add1ac330b to 50018f67b8c6c214f8156d21c68a504b493e772c
Branch pushed to git repo; I updated commit sha1. New commits:
50018f6 | Fix doctest failure in doc. Add docstring and test to __hash__
comment:63 Changed 4 years ago by
- Status changed from needs_work to needs_review
Indeed. These omissions are fixed now.
comment:64 Changed 4 years ago by
- Milestone changed from sage-7.3 to sage-8.0
- Status changed from needs_review to positive_review
Ok, I finally took the time to build the doc and check it out. It builds without a fuss and LGTM.
Your long doctest with `codes.QuadraticResidueCode(59, GF(3))` takes around 150 ms on my machine; I don't know if I would call that long, but it's not a problem.
The bot is also green, so let's move forward: positive review :)
comment:65 Changed 4 years ago by
Great! Thanks for the cooperation - I think this has become a really nice addition to Sage. I'm looking forward to seeing your implementation in #23244.
comment:66 follow-up: ↓ 67 Changed 4 years ago by
- Status changed from positive_review to needs_work
    sage -t --long src/sage/coding/information_set_decoder.py
    **********************************************************************
    File "src/sage/coding/information_set_decoder.py", line 553, in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
    Failed example:
        A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
    Expected:
        True
    Got:
        False
    **********************************************************************
    1 item had failures:
       1 of 15 in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
        [170 tests, 1 failure, 5.71 s]
comment:67 in reply to: ↑ 66 Changed 4 years ago by
Replying to vbraun:

> sage -t --long src/sage/coding/information_set_decoder.py
> Failed example:
>     A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
> Expected:
>     True
> Got:
>     False

I couldn't get that on my setup while repeating the test 1000 times.
comment:68 Changed 4 years ago by
Hmm, we were also in doubt about that test, as you can see from the discussion above. I had believed that it was fairly stable now, but evidently timing behaviour on the buildbots is not very dependable.
So perhaps we should simply remove the test - what do you think? Any alternative suggestions for how we could test the calibration code?
comment:69 follow-up: ↓ 71 Changed 4 years ago by
- Don't test that the timings have particular values
- Do test that timings can be collected
- Do test that your code makes the right decisions when you specify timings
comment:70 Changed 4 years ago by
- Commit changed from 50018f67b8c6c214f8156d21c68a504b493e772c to eb2efb3d03f135e4f163bf3b2554c2a8d47054ff
Branch pushed to git repo; I updated commit sha1. New commits:
eb2efb3 | Replace brittle calibration test with a test for proper selection
comment:71 in reply to: ↑ 69 Changed 4 years ago by
Replying to vbraun:

> - Don't test that the timings have particular values
> - Do test that timings can be collected
> - Do test that your code makes the right decisions when you specify timings

OK, I've pushed a patch removing the offending test, and instead making a small refactoring such that we can test the last item on your list.
It seems a pity, however, that there is no good way to test that the calibrated timings have anything to do with reality.
comment:72 Changed 4 years ago by
- Status changed from needs_work to needs_review
comment:73 Changed 4 years ago by
- Status changed from needs_review to positive_review
OK, let's do it that way (and the failure in the docbuild is unrelated)
LGTM
comment:74 Changed 4 years ago by
- Branch changed from u/jsrn/information_set_decoding to eb2efb3d03f135e4f163bf3b2554c2a8d47054ff
- Resolution set to fixed
- Status changed from positive_review to closed
I pushed the first version of the code for this ticket.
New commits:
New decoder, InformationSetDecoder
Set this new decoder in other code families and in the catalog
Fixed broken doctests