Opened 3 years ago
Last modified 3 years ago
#21339 new enhancement
TestSet Decoder for LinearCodes
Reported by:  imarquez  Owned by:  

Priority:  major  Milestone:  sage7.4 
Component:  coding theory  Keywords:  sd75 
Cc:  jsrn, dlucas, danielaugot  Merged in:  
Authors:  Irene MárquezCorbella, Miguel Marco  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  u/imarquez/groebnerdecoder (Commits)  Commit:  76b83d7663419b1c0fdb4aec79e194b0767f3bea 
Dependencies:  Stopgaps: 
Description
A new version of the ticket #14973 adapted to the new coding Theory framework
Change History (22)
comment:1 Changed 3 years ago by
 Branch set to u/imarquez/groebnerdecoder
comment:2 Changed 3 years ago by
 Commit set to 0fa589622f71eb2a82f15a5e7eae7f8df6639cd1
comment:3 Changed 3 years ago by
 Branch changed from u/imarquez/groebnerdecoder to u/danielaugot/groebnerdecoder
comment:4 Changed 3 years ago by
 Commit changed from 0fa589622f71eb2a82f15a5e7eae7f8df6639cd1 to de2703370e383a1a743e8b33a9163e036b4fa5c6
Hi Irene,
there was a very minor conflict (traling whitespace) in merging with develop.
Daniel
New commits:
de27033  merge solved, by removing a trailing whitespace

comment:5 Changed 3 years ago by
 Branch changed from u/danielaugot/groebnerdecoder to u/imarquez/groebnerdecoder
comment:6 Changed 3 years ago by
 Commit changed from de2703370e383a1a743e8b33a9163e036b4fa5c6 to 8f4bb7a41f05a3c8f82ff0805db93f5acf3ee11c
Branch pushed to git repo; I updated commit sha1. New commits:
8f4bb7a  Eliminate the rest of the conflict marks

comment:7 Changed 3 years ago by
I will continue working in the documentation and extra functions in the following days.
comment:8 Changed 3 years ago by
 Branch changed from u/imarquez/groebnerdecoder to u/mmarco/groebnerdecoder
comment:9 Changed 3 years ago by
 Branch changed from u/mmarco/groebnerdecoder to u/imarquez/groebnerdecoder
comment:10 Changed 3 years ago by
 Commit changed from 8f4bb7a41f05a3c8f82ff0805db93f5acf3ee11c to 76b83d7663419b1c0fdb4aec79e194b0767f3bea
Branch pushed to git repo; I updated commit sha1. New commits:
76b83d7  Documentation correct

comment:11 Changed 3 years ago by
New functions have been added: coset_leaders, coset_weight_distribution, covering_radius(without Magma), newton_radius and unique_decoding_radius().
Documentation now is correct
comment:12 Changed 3 years ago by
Sorry before is covering_radius (without the optional package of GUAVA)
comment:13 Changed 3 years ago by
Hello Irene,
I actually implemented a version of covering_radius
which does not use GUAVA in
#19913. It automatically detects if GUAVA is installed, and if it's not, it uses a
very slow Python algorithm instead of using GUAVA.
We should benchmark your implementation and mine, and keep the fastest one. BTW, I did some benchmarking GUAVA vs. naive Python implementation, and GUAVA was always faster (which is why I automatically use GUAVA calls if the user has GUAVA installed).
David
comment:14 followups: ↓ 15 ↓ 17 Changed 3 years ago by
In the _insert_next_fq method, you treat these two different cases:
if List: for i in s: new = v.__copy__() for q in Fqstar: new[i]=q if new not in List: List.append(new.__copy__()) else: for i in s: new = v.__copy__() for q in Fqstar: new[i]=q List.append(new.__copy__())
is it really what you want to do? wouldn't it make sense to just use the first case everywhere?
comment:15 in reply to: ↑ 14 Changed 3 years ago by
That whole helpermethod can be simplified into an almostoneliner: all errorvectors of weight 1 can be created as:
I = identity_matrix(F, n) single_errs = ( a * I.row(i) for a in F for i in range(n) )
Now you want to extend List
with the elements [ v + e for e in single_errs ]
.
But you want to avoid duplicating the elements already in List
. Therefore you should use a set
for List
. This causes a small snag since set
s only work for immutable vectors.
Assuming single_errs
was computed outside at the beginning of the method, then the update of List
in the loop of coset_leaders
could look like
shifts = [ v + e for e in single_errs ] for w in shifts: w.set_immutable() List.union(shifts)
List
breaks the convention of starting variables with minuscule. What about the name worklist
?
Best, Johan
comment:16 Changed 3 years ago by
The keyword algorithm
for the "one"/"all" of coset_leaders
is not good: algorithm
is reserved for different methods with which to compute the result. What about representative = True
, where representative = False
means to output all of them.
comment:17 in reply to: ↑ 14 ; followup: ↓ 18 Changed 3 years ago by
Replying to mmarco:
In the _insert_next_fq method, you treat these two different cases: is it really what you want to do? wouldn't it make sense to just use the first case everywhere?
I think the if
condition should be kept. It makes a difference and avoids a linear search inside a for
loop in the case List
is empty.
This causes a small snag since sets only work for immutable vectors.
I think we should not return immutable vectors from coset leaders. The output might be used for further computations by the user and then it will create strange issues, which will be hard to debug. The less surprises we have for the user, the better it is.
List
breaks the convention of starting variables with minuscule. What about the nameworklist
?
This can be done I suppose, just for consistency.
The keyword algorithm for the "one"/"all" of coset_leaders is not good
I agree. This is not really a different algorithm.
comment:18 in reply to: ↑ 17 Changed 3 years ago by
If you care about the coset_leaders
method I see many improvements possibles:
 use
deque
instead oflist
for the worklist;  append generators instead of real lists;
 take care to enumerate your candidates uniquely;
 use a hashed container for the syndromes already seen;
 early abort when you have seen all the possible syndromes;
 normalize syndromes (e.g. only keep those starting with
1
).
Just as a quick model you might want to look at this:
import collections def coset_leaders_one(C): H = C.parity_check_matrix().transpose() K = C.base_ring() q = K.cardinality() def my_generator(v): t = list(v) for i, vi in enumerate(v): if vi: break for a in K: if a: t[i] = a yield vector(t) t[i] = 0 def normalize(s): for si in s: if si: return tuple(s / si) return tuple(s) z = vector(K, C.length()) sz = normalize(C.syndrome(z)) S = {sz: z} generators = collections.deque([my_generator(z)]) target_size = (q ** H.ncols()  1) // (q  1) + 1 while generators: g = generators.popleft() for v in g: syndrome = normalize(v*H) if syndrome not in S: S[syndrome] = v if len(S) == target_size: r = [S.pop(sz)] for s in S.values(): r.extend(a * s for a in K if a) return r if not v[0]: generators.append(my_generator(v))
You can adapt this for the case where you want them all.
comment:19 Changed 3 years ago by
You might also look at this implementation: u/ylchapuy/coset_leaders
comment:20 followup: ↓ 21 Changed 3 years ago by
@ylchapuy Thank you for this implementation. If you have this implementation in a Ticket somewhere please try to get it in first, and make this ticket a dependency of the other one. Your implementation is several hundred times faster than this one.
Meanwhile, I have been checking whether the coset leaders returned in your function "match" with the coset leaders returned by the implementation in this ticket. I haven't looked into it in detail, but there is a discrepancy. The discrepancy is not present for all codes though.
sage: from coset_1 import coset_leaders_one sage: from coset_2 import coset_leaders sage: C = codes.ReedSolomonCode(7,5,GF(8,'a')) /mnt/usb/Installations/sage/src/bin/sageipython:1: DeprecationWarning: codes.Re edSolomonCode is now deprecated. Please use codes.GeneralizedReedSolomonCode ins tead. See http://trac.sagemath.org/18928 for details. #!/usr/bin/env python sage: L1 = coset_leaders_one(C) sage: L2 = coset_leaders(C, "one") sage: L2_2 = flatten(L2) sage: L1bool = [False]*len(L1) sage: L2bool = [False]*len(L2) sage: Fqstar = C.base_ring().list()[1:] sage: for i,w in enumerate(L1): ....: wlist = [_*w for _ in Fqstar] ....: for _w in wlist: ....: if _w in L2_2: ....: L1bool[i] = True ....: break ....: for i,v in enumerate(L2_2): ....: vlist = [_*v for _ in Fqstar] ....: for _v in vlist: ....: if _v in L1: ....: L2bool[i] = True ....: break ....: sage: all(L1bool) True sage: all(L2bool) False sage: %time L1 = coset_leaders_one(C) CPU times: user 6.67 ms, sys: 0 ns, total: 6.67 ms Wall time: 6.02 ms sage: %time L2 = coset_leaders(C) CPU times: user 8.43 s, sys: 3.33 ms, total: 8.44 s Wall time: 8.39 s sage: len(L1) 64 sage: len(L2) 64 sage: len(L2_2) 64
comment:21 in reply to: ↑ 20 Changed 3 years ago by
Replying to ppurka:
@ylchapuy Thank you for this implementation. If you have this implementation in a Ticket somewhere please try to get it in first, and make this ticket a dependency of the other one. Your implementation is several hundred times faster than this one.
The ticket where this should be done is #19913 .
Sadly I won't have time to do more on this in the foreseeable future, but would be pleased if someone else is motivated enough to integrate this on his own.
comment:22 Changed 3 years ago by
And as a side remark, the code in the branch from comment:19 is even much better.
6 time faster on your example code:
sage: C [7, 5, 3] Generalized ReedSolomon Code over Finite Field in a of size 2^3 sage: %timeit coset_leaders_one(C) 100 loops, best of 3: 3.03 ms per loop sage: %timeit _coset_leaders(C) 1000 loops, best of 3: 477 µs per loop
and on a some bigger examples
sage: C = codes.random_linear_code(GF(7), 32, 27) sage: %timeit coset_leaders_one(C) 1 loop, best of 3: 1.35 s per loop sage: %timeit _coset_leaders(C) 10 loops, best of 3: 78.2 ms per loop sage: C = codes.random_linear_code(GF(3), 32, 22) sage: %timeit coset_leaders_one(C) 1 loop, best of 3: 23.8 s per loop sage: %timeit _coset_leaders(C) 1 loop, best of 3: 1.18 s per loop
I didn't try with the implementation from here...
This is a ticket still under work (documentation still needs some work)
New commits:
First Implementation of GroebnerBasis
First Implementation of TestSetDecoder