Opened 7 years ago
Closed 7 years ago
#18607 closed enhancement (fixed)
Speed-up for __contains__ in linear codes
Reported by: | dlucas | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.8 |
Component: | coding theory | Keywords: | |
Cc: | jsrn | Merged in: | |
Authors: | David Lucas | Reviewers: | Johan Sebastian Rosenkilde Nielsen |
Report Upstream: | N/A | Work issues: | |
Branch: | 6319903 (Commits, GitHub, GitLab) | Commit: | 6319903d843188bfa2dfdc1083a3b9fc25ec6a8c |
Dependencies: | Stopgaps: |
Description (last modified by )
The actual implementation of
__contains__
for linear codes is quite slow. It can be improved using the syndrome computation instead of checking if the vector belongs to a specific subspace of the ambient space.
Test:
F = GF(1009) n, k = 1000, 500 C = codes.RandomLinearCode(n, k, F) subspace = [] syndrome = [] for i in range(20): c = C.random_element() start = time.clock() A = C.ambient_space() S = A.subspace(C.gens()) res = S.__contains__(c) elapsed = (time.clock() - start) assert res == True subspace.append(elapsed) start = time.clock() if not c in C.ambient_space() or len(c) != C.length(): res = False else: res = (C.syndrome(c) == 0) elapsed = (time.clock() - start) assert res == True syndrome.append(elapsed)
Results:
sage: median(subspace) 1.526604500000019 sage: median(syndrome) 0.00408399999997755
Change History (9)
comment:1 Changed 7 years ago by
- Branch set to u/dlucas/speedup_in_contains
comment:2 Changed 7 years ago by
- Commit set to 6319903d843188bfa2dfdc1083a3b9fc25ec6a8c
- Status changed from new to needs_review
comment:3 Changed 7 years ago by
An obvious alternative is to use the existing implementation but cache the space S
for later use. Let's call that solution "sug" (for suggestion), and yours "new".
Now, your tests are not very convincing: you are testing only one set of parameters over one field. And furthermore, you are testing only for the speed of contains
on codeword - not on non-codewords.
I quickly wrote a more comprehensive set of tests. Here are some timings (element=True
means that we are testing on codewords, otherwise we test on random elements of the ambient space):
Results for N=3 and C=Linear code of length 1200, dimension 300 over Finite Field of size 2 and elements=True new: [0.26615000000003874, 0.003176999999993768, 0.0032519999999749416] sug: [0.16259099999996351, 2.3999999996249244e-05, 2.2999999998774e-05] Results for N=3 and C=Linear code of length 1200, dimension 300 over Finite Field of size 2 and elements=False new: [0.26317399999999225, 0.0001400000000444379, 0.00011000000000649379] sug: [0.5222530000000347, 0.0015950000000088949, 0.0015670000000227446] Results for N=3 and C=Linear code of length 12, dimension 3 over Finite Field of size 2 and elements=True new: [0.001967999999976655, 0.0005400000000008731, 0.0005019999999831271] sug: [0.0005810000000110449, 1.3999999964653398e-05, 1.8000000011397788e-05] Results for N=3 and C=Linear code of length 12, dimension 3 over Finite Field of size 2 and elements=False new: [0.0007109999999670435, 5.1000000041767635e-05, 5.9000000021569576e-05] sug: [0.0015510000000062973, 0.0002230000000054133, 0.00023799999996754195] Results for N=5 and C=Linear code of length 1000, dimension 5 over Finite Field of size 1009 and elements=True new: [1.7585720000000151, 0.0033129999999914617, 0.0033280000000104337, 0.0032279999999786924, 0.0033119999999939864] sug: [0.007593999999983225, 6.600000000389628e-05, 4.999999998744897e-05, 5.7999999967250915e-05, 6.799999999884676e-05] Results for N=5 and C=Linear code of length 1000, dimension 5 over Finite Field of size 1009 and elements=False new: [1.5777009999999905, 0.002039000000024771, 0.0020959999999945467, 0.0020149999999716783, 0.0018599999999651118] sug: [0.06748300000003837, 0.0026629999999840948, 0.0029540000000451982, 0.0026410000000396394, 0.002763000000015836] Results for N=5 and C=Linear code of length 1000, dimension 950 over Finite Field of size 1009 and elements=True new: [0.2903909999999996, 0.11106599999999389, 0.1123590000000263, 0.11180000000001655, 0.11880000000002156] sug: [2.45943699999998, 0.0011589999999728207, 0.0011220000000093933, 0.0009630000000129257, 0.0010689999999726751] Results for N=5 and C=Linear code of length 1000, dimension 950 over Finite Field of size 1009 and elements=False new: [0.18305399999997007, 0.00016400000004068715, 0.00014800000002423985, 0.00017199999996364568, 0.00015700000000151704] sug: [4.64434399999999, 0.010827000000006137, 0.01069799999999077, 0.010796000000027561, 0.010704999999973097] Results for N=5 and C=Linear code of length 300, dimension 150 over Finite Field in a of size 2^8 and elements=True new: [0.1043430000000285, 0.025561999999979435, 0.025893999999993866, 0.025636000000019976, 0.02579400000001897] sug: [0.04122899999998708, 3.000000003794412e-05, 3.100000003541936e-05, 3.1999999976051186e-05, 3.500000002532033e-05] Results for N=5 and C=Linear code of length 300, dimension 150 over Finite Field in a of size 2^8 and elements=False new: [0.07121999999998252, 0.002735000000029686, 0.0026799999999980173, 0.0026340000000004693, 0.002746999999999389] sug: [0.11262400000003936, 0.0019409999999879801, 0.0019139999999993051, 0.0019269999999664833, 0.0019049999999651845] Results for N=5 and C=Linear code of length 300, dimension 280 over Finite Field in a of size 2^8 and elements=True new: [0.056963000000052944, 0.040061000000036984, 0.04052200000000994, 0.039447999999993044, 0.04015300000003208] sug: [0.061540999999976975, 4.0000000012696546e-05, 3.90000000152213e-05, 4.199999995080361e-05, 3.69999999634274e-05] Results for N=5 and C=Linear code of length 300, dimension 280 over Finite Field in a of size 2^8 and elements=False new: [0.01576399999999012, 0.0006789999999909924, 0.0006390000000351392, 0.0006480000000124164, 0.0007170000000087384] sug: [0.19535200000001396, 0.002287000000023909, 0.0021719999999731954, 0.0021679999999832944, 0.0024019999999609354]
As can be seen, "sug" does much better than "new" when we are testing codewords, except that the first call is sometimes exorbitantly expensive (for high-rate codes). On non-codewords "new" seems to beat "sug" more or less always, but less so on low-rate codes.
comment:4 Changed 7 years ago by
All in all, I think I vote for "new", i.e. the current ticket's solution. The use case for codes would be to often check codeword membership that fails. Also, I'm quite concerned about the exorbitant price for the first call that "sug" has.
comment:5 Changed 7 years ago by
It think it's better to write self.syndrome(v).is_zero
than do the explicit comparison with 0
.
comment:6 Changed 7 years ago by
- Description modified (diff)
- Status changed from needs_review to positive_review
So if the author has not changed his mind as a consequence of my comments, I give this the green light. All tests pass and documentation builds.
(I fixed the bug in the test code)
comment:7 Changed 7 years ago by
- Status changed from positive_review to needs_work
Reviewer name missing
comment:8 Changed 7 years ago by
- Reviewers set to Johan Sebastian Rosenkilde Nielsen
- Status changed from needs_work to positive_review
comment:9 Changed 7 years ago by
- Branch changed from u/dlucas/speedup_in_contains to 6319903d843188bfa2dfdc1083a3b9fc25ec6a8c
- Resolution set to fixed
- Status changed from positive_review to closed
New commits:
Changed code of __contains__