Opened 2 months ago
Closed 2 months ago
#33961 closed enhancement (fixed)
compute square roots modulo powers of two in polynomial time
Reported by: | lorenz | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-9.7 |
Component: | algebra | Keywords: | |
Cc: | Merged in: | ||
Authors: | Lorenz Panny | Reviewers: | Travis Scrimshaw |
Report Upstream: | N/A | Work issues: | |
Branch: | aa06a21 | Commit: | aa06a211bd923aeb7ebc51b7de088a1e224646b4
Dependencies: | Stopgaps: |
Description
Currently, square_root_mod_prime_power() in integer_mod.pyx contains the following:
    if p == 2:
        if e == 1:
            return a
        # TODO: implement something that isn't totally idiotic.
        for x in a.parent():
            if x**2 == a:
                return x
In this patch, we do so: the exhaustive search over the whole ring is replaced by a polynomial-time algorithm.
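For context, one standard polynomial-time approach is bit-by-bit 2-adic lifting of a root of an odd square. The following is a minimal Python sketch of that idea; it is not necessarily the algorithm used on the branch, and the function name sqrt_mod_2e is made up for illustration:

    def sqrt_mod_2e(a, e):
        """Sketch: return x with x^2 == a (mod 2^e) for odd a, or None."""
        a %= 2**e
        assert a % 2 == 1, "this sketch handles odd a only"
        if e == 1:
            return 1
        if e == 2:
            return 1 if a == 1 else None
        if a % 8 != 1:
            return None            # an odd square is 1 modulo 8
        x = 1                      # 1^2 == a (mod 8)
        for k in range(3, e):      # lift a root mod 2^k to a root mod 2^(k+1)
            if (x*x - a) % 2**(k+1):
                # (x + 2^(k-1))^2 == x^2 + 2^k (mod 2^(k+1)) for odd x and k >= 3,
                # so adding 2^(k-1) fixes the offending bit
                x += 2**(k-1)
        return x

Each step costs a constant number of multiplications on e-bit integers, so the whole lift is polynomial in e, whereas the loop above tries up to 2^e candidates.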
Change History (9)
comment:1 Changed 2 months ago by
- Commit changed from edf55782390bcb960a054f623788aec766fde19f to 1d1551e9bdfc1fe816e3915dad1abec21864d116
comment:2 Changed 2 months ago by
- Status changed from new to needs_review
comment:3 Changed 2 months ago by
- Commit changed from 1d1551e9bdfc1fe816e3915dad1abec21864d116 to 0b6382da3b8714b10c48f1ce55ba2ff0f55e7623
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
0b6382d | more efficient algorithm for square roots modulo 2^n
comment:4 Changed 2 months ago by
- Summary changed from compute square roots in modulo powers of two in polynomial time to compute square roots modulo powers of two in polynomial time
comment:5 Changed 2 months ago by
How does this compare against the old algorithm for small values? I am wondering if we want to impose a cutoff and do the dumb way when n is small. (Cf. bubble sort versus more efficient sorting algorithms for small lists.)
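For concreteness, the kind of cutoff being suggested might look like the following sketch; the threshold value is purely illustrative, and the lifting routine is the sqrt_mod_2e sketch from the description above (which only handles odd a):

    def sqrt_mod_2e_with_cutoff(a, e, cutoff=13):   # cutoff value is hypothetical
        if e <= cutoff:
            # the "dumb way": try every residue, fine for small moduli
            return next((x for x in range(2**e) if (x*x - a) % 2**e == 0), None)
        return sqrt_mod_2e(a, e)                    # polynomial-time lifting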
comment:6 Changed 2 months ago by
- Commit changed from 0b6382da3b8714b10c48f1ce55ba2ff0f55e7623 to aa06a211bd923aeb7ebc51b7de088a1e224646b4
Branch pushed to git repo; I updated commit sha1. New commits:
aa06a21 | faster 2-adic lifting for square roots modulo 2^n
comment:7 Changed 2 months ago by
I revisited the choice of algorithm and it's both simpler and even faster now. :-)
Example:
sage: for e in range(1,19):
....:     set_random_seed(0)
....:     tests = [Zmod(2^e).random_element()^2 for _ in range(1000)]
....:     print(f'{e:<2}', end=' | ', flush=True)
....:     %timeit for y in tests: y.sqrt()
Old:
1  | 95 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2  | 100 µs ± 1.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
3  | 103 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
4  | 113 µs ± 3.67 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
5  | 122 µs ± 369 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
6  | 145 µs ± 948 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7  | 332 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8  | 426 µs ± 5.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9  | 669 µs ± 7.18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10 | 1.03 ms ± 3.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
11 | 1.78 ms ± 7.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
12 | 3.26 ms ± 24.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13 | 6.13 ms ± 27.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14 | 486 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
15 | 1.01 s ± 3.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
16 | 1.9 s ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
17 | 4.11 s ± 79.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
18 | 7.66 s ± 51.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
New:
1  | 89.5 µs ± 397 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2  | 90 µs ± 177 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
3  | 95.2 µs ± 505 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
4  | 103 µs ± 193 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
5  | 120 µs ± 372 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
6  | 141 µs ± 526 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7  | 338 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8  | 428 µs ± 3.18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9  | 696 µs ± 9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10 | 1.03 ms ± 2.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
11 | 1.75 ms ± 2.83 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
12 | 3.2 ms ± 18.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13 | 5.98 ms ± 20.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14 | 17.9 ms ± 93.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
15 | 18.4 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16 | 18.1 ms ± 77.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
17 | 18.4 ms ± 62.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
18 | 19 ms ± 42.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The reason for the jump at n ≥ 14 is that n ≤ 13 is implemented by the IntegerMod_int specialization, which has its own version of .sqrt() using brute force. Thus, the new code is only used for n ≥ 14 anyway, where it clearly outperforms the old (visibly exponential-time) version.
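To exercise the new code path through the public interface, any modulus well above the machine-word specializations will do; something along these lines should work:

sage: y = Zmod(2^100).random_element()^2   # a guaranteed square modulo 2^100
sage: x = y.sqrt()
sage: x^2 == y
True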
comment:8 Changed 2 months ago by
- Reviewers set to Travis Scrimshaw
- Status changed from needs_review to positive_review
Thank you. LGTM.
comment:9 Changed 2 months ago by
- Branch changed from public/square_roots_modulo_powers_of_2 to aa06a211bd923aeb7ebc51b7de088a1e224646b4
- Resolution set to fixed
- Status changed from positive_review to closed