Opened 2 months ago
Closed 2 months ago
#33961 closed enhancement (fixed)
compute square roots modulo powers of two in polynomial time
Reported by:  lorenz  Owned by:  

Priority:  major  Milestone:  sage-9.7 
Component:  algebra  Keywords:  
Cc:  Merged in:  
Authors:  Lorenz Panny  Reviewers:  Travis Scrimshaw 
Report Upstream:  N/A  Work issues:  
Branch:  aa06a21 (Commits, GitHub, GitLab)  Commit:  aa06a211bd923aeb7ebc51b7de088a1e224646b4 
Dependencies:  Stopgaps: 
Description
Currently, square_root_mod_prime_power() in integer_mod.pyx contains the following:

    if p == 2:
        if e == 1:
            return a
        # TODO: implement something that isn't totally idiotic.
        for x in a.parent():
            if x**2 == a:
                return x
In this patch, we do so: the brute-force search over all residues is replaced by an algorithm that runs in time polynomial in e.
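The general approach can be sketched in plain Python. The following is a standalone illustration, not the ticket's actual Cython code: it computes a square root modulo 2**e by Hensel-style bit-by-bit lifting (an odd square is ≡ 1 (mod 8), and a root mod 2**k lifts to mod 2**(k+1) by optionally adding 2**(k-1)). The helper name sqrt_mod_2e is ours.

```python
def sqrt_mod_2e(a, e):
    """Return x with x*x == a (mod 2**e), or None if a is not a square.

    Runs in time polynomial in e, unlike a brute-force scan over all
    2**e residues.  Illustrative sketch only, not Sage's implementation.
    """
    a %= 1 << e
    if e == 1:
        return a                     # mod 2, every residue is its own square
    if a % 2 == 0:
        if a == 0:
            return 0
        if a % 4 != 0:
            return None              # 2 * (odd) is never a square mod 2**e, e >= 2
        # strip a factor of 4:  a = 4*b  =>  sqrt(a) = 2*sqrt(b) mod 2**e
        r = sqrt_mod_2e(a >> 2, e - 2)
        return None if r is None else (2 * r) % (1 << e)
    if e == 2:
        return 1 if a % 4 == 1 else None
    if a % 8 != 1:
        return None                  # odd squares are congruent to 1 mod 8
    x = 1                            # x*x == a (mod 8) holds since a % 8 == 1
    for k in range(3, e):
        # invariant: x*x == a (mod 2**k); fix the next bit if necessary,
        # using (x + 2**(k-1))**2 == x**2 + 2**k (mod 2**(k+1)) for odd x
        if (x * x - a) % (1 << (k + 1)):
            x += 1 << (k - 1)
    return x % (1 << e)

# e.g. sqrt_mod_2e(17, 5) == 9, since 9**2 == 81 == 17 (mod 32)
```

Each loop iteration does O(1) multiplications on e-bit integers, so the total cost is polynomial in e rather than proportional to 2**e.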
Change History (9)
comment:1 Changed 2 months ago by
 Commit changed from edf55782390bcb960a054f623788aec766fde19f to 1d1551e9bdfc1fe816e3915dad1abec21864d116
comment:2 Changed 2 months ago by
 Status changed from new to needs_review
comment:3 Changed 2 months ago by
 Commit changed from 1d1551e9bdfc1fe816e3915dad1abec21864d116 to 0b6382da3b8714b10c48f1ce55ba2ff0f55e7623
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
0b6382d  more efficient algorithm for square roots modulo 2^n

comment:4 Changed 2 months ago by
 Summary changed from compute square roots in modulo powers of two in polynomial time to compute square roots modulo powers of two in polynomial time
comment:5 Changed 2 months ago by
How does this compare against the old algorithm for small values? I am wondering if we want to impose a cutoff and do the dumb way when n is small. (Cf. bubble sort versus more efficient sorting algorithms for small lists.)
comment:6 Changed 2 months ago by
 Commit changed from 0b6382da3b8714b10c48f1ce55ba2ff0f55e7623 to aa06a211bd923aeb7ebc51b7de088a1e224646b4
Branch pushed to git repo; I updated commit sha1. New commits:
aa06a21  faster 2adic lifting for square roots modulo 2^n

comment:7 Changed 2 months ago by
I revisited the choice of algorithm, and it is both simpler and even faster now. :)
Example:
sage: for e in range(1,19):
....:     set_random_seed(0)
....:     tests = [Zmod(2^e).random_element()^2 for _ in range(1000)]
....:     print(f'{e:<2}', end='  ', flush=True)
....:     %timeit for y in tests: y.sqrt()
Old:
1   95 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2   100 µs ± 1.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
3   103 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
4   113 µs ± 3.67 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
5   122 µs ± 369 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
6   145 µs ± 948 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7   332 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8   426 µs ± 5.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9   669 µs ± 7.18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10  1.03 ms ± 3.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
11  1.78 ms ± 7.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
12  3.26 ms ± 24.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13  6.13 ms ± 27.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14  486 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
15  1.01 s ± 3.67 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
16  1.9 s ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
17  4.11 s ± 79.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
18  7.66 s ± 51.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
New:
1   89.5 µs ± 397 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2   90 µs ± 177 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
3   95.2 µs ± 505 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
4   103 µs ± 193 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
5   120 µs ± 372 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
6   141 µs ± 526 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7   338 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8   428 µs ± 3.18 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9   696 µs ± 9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10  1.03 ms ± 2.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
11  1.75 ms ± 2.83 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
12  3.2 ms ± 18.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13  5.98 ms ± 20.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14  17.9 ms ± 93.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
15  18.4 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16  18.1 ms ± 77.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
17  18.4 ms ± 62.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
18  19 ms ± 42.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The reason for the jump at n ≥ 14 is that n ≤ 13 is handled by the IntegerMod_int specialization, which has its own brute-force version of .sqrt(). Thus, the new code is only used for n ≥ 14 anyway, where it clearly outperforms the old (visibly exponential-time) version.
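For contrast, the brute-force fallback that the "Old" timings reflect amounts to an exhaustive scan over all residues. Here is a standalone Python rendition (not Sage's actual Cython code; the function name is ours):

```python
def sqrt_bruteforce(a, e):
    """Exhaustive search for a square root modulo 2**e.

    Scans all 2**e residues, so the cost doubles with every extra bit
    of the modulus -- the exponential growth visible in the "Old"
    timings above.  Illustrative sketch only.
    """
    m = 1 << e
    a %= m
    for x in range(m):
        if (x * x) % m == a:
            return x
    return None        # a is not a square modulo 2**e
```

Doubling the modulus doubles the search space, which matches the roughly 2x step between consecutive rows of the old timings once the brute-force path is reached.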
comment:8 Changed 2 months ago by
 Reviewers set to Travis Scrimshaw
 Status changed from needs_review to positive_review
Thank you. LGTM.
comment:9 Changed 2 months ago by
 Branch changed from public/square_roots_modulo_powers_of_2 to aa06a211bd923aeb7ebc51b7de088a1e224646b4
 Resolution set to fixed
 Status changed from positive_review to closed