Opened 6 years ago
Closed 6 years ago
#15443 closed defect (fixed)
Random time outs in ecm.py
Reported by:  vbraun  Owned by:  

Priority:  major  Milestone:  sage6.2 
Component:  number theory  Keywords:  
Cc:  was, robertwb, roed, jpflori  Merged in:  
Authors:  Volker Braun  Reviewers:  François Bissey 
Report Upstream:  N/A  Work issues:  
Branch:  383520e (Commits)  Commit:  383520e4ff6269b391b5e9fd474349b06d9de156 
Dependencies:  Stopgaps: 
Description
Infrequently, I get this doctest failure:
sage t long src/sage/interfaces/ecm.py Timed out ********************************************************************** Tests run before process (pid=8764) timed out: sage: f = ECM() ## line 169 ## sage: n = 508021860739623467191080372196682785441177798407961 ## line 170 ## sage: f.one_curve(n, B1=10000, sigma=11) ## line 171 ## [1, 508021860739623467191080372196682785441177798407961] sage: f.one_curve(n, B1=10000, sigma=1022170541) ## line 173 ## [79792266297612017, 6366805760909027985741435139224233] sage: n = 432132887883903108009802143314445113500016816977037257 ## line 175 ## sage: f.one_curve(n, B1=500000, algorithm="P1") ## line 176 ## [67872792749091946529, 6366805760909027985741435139224233] sage: n = 2088352670731726262548647919416588631875815083 ## line 178 ## sage: f.one_curve(n, B1=2000, algorithm="P+1", x0=5) ## line 179 ## [328006342451, 6366805760909027985741435139224233] sage: sig_on_count() ## line 181 ## 0 sage: f = ECM() ## line 241 ## sage: n = 508021860739623467191080372196682785441177798407961 ## line 242 ## sage: f.find_factor(n) ## line 243 ## [79792266297612017, 6366805760909027985741435139224233] sage: f=2^2^14+1 ## line 247 ## sage: ecm.find_factor(f) ## line 248 ## sage: sig_on_count() ## line 252 ## 0 sage: ecm.factor(602400691612422154516282778947806249229526581) ## line 333 ## [45949729863572179, 13109994191499930367061460439] sage: ecm.factor((2^197 + 1)/3) # takes a long time ## line 336 ##
Attachments (5)
Change History (48)
comment:1 Changed 6 years ago by
comment:2 Changed 6 years ago by
In fact, the problematic doctest is factoring (2^197 + 1)/3
. Usually, this yields from ecm:
Factor found in step 2: 197002597249 Found probable prime factor of 12 digits: 197002597249 Composite cofactor 339872432034468861533158743041639097889948066859 has 48 digits
But randomly (and rather unlikely):
Factor found in step 2: 265748496095531068869578877937 Found composite factor of 30 digits: 265748496095531068869578877937 Probable prime cofactor 251951573867253012259144010843 has 30 digits
So this is the origin of the prime 251951573867253012259144010843, which is then subsequently factored (but apparently very slowly).
comment:3 Changed 6 years ago by
Fun find: run ecm find_factor()
on a prime:
sage: f = ECM() sage: f.find_factor(3)
OMG zombies everywhere!
comment:4 Changed 6 years ago by
Thanks for the analysis! Just a question: do you plan to write a patch (if not, I might).
comment:5 Changed 6 years ago by
In working on it!
comment:6 Changed 6 years ago by
 Branch set to u/vbraun/ecm_cleanup
comment:7 Changed 6 years ago by
 Commit set to 998126dbd22d30eb34b46a0f122f9d62a9f861c9
Branch pushed to git repo; I updated commit sha1. New commits:
998126d  actually fix the source of the random (but unlikely) doctest failures 
comment:8 Changed 6 years ago by
 Cc was robertwb added
 Status changed from new to needs_review
I haven't found any other issue in repeated testing, ready for review!
comment:9 Changed 6 years ago by
So I got 6.0_beta4 built and I tested sage/interfaces/ecm.py a thousand time. The last one (number 1000) failed :(
Running doctests with ID 20131211125950efe5883e. Doctesting 1 file. sage t long local/lib/python2.7/sitepackages/sage/interfaces/ecm.py ********************************************************************** File "local/lib/python2.7/sitepackages/sage/interfaces/ecm.py", line 606, in sage.interfaces.ecm.ECM.get_last_params Failed example: ecm.factor((2^197 + 1)/3) # long time Expected: [197002597249, 1348959352853811313, 251951573867253012259144010843] Got: [197002597249, 339872432034468861533158743041639097889948066859] ********************************************************************** 1 item had failures: 1 of 3 in sage.interfaces.ecm.ECM.get_last_params [43 tests, 1 failure, 6.84 s]  sage t long local/lib/python2.7/sitepackages/sage/interfaces/ecm.py # 1 doctest failed  Total time for all tests: 6.9 seconds cpu time: 0.1 seconds cumulative wall time: 6.8 seconds
How uncool is that.
comment:10 Changed 6 years ago by
You don't happen to have a strace of the failed attempt?
comment:11 Changed 6 years ago by
No haven't thought of that. I am doing another run right now to see if I can reproduce it. But I will do that if I can reproduce it. The other thing that struck me is that the time to run the test is highly variable from 3 to 10 seconds and I have no idea why that would be.
comment:12 Changed 6 years ago by
Nothing on the second run. I have bumped the number to 2000 and I am recording straces in case I see something again.
comment:13 Changed 6 years ago by
Ok I am not sure what kind of fluck happened in my first run. It is still the only failure I have seen in 4000 runs of that test.
comment:14 Changed 6 years ago by
The timing difference is because ecm is random and without a fixed seed. Sometimes it just takes longer. And also depends on which of the three factors is found first.
I've run 20k iterations of ecm.factor((2^197 + 1)/3)
without any failure so far...
comment:15 Changed 6 years ago by
Ok, the 22259th iteration did fail ;) Now trying again with strace...
comment:16 Changed 6 years ago by
I started a 200000 sweep which stopped at 5040 (probably run out of inodes in the folder storing the straces) I have attached the straces for the 3 failures I got.
comment:17 Changed 6 years ago by
I need the strace of the ecm process, not of the main Sage python process... the latter doesn't tell us much.
comment:18 Changed 6 years ago by
OK I thought it was thin. How do I attach to the started ecm process? Any suggestion for doing it in an automated way? strace f (ff)?
comment:19 Changed 6 years ago by
Yes, or see the attached script.
comment:20 Changed 6 years ago by
I think I understand how the script works, one more turn, then.
comment:21 Changed 6 years ago by
OK I got a failure. Any clue to find which strace is the good one?
comment:22 Changed 6 years ago by
Grep for bin/ecm
, for starters...
comment:23 Changed 6 years ago by
Better idea. The failure is
sage t long local/lib/python2.7/sitepackages/sage/interfaces/ecm.py ********************************************************************** File "local/lib/python2.7/sitepackages/sage/interfaces/ecm.py", line 572, in sage.interfaces.ecm.ECM.factor Failed example: ecm.factor(602400691612422154516282778947806249229526581) Expected: [45949729863572179, 13109994191499930367061460439] Got: [602400691612422154516282778947806249229526581]
Grepping for 602400691612422154516282778947806249229526581 does the trick nicely. Two hits one from sage and one from ecm.
comment:24 Changed 6 years ago by
Reconstructed ECM output:
GMPECM 6.4.4 [configured with MPIR 2.6.0, enableasmredc] [ECM] Input number is 602400691612422154516282778947806249229526581 (45 digits) Using B1=2000, B2=147396, polynomial x^1, sigma=4179203867 Step 1 took 2ms Step 2 took 3ms Run 2 out of 1000000000: Using B1=2399, B2=2399186156, polynomial x^1, sigma=4164584203 Step 1 took 2ms Step 2 took 4ms Run 3 out of 1000000000: Using B1=2806, B2=2806224406, polynomial x^1, sigma=3854196504 Step 1 took 3ms Step 2 took 3ms Run 4 out of 1000000000: Using B1=3221, B2=3221294786, polynomial x^1, sigma=3420982637 Step 1 took 3ms Step 2 took 5ms Run 5 out of 1000000000: Using B1=3644, B2=3644294786, polynomial x^1, sigma=1430191020 [...] Using B1=22052, B2=220525026572, polynomial x^1, sigma=1562069045 Step 1 took 20ms Step 2 took 26ms Run 38 out of 1000000000: Using B1=22779, B2=227795026572, polynomial x^1, sigma=1306733316 Step 1 took 21ms Step 2 took 25ms Run 39 out of 1000000000: Using B1=23516, B2=235165031192, polynomial x^1, sigma=2409042976 Step 1 took 22ms Step 2 took 25ms Run 40 out of 1000000000: Using B1=24263, B2=242635031192, polynomial x^1, sigma=518178309 Step 1 took 22ms Step 2 took 26ms ********** Factor found in step 2: 602400691612422154516282778947806249229526581 Found input number N
So it seems that the "Found input number N" output doesn't imply that it is probably prime. Some documentation would have been nice, too. Though that leaves the question: how should we decide to give up? If the number is actually prime then it always ends in "Found input number N", so we can't just repeat until we find a factor. The readme says
5  if no factor is found, either increase D by 5 digits and go to 0, or use another factorization method (MPQS, NFS)
but doesn't give any probability estimate in that case. Also, one of the factors in question is only 17 digits so the B1values tried should have been just fine.
comment:25 Changed 6 years ago by
Ah, OK. I see what happens. The case you are running into really has very low probability and it is surprising you're triggering it at all. You're basically finding BOTH factors together. The idea of ECM is to choose a group E over Z together with a group element G, such that for the integer N=p*q (it works for products of more primes too) such that the order of G mod p is Bsmooth (allowing some nonsmoothness with "second stage" large prime variation). However, if G mod q is also Bsmooth then you're detecting both factors together.
In the analysis of ECM, it turns out the optimal parameter choice is such that the probability of hitting an (E,G) pair with Bsmooth order of G mod p is pretty low. The probability of G mod q also being Bsmooth should be pretty much independent. If p,q are roughly of the same size and you need on average N trials to find G mod p of Bsmooth order, then you expect that in 1/N of those cases, G mod q also has Bsmooth order. If q is much larger than p then this probability should be much lower. I'm surprised that for N=45949729863572179*13109994191499930367061460439 you guys are encountering this at all. It suggests that the B1 value used is a little on the large side if this happens with discernable frequency (what may happen is that you're simply unlucky the first few times and then you have increased B1 to such a point that finding both G mod p and G mod q to be Bsmooth isn't such an unlikely event anymore. So perhaps you're increasing B1 a little too aggressively)
Anyway, as you can see, certainly for p,q of roughly the same size (not ECM's forte, but if p,q are not big, ECM can still be useful for those), finding p*q as a factor is really something that has to be counted on.
You should REALLY NOT run ECM on a number that might be prime. You should FIRST prove the number is composite for certain. In that case, you can treat finding the input as factor as finding no factor at all and simply try another time (perhaps NOT increasing B1)
comment:26 followup: ↓ 27 Changed 6 years ago by
So for a poor physicist like me, how exactly should we go about testing (probabilistically, but with very low probability of failure comparable to the probability in ECM) that a number is not composite. Is that already implemented somewhere in Sage? RabinMiller with some particular list of witnesses?
comment:27 in reply to: ↑ 26 ; followup: ↓ 28 Changed 6 years ago by
Replying to vbraun:
So for a poor physicist like me, how exactly should we go about testing (probabilistically, but with very low probability of failure comparable to the probability in ECM) that a number is not composite. Is that already implemented somewhere in Sage? RabinMiller with some particular list of witnesses?
I'd expect it to be available somewhere, yes, at the very least in PARI/GP. BailliePSW seems to be particularly bulletproof (in the physics sense), with no known pseudoprimes. Finding a counterexample could raise some money for the sage foundation.
Implementation accessible in sage:
sage: pari(N).ispseudoprime()
(so, presently, noone is aware of a particular composite value of N for which this returns true, although I suspect noone would conjecture such values don't exist)
comment:28 in reply to: ↑ 27 Changed 6 years ago by
Replying to nbruin:
(so, presently, noone is aware of a particular composite value of N for which this returns true, although I suspect noone would conjecture such values don't exist)
According to the PARI guys, the standard conjecture is that there exist infinitely many counterexamples, even though no single one is known.
comment:29 Changed 6 years ago by
 Cc roed added
comment:30 Changed 6 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:31 Changed 6 years ago by
 Cc jpflori added
comment:32 Changed 6 years ago by
 Status changed from needs_review to needs_work
Use pari's isprime(n,2) see http://en.wikipedia.org/wiki/Adleman%E2%80%93Pomerance%E2%80%93Rumely_primality_test
comment:33 Changed 6 years ago by
 Commit changed from 998126dbd22d30eb34b46a0f122f9d62a9f861c9 to dbc4fec513ab7ad0c2c139e348f61d0851911920
comment:34 Changed 6 years ago by
 Status changed from needs_work to needs_review
I belive this branch now addresses all comments.
comment:35 Changed 6 years ago by
How do you address #comment:9 ?
comment:36 followup: ↓ 37 Changed 6 years ago by
I'm now first using a strong primality test and only then run ecm forever. So unless you have a counterexample to BailliePSW you are guaranteed to get the prime decomposition.
comment:37 in reply to: ↑ 36 Changed 6 years ago by
Replying to vbraun:
I'm now first using a strong primality test and only then run ecm forever. So unless you have a counterexample to BailliePSW you are guaranteed to get the prime decomposition.
To be clearer, you mean you run the probable primality test on what ECM.factor() returns and only stop if all factors pass? So in the previous failing tests from comment:9, it will continue testing other curves when we were unlucky and ECM returned the composite input number?
Groumpf, or rather if it does not pass probable primality at first, you only stop when the number of returned factors is strictly greater than one? If you went for that, comment:23 will be solved, but not comment:9. So another question, is it still possible to call ECM without that additional probable primality test?
From my point of view, we should just accept the fact that ECM sometimes returns the full composite number (just as it will return factors which are composite), but document it and modify the doctests.
comment:38 followup: ↓ 39 Changed 6 years ago by
Maybe you want to read the factor()
method. First primality test, if that fails run ecm until it finds a factor. Repeat until you get a prime factorization.
There is already a method find_factor()
that just runs the ECM output and may or may not return a factor. But that is a) annoying to doctest and b) not very userfriendly.
comment:39 in reply to: ↑ 38 Changed 6 years ago by
Replying to vbraun:
Maybe you want to read the
factor()
method. First primality test, if that fails run ecm until it finds a factor. Repeat until you get a prime factorization.
Sure, just had no time. It looks good and I 'll try to review that tonight if I catch some time.
There is already a method
find_factor()
that just runs the ECM output and may or may not return a factor. But that is a) annoying to doctest and b) not very userfriendly.
comment:40 Changed 6 years ago by
Now that I have read the patch better (anything is better than reading a patch early in the morning after being woken up by a crying 3 year old) it looks good. I don't really have the time to give it the trashing I gave the previous one right now but I assume you did :)
comment:41 Changed 6 years ago by
 Reviewers set to François Bissey
 Status changed from needs_review to positive_review
comment:42 Changed 6 years ago by
 Commit changed from dbc4fec513ab7ad0c2c139e348f61d0851911920 to 383520e4ff6269b391b5e9fd474349b06d9de156
 Status changed from positive_review to needs_review
Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:
383520e  fix doctest error

comment:43 Changed 6 years ago by
 Branch changed from u/vbraun/ecm_cleanup to 383520e4ff6269b391b5e9fd474349b06d9de156
 Resolution set to fixed
 Status changed from needs_review to closed
ecm uses different seeds all the time, for example
produces a different output every time.
Occasionally, ecm.py does a lot more factorizations and then eventually times out. Using strace, I see that, for example, the following is run when it eventually times out;
Note that the command is rather slow (requires hundreds of trial steps) and that this prime (251951573867253012259144010843) is usually not used as an ecm input. So occasionally (but unlikely) one of the previous computations must yield this number as candidate and then factoring it takes a really long time.