Opened 11 years ago
Closed 9 years ago
#11475 closed enhancement (fixed)
improve prime_pi (speedup + small fixes)
Reported by:  R. Andrew Ohana  Owned by:  William Stein 

Priority:  major  Milestone:  sage6.3 
Component:  number theory  Keywords:  primes, prime counting, prime_pi sd40.5 
Cc:  William Stein, Leif Leonhardy, Kevin Stueve, Victor S. Miller, KarlDieter Crisman  Merged in:  
Authors:  R. Andrew Ohana  Reviewers:  Yann LaigleChapuy, Leif Leonhardy, William Stein, KarlDieter Crisman 
Report Upstream:  N/A  Work issues:  
Branch:  e3e33e6 (Commits, GitHub, GitLab)  Commit:  e3e33e62cc19bf93b5236e7d1177482bd52b8a9e 
Dependencies:  #15138  Stopgaps: 
Description (last modified by )
I have rewritten the prime_pi
method from scratch, it is much cleaner and less hacky this time (as I have learned more about coding), however it is still based on the same algorithm as before (no LMO yet, although I intend to take a stab at it later this year). This was developed in parallel with a method to count primes in residue classes (ticket forthcoming).
The overhead for small input >= 2
is rather large in the current prime_pi
, this has been dramatically improved:
New:
sage: timeit('prime_pi(100)') 625 loops, best of 3: 475 ns per loop sage: timeit('prime_pi(10000)') 625 loops, best of 3: 483 ns per loop
Old:
sage: timeit('prime_pi(100)') 625 loops, best of 3: 1.71 µs per loop sage: timeit('prime_pi(10000)') 625 loops, best of 3: 5.56 µs per loop
There is also a ~5x speedup (conservatively) for larger input:
New:
sage: timeit('prime_pi(10**8)') 625 loops, best of 3: 530 µs per loop sage: timeit('prime_pi(10**10)') 5 loops, best of 3: 27.1 ms per loop sage: time prime_pi(10**12) 37607912018 Time: CPU 1.53 s, Wall: 1.53 s
Old:
sage: timeit('prime_pi(10**8)') 125 loops, best of 3: 3.66 ms per loop sage: timeit('prime_pi(10**10)') 5 loops, best of 3: 161 ms per loop sage: time prime_pi(10**12) 37607912018 Time: CPU 8.65 s, Wall: 8.64 s
The user can also give a second argument to use additional memory for a potential speedup. All primes less than the second argument are used in computing pi(x). For example:
sage: time prime_pi(10**12, 10**8) 37607912018 Time: CPU 0.92 s, Wall: 0.92 s sage: time prime_pi(10**12, 10**8) # pari's prime table is now cached 37607912018 Time: CPU 0.58 s, Wall: 0.58 s
An earlier version of this implementation started to fail to give correct output on 64bit systems somewhere between 9.5*10**15
and 9.75*10**15
, for issues that I had previously been unable to determine, so I had initially limited input to be < 2**49
, which I am fairly confident is safe, as I have done a fair bit of testing in this range trying to debug this issue (despite it taking hours per call). I would appreciate testing on 32bit platforms.
A lot of the speed of this algorithm relies on the __cached_count
method, which is essentially a binary search (with a simple adjustment in the beginning that takes advantage of the density of the primes). This is the fastest type of binary search I know of, but if anyone knows of a way to speed it up, it could have a significant effect.
The other big contributor to the speed is the __smallPi
table, which stores the values of pi(x) for x < 2**16
. This is used in the same manner as __cached_count
, but is of course much faster.
Finally, I have introduced the legendres_formula
method, which takes two arguments, x
and a
, and returns the number of positive integers not larger than x
that are not divisible by any  i.e. coprime to each  of the first a
primes. This function is core to all combinatorial methods for computing the prime counting function, so I figured that I would make this publicly accessible now that I have clean code.
Apply trac_11475mk2.patch and trac_11475reviewer.patch to the Sage library.
Attachments (12)
Change History (95)
Changed 11 years ago by
Attachment:  trac_11475.patch added 

comment:1 Changed 11 years ago by
Description:  modified (diff) 

Status:  new → needs_review 
comment:2 followup: 5 Changed 11 years ago by
The link(s) to your paper is (currently) dead. (I wouldn't call it 'main' btw. ;) )
Haven't yet looked a the rest...
comment:3 Changed 11 years ago by
Authors:  ohanar → R. Andrew Ohana 

comment:4 followup: 6 Changed 11 years ago by
A lot of the speed of this algorithm relies on the cached_count method, which is essentially a binary search (with a simple adjustment in the beginning that takes advantage of the density of the primes)
Something like this should give you a 10% speedup for big inputs.
if p < 200: pos = p>>2 size = 5 else: size = self.__numPrimes>>1 # Use the expected density of primes for expected inputs to make an # educated guess if p>>3 < size: size = p>>3 pos = size
(beware to make sure there are enough primes cached for small inputs)
comment:5 Changed 11 years ago by
comment:6 Changed 11 years ago by
Replying to leif:
The link(s) to your paper is (currently) dead. (I wouldn't call it 'main' btw. ;) )
Sorry about that, I was planning on moving the folder, but I forgot to. I'll fix that tonight.
Replying to ylchapuy:
Something like this should give you a 10% speedup for big inputs.
Thanks, I'll try it out tonight, and update the description and patch if it proves fruitful.
comment:7 Changed 11 years ago by
Description:  modified (diff) 

Changed 11 years ago by
Attachment:  trac_11475v2.patch added 

Makes use of Yann's suggested improvement.
comment:8 followup: 9 Changed 11 years ago by
Andrew, do you have some examples of wrong prime counts (beyond 2^{49}), and if so, on what kind of systems does one experience them?
While the old implementation was said to fail on (at least some) 32bit systems for large inputs, the new one in contrast fails on 64bit systems?
(More or less just curious; perhaps I'll one day^{TM} try to track this down, depending on the affected system(s) as well.)
comment:9 Changed 11 years ago by
Replying to leif:
Andrew, do you have some examples of wrong prime counts (beyond 2^{49}), and if so, on what kind of systems does one experience them?
I did most of my testing on mod.math and sage.math, which is ubuntu linux (64bit). I don't have failing results any longer, but I can tell you that I discovered the issue with 10**16
, and I also know that it fails for 9.75*10**15
. Give me 78 hours and I'll get you the output for those inputs.
While the old implementation was said to fail on (at least some) 32bit systems for large inputs, the new one in contrast fails on 64bit systems?
I don't know how the new one behaves on 32bit systems, since I have done literally no testing. I hope it does better than the old one, since I used stdint this time around (and the only real difference between my 64bit vs 32bit implementation is the use of pointers in the later).
(More or less just curious; perhaps I'll one day^{TM} try to track this down, depending on the affected system(s) as well.)
comment:11 Changed 11 years ago by
Description:  modified (diff) 

comment:12 Changed 11 years ago by
Description:  modified (diff) 

Changed 11 years ago by
Attachment:  trac_11475v3.patch added 

Fixes doctests, and changed cached_primes again to improve speed
comment:14 Changed 11 years ago by
Description:  modified (diff) 

comment:15 Changed 11 years ago by
Cc:  KarlDieter Crisman added 

comment:16 followup: 17 Changed 11 years ago by
Replying to leif:
Andrew, do you have some examples of wrong prime counts (beyond 2^{49}), and if so, on what kind of systems does one experience them?
sage: time prime_pi(9.75*10**15) 272450165623660 Time: CPU 25614.37 s, Wall: 25614.47 s
The correct value is 272450167482953
(http://www.ieeta.pt/~tos/primes.html). I also computed prime_pi(10**16)
but in a moment of confusion, I killed the process before I grabbed the value. If you want it I can start running it again with my current version of the code (~35% faster than the latest patch).
comment:17 Changed 11 years ago by
Replying to ohanar:
Replying to leif:
Andrew, do you have some examples of wrong prime counts (beyond 2^{49}), and if so, on what kind of systems does one experience them?
sage: time prime_pi(9.75*10**15) 272450165623660 Time: CPU 25614.37 s, Wall: 25614.47 s
The correct value is
272450167482953
(http://www.ieeta.pt/~tos/primes.html). I also computedprime_pi(10**16)
but in a moment of confusion, I killed the process before I grabbed the value. If you want it I can start running it again with my current version of the code (~35% faster than the latest patch).
Never mind. I thought you had collected a bunch of false results during development. If (apparently) all platforms are affected, it should be easier to track this down  rather than flaws due to e.g. Apple "specifics"... ;)
Btw, I have a large collection of correct prime counts here, so if eventually the sysloads drop, I'll by myself check some results against your new implementation.
Anyway, anybody feel free to compute more values above our current limit and post them here, whether correct or not. :)
Changed 11 years ago by
Attachment:  trac_11475v4.patch added 

Some major speed improvements using ~4kB of tables
comment:18 Changed 11 years ago by
Description:  modified (diff) 

comment:19 Changed 11 years ago by
An update:
v4: added a table of prime_pi
values that can be stored in a byte  using this with the __cached_primes
method gave a giant speedup. Since the sum at the beginning of __phi
(and __phi32
) differs from 16*(x/77)
by a small factor that only depends on x%2310
, so adding a table with these factors results in another large speedup. These tables take a total of ~4kB.
v5: the tables introduced a significant initialization time for prime_pi
, and William asked me to fix this, so prime_pi
(and legendres_formula) only initialize these tables on an initial call  so initialization time is now ~650ns:
sage: timeit('sage.functions.prime_pi.PrimePi()') 625 loops, best of 3: 627 ns per loop
Both Yann and I have tested 9.75*10^15
again and got the correct result, I also did 10^16
. We both ran it on 64bit GNU/Linux systems. So the issue I was encountering may have been due to boundary issues with the __cached_primes
method (which have been rewritten over the course of the patches). I kept the limit in since I haven't yet tested this thoroughly.
comment:20 Changed 11 years ago by
Description:  modified (diff) 

comment:21 followup: 22 Changed 11 years ago by
I've so far only tested a little (on 64bit Linux; dead old 32bit box still building Sage), but already found one wrong result with patch v3:
502000000000000 15296923904889 # 11475v3 502000000000000 15296922045596 # correct result, (at least) doublechecked
Note that 502*10^{12} is below 2^{49} (562.949.953.421.312).
As my machines are still busy, you could perhaps check if the newer version(s) give the correct result.
comment:22 followup: 23 Changed 11 years ago by
Works on sage.math with 11475v5:
sage: time prime_pi(502000000000000) 15296922045596 Time: CPU 723.40 s, Wall: 723.43 s
comment:23 Changed 11 years ago by
Replying to ohanar:
Works on sage.math with 11475v5:
sage: time prime_pi(502000000000000) 15296922045596 Time: CPU 723.40 s, Wall: 723.43 s
Yep, here too; with both v4 and v5 (64bit Linux).
comment:24 followup: 25 Changed 11 years ago by
580000000000000 17596211452162 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 581000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 582000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 583000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 584000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 585000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 586000000000000 0
Hmm, really?
comment:25 Changed 11 years ago by
Replying to leif:
580000000000000 17596211452162 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 581000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 582000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 583000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 584000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 585000000000000 0 ^CException KeyboardInterrupt in 'sage.functions.prime_pi.PrimePi.__pi' ignored 586000000000000 0Hmm, really?
Oops, thankfully easily fixed. Shouldn't occur in the next patch (whenever that is).
comment:26 followup: 27 Changed 11 years ago by
Status:  needs_review → needs_work 

Ok, here are my first 32bit tests (with patch v5):
sage t "devel/sage11475v5/sage/functions/prime_pi.pyx" ********************************************************************** File "/home/leif/Sage/sage4.7/devel/sage11475v5/sage/functions/prime_pi.pyx", line 126: sage: prime_pi(10**10) Expected: 455052511 Got: 498132213 ********************************************************************** 1 items had failures: 1 of 14 in __main__.example_1 ***Test Failed*** 1 failures. For whitespace errors, see the file /home/leif/.sage//tmp/.doctest_prime_pi.py [6.1 s]  The following tests failed: sage t "devel/sage11475v5/sage/functions/prime_pi.pyx" Total time for all tests: 6.1 seconds real 0m6.293s user 0m5.328s sys 0m0.760s
With long tests included, some unhandled overflows occur:
sage t long "devel/sage11475v5/sage/functions/prime_pi.pyx" Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored Exception OverflowError: 'long int too large to convert to int' in 'sage.functions.prime_pi.PrimePi.__sieve' ignored ********************************************************************** File "/home/leif/Sage/sage4.7/devel/sage11475v5/sage/functions/prime_pi.pyx", line 126: sage: prime_pi(10**10) Expected: 455052511 Got: 498132213 ********************************************************************** File "/home/leif/Sage/sage4.7/devel/sage11475v5/sage/functions/prime_pi.pyx", line 231: sage: for n in (42,41..32): prime_pi(2**n) # long time (21s on mod.math, 2011) Expected: 156661034233 80316571436 41203088796 21151907950 10866266172 5586502348 2874398515 1480206279 762939111 393615806 203280221 Got: 2072485352 555689055 4249350341 1916569322 829105263 2484680881 3345496393 1649125207 813719441 403824770 203621732 ********************************************************************** 2 items had failures: 1 of 14 in __main__.example_1 1 of 14 in __main__.example_4 ***Test Failed*** 2 failures. For whitespace errors, see the file /home/leif/.sage//tmp/.doctest_prime_pi.py [52.1 s]  The following tests failed: sage t long "devel/sage11475v5/sage/functions/prime_pi.pyx" Total time for all tests: 52.2 seconds real 0m52.318s user 0m49.923s sys 0m0.956s
This is on Ubuntu 7.10 :) (P4 Northwood, gcc 4.2.1) with Sage 4.7.
comment:27 followup: 28 Changed 11 years ago by
Unfortunate, but kind of expected. It'll probably take a day or two for me to set up a 32bit vm to try to debug the issue (it isn't blatantly obvious to me what is going wrong).
comment:28 Changed 11 years ago by
Replying to ohanar:
(it isn't blatantly obvious to me what is going wrong).
Yep, especially the "long int doesn't fit into int" reminds me of 16bit systems... 8/
comment:29 Changed 11 years ago by
Description:  modified (diff) 

Status:  needs_work → needs_review 
comment:30 followup: 31 Changed 11 years ago by
Hopefully that was the only issue with 32bit systems. Basically gmp wants a 32bit int for mpz_set_ui on 32bit systems, so I gave it what it wanted (this was also causing the issue with the __sieve
function, since I use the mpz_sqrt
function to calculate the argument).
I am currently doing some more thorough testing with large values on a 64bit machine, so that we can be sure there isn't any more breakage on the stable platform.
comment:31 Changed 11 years ago by
Replying to ohanar:
Hopefully that was the only issue with 32bit systems.
v6 passes long tests on the 32bit Ubuntu 7.10 machine (in about a minute). More to come.
comment:33 followup: 35 Changed 11 years ago by
Status:  needs_review → needs_work 

I have verified that prime_pi
gives correct output for i*10**12
with 0 <= i <= 1000
on x86_64 :). I am probably going to remove the bound flag and instead put a warning about larger values not being well tested if/when they are called.
Unfortunately, I have also come across an issue with solaris 10 sparc (t2.math). The root of the cause is that mpz_export
has the bit ordering arguments flipped. I am not aware of another way to get unsigned ints out of gmp integers, nor am I aware of a way to check the platform from cython. Also, I don't know if other platforms have this same issue (solaris86 or mac powerpc in particular), so if someone could test that, please let me know.
Changed 11 years ago by
Attachment:  trac_11475v8.patch added 

100% doctest coverage + slightly faster init + solaris/sparc fix
comment:35 followups: 36 37 Changed 11 years ago by
Replying to ohanar:
I have verified that
prime_pi
gives correct output fori*10**12
with0 <= i <= 1000
on x86_64 :).
Besides a couple of other values on individual machines, I've successfully computed pi(2^{k}) for k=0...53 and pi(10^{k}) for k=0...16 on each of three different machines (two of them 32bit; all with patch version 6).
Some computations are still in progress, but so far I could also verify the results of prime_pi(2**54)
and prime_pi(2**55)
(on at least one machine). I've also played a little with "legendres_formula()
".
I am probably going to remove the bound flag and instead put a warning about larger values not being well tested if/when they are called.
Yes, I also think we can or should drop any "artificial" limit now, though I must admit I haven't really looked at the code yet...
I wonder if legendres_formula()
shouldn't be renamed to legendre_phi()
 better names appreciated, but the latter ( \phi(x,a) ) seems to be quite established in literature, and legendre_phi
would be analogous to e.g. euler_phi
in Sage. (I know Wolfram and others call "it"  i.e., IMHO the method computing the respective function rather than the function itself  Legendre's Formula as well, but it's not that unambiguous.)
I'd also mention in its docstring it's called the partial sieve function.
comment:36 Changed 11 years ago by
Replying to leif:
Replying to ohanar:
I am probably going to remove the bound flag and instead put a warning about larger values not being well tested if/when they are called.
Yes, I also think we can or should drop any "artificial" limit now, though I must admit I haven't really looked at the code yet...
Ah, I see, you already removed the restriction in patch v8 (and also added warnings w.r.t. seemingly "endless" computations, which I also wanted to suggest).
There's a small typo in __call__()
's docstring (the 2**32
should read 2**64
):
This implementation uses unsigned 64bit ints and thus does not support `x >= 2**32`:: sage: prime_pi(2^64) Traceback (most recent call last): ... NotImplementedError: computation of prime_pi for x >= 2^64 is not implemented
Also, there's a mixture of both kinds of exponentiation operators (^
and **
) throughout the [doc]strings; I'm not sure if we shouldn't use just one notation.
comment:37 followup: 38 Changed 11 years ago by
Replying to leif:
Replying to ohanar:
I have verified that
prime_pi
gives correct output fori*10**12
with0 <= i <= 1000
on x86_64 :).Besides a couple of other values on individual machines,
Which machines have you had issues with? Everything I have tested with v8 has worked (this includes x86 and x86_64 linux, 64bit Snow Leopard, and Sparc Solaris 10).
I wonder if
legendres_formula()
shouldn't be renamed tolegendre_phi()
 better names appreciated, but the latter ( \phi(x,a) ) seems to be quite established in literature, andlegendre_phi
would be analogous to e.g.euler_phi
in Sage.
Makes sense to me, if no one has an objection to the idea, I'll make the change.
I'd also mention in its docstring it's called the partial sieve function.
Sounds good to me.
Replying to leif:
Also, there's a mixture of both kinds of exponentiation operators (
^
and**
) throughout the [doc]strings; I'm not sure if we shouldn't use just one notation.
Will do  probably ^
since it is more consumer friendly.
comment:38 Changed 11 years ago by
Replying to ohanar:
Replying to leif:
Replying to ohanar:
I have verified that
prime_pi
gives correct output fori*10**12
with0 <= i <= 1000
on x86_64 :).Besides a couple of other values on individual machines,
Which machines have you had issues with?
Oh, I didn't get any wrong results; I just meant I've also tested prime_pi(x)
with some "random" values (besides the mentioned powers of 2 and ten), but not systematically on all machines (all Linux boxes; Pentium4 Northwood, P4 Prescott and Core2).
comment:39 Changed 11 years ago by
Description:  modified (diff) 

Computation of prime_pi(2**56)
just finished with the correct result on one of the 32bit machines :) (took 2 days+).
[Updated description w.r.t. (now obsolete) limitation / potentially wrong results for inputs exceeding 2^{49} with earlier versions.]
comment:40 Changed 11 years ago by
So through the updates, I haven't been keeping legendres_formula
up to date. I am posting a patch that should fix the bugs that were present for the past few versions (primarily, calling it with a relatively small x
in comparison to a
would give an extremely wrong answer due to overflow issues). I have also changed the name, so it now is called either legendre_phi
or partial_sieve_function
.
The patch also makes all documentation use ^
, there are no longer any instances of **
.
The patch also touches up on __phi
and __phi32
 cleaning up code, adding a few comments, and makes a simple optimization to reduce boolean checks for small x
values (~10% speedup for large input).
I'm hoping this will be the final patch for this ticket, so that I don't go into the double digits for version numbers :).
Apply only trac_11475v9.patch
comment:41 Changed 11 years ago by
Description:  modified (diff) 

The new version comes in the right moment as the slow 32bit machine just finished computing prime_pi(2^57)
(successfully; still with v6 though). No separate timings, but pi(2^{54}) ... pi(2^{57}) took 7.5 days in total. :)
At first glance, there are only a few typos:
 There's a superfluous
+
in the computation ofself.__tabS[i]
(continuation line).  The 2^{49} remains renitent:
cdef uint64_t __pi(self, uint64_t x) except 1: r""" Returns :math:`\pi(x)` and assumes :math:`self.__maxSieve < x < 2^{49}` """
s/
number of primes/
number of positive integers/
:cdef uint64_t __phi(self, uint64_t x, uint64_t i): r""" Legendre's formula: returns the number of primes :math:`\leq` ``x`` that are not divisible by the first ``i`` primes """
comment:42 Changed 11 years ago by
Reviewers:  → Yann LaigleChapuy, Leif Leonhardy 

So far passed all [long] tests on Ubuntu 7.10 x86 (Sage 4.7) and Ubuntu 9.04 x86_64 (Sage 4.7.1.alpha3; patch to module_list.py
applies with 1 line offset).
That's of course a bit funny (I don't mind):
sage: legendre_phi(2^63,0) computation of legendre_phi for large x can take minutes, hours, or days depending on the size of x computation of legendre_phi for x >= 10^15 has not been thoroughly tested, be cautious of the result 9223372036854775808 sage:
We may also add "or weeks", but the current version is faster than the previous ones and much faster than the current version in Sage of course (not to mention its odd limitation).
There are opportunities for further improvements (e.g. copying the whole list of primes when we need more; perhaps some more comments / documentation), but these can IMHO be left to followup tickets.
In principle positive review from me (minus the typos), just hesitating to give it at this point without having studied the code more thoroughly.
comment:43 followups: 44 46 Changed 11 years ago by
Another corner case:
sage: legendre_phi(1000,10^10)  NotImplementedError Traceback (most recent call last) /home/leif/Sage/sage4.7.1.alpha3/devel/sage11475v9/<ipython console> in <module>() /home/leif/Sage/sage4.7.1.alpha3/local/lib/python2.6/sitepackages/sage/functions/prime_pi.so in sage.functions.prime_pi.legendre_phi (sage/functions/prime_pi.c:5888)() /home/leif/Sage/sage4.7.1.alpha3/local/lib/python2.6/sitepackages/sage/functions/prime_pi.so in sage.functions.prime_pi.legendre_phi (sage/functions/prime_pi.c:5517)() NotImplementedError: computing legendre_phi for a > prime_pi(2^321) not implemented sage:
(The message is a bit misleading by the way, since we limit x
to 64 bits, and therefore the result is independent of a >= 2^32
. If at all, I'd also give the numerical upper limit for a
, i.e. 203280221. But as I said, there isn't really any limit on a
.)
Also, don't try this at home:
sage: time legendre_phi(1000,10^8)
(I fortunately could kill the process before the almost freezed machine went out of swap space. Apparently computing the list of primes is uninterruptable, which worsens the situation.)
comment:44 Changed 11 years ago by
Replying to leif:
The message is a bit misleading by the way, since we limit
x
to 64 bits, and therefore the result is independent ofa >= 2^32
.
More precisely: independent of a >= prime_pi(2^32)
. ;)
comment:45 Changed 11 years ago by
I'd also use int_fast8_t
and uint_fast8_t
instead of int8_t
and uint8_t
, since size doesn't really matter there:

sage/functions/prime_pi.pyx
diff git a/sage/functions/prime_pi.pyx b/sage/functions/prime_pi.pyx
a b 32 32 include '../ext/stdsage.pxi' 33 33 include '../ext/interrupt.pxi' 34 34 35 from libc.stdint cimport int 8_t, uint8_t, uint32_t, uint64_t35 from libc.stdint cimport int_fast8_t, uint_fast8_t, uint32_t, uint64_t 36 36 from sage.rings.integer cimport Integer 37 37 from sage.rings.real_mpfr import RealLiteral, create_RealNumber, RR 38 38 from sage.rings.arith import nth_prime … … 134 134 cdef uint32_t __numPrimes 135 135 cdef uint32_t __maxSieve 136 136 cdef uint64_t __maxSieve64 137 cdef int 8_t[2310] __tabS138 cdef uint 8_t[1619] __smallPi137 cdef int_fast8_t[2310] __tabS 138 cdef uint_fast8_t[1619] __smallPi 139 139 140 140 def __dealloc__(self): 141 141 if self.__numPrimes:
comment:46 followup: 47 Changed 11 years ago by
Replying to leif:
Another corner case:
sage: legendre_phi(1000,10^10)  NotImplementedError Traceback (most recent call last) /home/leif/Sage/sage4.7.1.alpha3/devel/sage11475v9/<ipython console> in <module>() /home/leif/Sage/sage4.7.1.alpha3/local/lib/python2.6/sitepackages/sage/functions/prime_pi.so in sage.functions.prime_pi.legendre_phi (sage/functions/prime_pi.c:5888)() /home/leif/Sage/sage4.7.1.alpha3/local/lib/python2.6/sitepackages/sage/functions/prime_pi.so in sage.functions.prime_pi.legendre_phi (sage/functions/prime_pi.c:5517)() NotImplementedError: computing legendre_phi for a > prime_pi(2^321) not implemented sage:(The message is a bit misleading by the way, since we limit
x
to 64 bits, and therefore the result is independent ofa >= 2^32
. If at all, I'd also give the numerical upper limit fora
, i.e. 203280221. But as I said, there isn't really any limit ona
.)
Of course, I wrote the method as a wrapper for phi, so I was mainly thinking of what I needed to call that method. I can easily remove this bound.
Also, don't try this at home:
sage: time legendre_phi(1000,10^8)(I fortunately could kill the process before the almost freezed machine went out of swap space. Apparently computing the list of primes is uninterruptable, which worsens the situation.)
When fixing the above issue, I can fix this. This same issue will appear for very large x
for prime_pi
, this can be blamed on PARI  it stores the primes inefficiently and will use ~17GB for a sieve up to 2**321
(my code should use ~1GB at this size). I can implement a custom sieve if it this poses much of an issue (eventually though something needs to be done to fix prime_range
).
comment:47 followups: 48 49 Changed 11 years ago by
Replying to ohanar:
Replying to leif:
(I fortunately could kill the process before the almost freezed machine went out of swap space. Apparently computing the list of primes is uninterruptable, which worsens the situation.)
When fixing the above issue, I can fix this. This same issue will appear for very large
x
forprime_pi
, this can be blamed on PARI  it stores the primes inefficiently and will use ~17GB for a sieve up to2**321
.
Ouch. (Or should it read 1.7 GB, with ~8 bytes per prime? Otherwise there would be a real hardware limit  max. 2 or 4 GB address space  on 32bit machines and OSs I haven't yet met during the tests.)
I thought the huge memory consumption was mostly the fault of prime_range()
(and Python), and to some extent of having up to three or four lists or arrays of primes at the same time (PARI + Python + perhaps two Cython arrays while copying).
Would using Python ints (py_ints=True
parameter to prime_range()
) alleviate the situation in any way?
One could also request the primes in chunks rather than the whole interval at once, or implement some generator class or function that calls PARI.
Using sage_realloc()
would also reduce the (peak) physical memory requirement.
I can implement a custom sieve if this poses much of an issue (eventually though something needs to be done to fix
prime_range
).
Hmmm. I thought we could avoid using prime_range()
later, on a followup, e.g. by directly accessing PARI from Cython. But feel free to implement or use some other sieve / function.
At least potentially freezing the machine (which IMHO is an OS problem) isn't very userfriendly, especially in educational software I think...
comment:48 Changed 11 years ago by
Replying to leif:
Replying to ohanar:
This same issue will appear for very large
x
forprime_pi
, this can be blamed on PARI  it stores the primes inefficiently and will use ~17GB for a sieve up to2**321
.
That might have been the case with older PARI versions; if I do
sage: time pari.init_primes(2^32) Time: CPU 13.42 s, Wall: 18.47 s
the whole process just occupies about 283 MB (89 MB thereof basic Sage). And yes, all have been computed, as one can "verify" with e.g. the time it takes to get some primes around the current "end":
sage: prime_range(2^32,2^32+100) # immediate answer if init_primes(2^32) has been called in advance [4294967311, 4294967357, 4294967371, 4294967377, 4294967387, 4294967389]
The fatal things are commands like this:
sage: prime_range(2^30)[1]
which turns the relative prime offsets stored by PARI (hardly more than one byte per prime on average) into a list of boxed integers.
comment:49 Changed 11 years ago by
Replying to leif:
Using
sage_realloc()
would also reduce the (peak) physical memory requirement.
How long has sage had sage_realloc
? I could have sworn that it didn't exist a year ago.
I can implement a custom sieve if this poses much of an issue (eventually though something needs to be done to fix
prime_range
).Hmmm. I thought we could avoid using
prime_range()
later, on a followup, e.g. by directly accessing PARI from Cython. But feel free to implement or use some other sieve / function.
I have am working on v10, which will primarily change the __sieve
function to directly access PARI and use sage_realloc
. I have this working on 64bit linux right now:
sage: get_memory_usage() 844.1484375 sage: time legendre_phi(1000,10^8) 1 Time: CPU 22.55 s, Wall: 22.54 s sage: get_memory_usage() 1406.78515625
Obviously I haven't made all the changes that I need to with legendre_phi
, however it removes the potentially insane memory issues with using prime_range
. Worst case scenario for storage now:
sage: time legendre_phi(1000, 203280221) 1 Time: CPU 46.85 s, Wall: 46.86 s sage: get_memory_usage() 1975.58203125
comment:50 Changed 11 years ago by
Just for the record:
The second 32bit machine just successfully computed prime_pi(2^58)
(still patch version 6), which took about ten days on an already fully loaded system.
Changed 11 years ago by
Attachment:  trac_11475v10.patch added 

rewrote sieve (now goes directly through pari) + some significant changes to legendre_phi
comment:51 Changed 11 years ago by
Been busy, but I finally managed to get another patch (after running into quite a number of issues). This has some pretty significant changes  specifically to legendre_phi
and __sieve
.
legendre_phi
now has keyboard interrupt support :). Also, it removes the limit to a
as well as make some other improvements to speed for various special cases.
__sieve
was rewritten to remove the use of prime_range
and now works directly through PARI. Also sage_realloc
was added (I wasn't aware that sage had realloc
, I could have sworn that I tried using it before with no luck).
Apply only trac_11475v10.patch
comment:52 Changed 11 years ago by
Description:  modified (diff) 

Changed 11 years ago by
Attachment:  trac_11475v10_reviewer_1.sagelib.patch added 

Sage library patch. Apply on top of Andrew's v10. (Actually also based on Sage 4.7.1.alpha4.)
comment:53 Changed 11 years ago by
Description:  modified (diff) 

I've added a reviewer patch to Andrew's version 10:
 PARI's
diffptr
isuint8_t*
(actuallyunsigned char*
orbyteptr
); usinguint_fast8_t*
there would fail on any machine on whichsizeof(uint8_t) != sizeof(uint_fast8_t)
.  Use the more portable
ptrdiff_t
for "__diffptrOff
" (which will then also be smaller on 32bit machines).  Change order of libraries in
module_list.py
, as PARI also depends on GMP, to support dumber linkers (necessary for at least Cygwin I guess). stdint.h
is C99, so add the corresponding compiler flag.
The superfluous +
is still in, missed to fix that in my reviewer patch.
I haven't yet really looked at the rest of the new code (i.e., v10), only performed the standard tests (long
) on both Ubuntu 9.04 x86 and x86_64 (Sage 4.7 and Sage 4.7.1.alpha4, respectively), which passed (seemingly slightly faster, but I may be wrong ;) ).
The module_list.py
part of Andrew's v10 patch applies to Sage 4.7.1.alpha4, too, but with an offset of 26 lines, so should perhaps be rebased.
(Unfortunately my second 32bit machine apparently died, on which I mainly tested prime_pi()
for larger values previously.)
comment:54 Changed 11 years ago by
Description:  modified (diff) 

So how the prime list is created and managed was bothering me. Specifically, its initiation code uses a bit of a hack to figure out whether or not it was called by init_tables and also uses pari(n).primepi()
, which without #11741 can be quite buggy for larger n; it can consume a rather large amount of memory if the user gives a large input; and the user does not have any option on how big it is.
So I've completely rewritten the initiation method, it now recursively calls the pi method to determine the length of the prime list. I've also completely I've removed the caching part of the code, since we can quickly copy the primes over from pari's list, and instead added an optional second argument that allows the user to use additional memory if so desired. I've also tweaked the size of the smallPi list to give another significant speedup (roughly 50% faster than the last patch), as well as made a couple of simplifications in phi32.
Beyond those changes, I've refactored a lot of code, replaced cython code with python for the noncritical sections in order to improve readability, reverted the plot method to something very close to its original state (since we are no longer caching), and folded in Leif's reviewer patch. The code is now 552 lines versus the previous 625.
comment:56 Changed 11 years ago by
Status:  needs_review → needs_work 

According to the patchbot logs there are some issues with documentation formatting:
docstring of sage.functions.prime_pi.legendre_phi:39: WARNING: Duplicate explicit target name: "rao2011". docstring of sage.functions.prime_pi.partial_sieve_function:39: WARNING: Duplicate explicit target name: "rao2011". docstring of sage.functions.prime_pi.legendre_phi:39: WARNING: duplicate citation RAO2011, other instance in /storage/masiao/sage5.0.beta8/devel/sage/doc/en/reference/sage/functions/prime_pi.rst docstring of sage.functions.prime_pi.partial_sieve_function:39: WARNING: duplicate citation RAO2011, other instance in /storage/masiao/sage5.0.beta8/devel/sage/doc/en/reference/sage/functions/prime_pi.rst
This is basically because there should be only one copy of the bibliography entry in the file, rather than having it afresh in every docstring that cites it.
Also, shouldn't the class PrimePi derive from sage.symbolic.function.BuiltinFunction?, rather from the base Object class?
comment:57 Changed 11 years ago by
Status:  needs_work → needs_review 

Ok, both issues should be resolved  it should now work properly in the symbolic ring.
comment:60 Changed 11 years ago by
On the plus side, this fixes some of #6876. On the down side:
WARNING: we draw the plot of ``prime_pi`` as a stairstep function with explicitly drawn vertical lines where the function jumps. Technically Technically there should not be any vertical lines, but they make the graph look much better, so we include them. Use the option ``vertical_lines=False`` to turn these off.
I'm adding a fix for the latter, and a doctest for the former. Shouldn't affect anything.
comment:61 Changed 11 years ago by
Keywords:  sd40.5 added 

Reviewers:  Yann LaigleChapuy, Leif Leonhardy → Yann LaigleChapuy, Leif Leonhardy, William Stein, KarlDieter Crisman 
comment:62 Changed 11 years ago by
Description:  modified (diff) 

Patchbot: Apply trac_11475mk2.patch and trac_11475reviewer.patch.
Changed 11 years ago by
Attachment:  trac_11475reviewer.patch added 

comment:63 Changed 11 years ago by
William pointed out that this didn't fix #6876 quite how we want, so just fixing the typo.
comment:64 Changed 10 years ago by
Status:  positive_review → needs_work 

This fails on arando (Linux Ubuntu 12.04 i686, 32 bit):
sage t long force_lib devel/sage/sage/functions/prime_pi.pyx ********************************************************************** File "/var/lib/buildbot/build/sage/arando1/arando_full/build/sage5.1.beta4/devel/sagemain/sage/functions/prime_pi.pyx", line 123: sage: prime_pi(10^10) Expected: 455052511 Got: 70462980 ********************************************************************** File "/var/lib/buildbot/build/sage/arando1/arando_full/build/sage5.1.beta4/devel/sagemain/sage/functions/prime_pi.pyx", line 231: sage: for i in (32..42): prime_pi(2^i) # long time (13s on sage.math, 2011) Expected: 203280221 393615806 762939111 1480206279 2874398515 5586502348 10866266172 21151907950 41203088796 80316571436 156661034233 Got: 0 0 0 0 0 0 0 0 0 0 0 **********************************************************************
comment:66 Changed 10 years ago by
Status:  needs_work → needs_review 

comment:69 Changed 10 years ago by
Status:  needs_review → positive_review 

Jeroen: now it passes on arando. Can you check on hawk?
comment:70 Changed 10 years ago by
Milestone:  sage5.1 → sage5.2 

comment:71 Changed 10 years ago by
Merged in:  → sage5.2.beta1 

Resolution:  → fixed 
Status:  positive_review → closed 
comment:72 Changed 10 years ago by
Why does this ticket make random whitespace changes to sage/functions/all.py
? This is total BS. This just breaks patches on trac and wastes other people's time. For example
#11143 doesn't apply any more.
comment:73 Changed 10 years ago by
Merged in:  sage5.2.beta1 

Milestone:  sage5.2 → sage5.3 
Resolution:  fixed 
Status:  closed → new 
On bsd.math (OS X 10.6 x86_64), I still get
sage t long force_lib devel/sage/sage/modular/abvar/torsion_subgroup.py ********************************************************************** File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/devel/sagemain/sage/modular/abvar/torsion_subgroup.py", line 316: sage: G.multiple_of_order() Exception raised: Traceback (most recent call last): File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/bin/ncadoctest.py", line 1231, in run_one_test self.run_one_example(test, example, filename, compileflags) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/bin/sagedoctest.py", line 38, in run_one_example OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/bin/ncadoctest.py", line 1172, in run_one_example compileflags, 1) in test.globs File "<doctest __main__.example_9[7]>", line 1, in <module> G.multiple_of_order()###line 316: sage: G.multiple_of_order() File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/abvar/torsion_subgroup.py", line 375, in multiple_of_order f = A.hecke_polynomial(p) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/abvar/abvar.py", line 2239, in hecke_polynomial f = self._compute_hecke_polynomial(n, var=var) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/abvar/abvar.py", line 3800, in _compute_hecke_polynomial return sqrt_poly(self.modular_symbols().hecke_polynomial(n, var)) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/module.py", line 1460, in hecke_polynomial return self.hecke_operator(n).charpoly(var) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/hecke_operator.py", line 274, in charpoly return self.matrix().charpoly(var) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/hecke_operator.py", line 747, in matrix self.__matrix = self.parent().hecke_matrix(self.__n, *args, **kwds) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/algebra.py", line 589, in hecke_matrix return self.__M.hecke_matrix(n, *args, **kwds) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/module.py", line 1351, in hecke_matrix T = self._compute_hecke_matrix(n) File "/Users/buildbot/build/sage/bsd1/bsd_64_full/build/sage5.2.beta1/local/lib/python/sitepackages/sage/modular/hecke/submodule.py", line 239, in _compute_hecke_matrix return A.restrict(self.free_module(), check=check) File "matrix2.pyx", line 4202, in sage.matrix.matrix2.Matrix.restrict (sage/matrix/matrix2.c:23293) File "element.pyx", line 2736, in sage.structure.element.Matrix.__mul__ (sage/structure/element.c:19170) File "coerce.pyx", line 740, in sage.structure.coerce.CoercionModel_cache_maps.bin_op (sage/structure/coerce.c:6778) File "action.pyx", line 147, in sage.matrix.action.MatrixMatrixAction._call_ (sage/matrix/action.c:3098) File "matrix_rational_dense.pyx", line 1047, in sage.matrix.matrix_rational_dense.Matrix_rational_dense._matrix_times_matrix_ (sage/matrix/matrix_rational_dense.c:12115) File "matrix_rational_dense.pyx", line 1090, in sage.matrix.matrix_rational_dense.Matrix_rational_dense._multiply_over_integers (sage/matrix/matrix_rational_dense.c:12248) RuntimeError: Segmentation fault **********************************************************************
comment:74 followup: 75 Changed 10 years ago by
Do either of these cause a segfault?:
sage: prime_pi(2**15) 3512 sage: prime_pi(2**101) 172
Those are the only two calls made to prime_pi
made by that doctest.
comment:75 Changed 10 years ago by
Replying to ohanar:
Do either of these cause a segfault?:
sage: prime_pi(2**15) 3512 sage: prime_pi(2**101) 172
No.
comment:76 Changed 10 years ago by
Status:  new → needs_review 

Ok, OSX issue should now be fixed (all tests pass for me on bsd.math now).
comment:77 Changed 9 years ago by
Milestone:  sage5.11 → sage5.12 

comment:78 Changed 9 years ago by
Branch:  → u/ohanar/prime_pi 

Dependencies:  → #15138 
this patch has bitrotted, I'll upload a git branch in a moment
comment:79 Changed 9 years ago by
Commit:  → e3e33e62cc19bf93b5236e7d1177482bd52b8a9e 

Branch pushed to git repo; I updated commit sha1. New commits:
[changeset:e3e33e6]  Trac 11475 reviewer patch  very minor prime_pi plot doc improvement 
[changeset:a3a0edd]  fix doctest 
[changeset:ef57887]  remove trailing \'s and other little cleanup 
[changeset:ed701d3]  prime_pi: fix off by 1 mallocing issue 
[changeset:bc36b43]  5x speedup to prime_pi 
comment:80 Changed 9 years ago by
Milestone:  sage6.1 → sage6.2 

comment:81 Changed 9 years ago by
Milestone:  sage6.2 → sage6.3 

comment:82 Changed 9 years ago by
Status:  needs_review → positive_review 

The patchbot experienced some glitches on this ticket, so I tested it again; all tests pass and the documentation builds. From the comments, it appears that the ticket was already closed, then reopened because of a segfault on OS X 10.6 x86_64, and that this bug was subsequently fixed. Hence let's put it back to positive review.
comment:83 Changed 9 years ago by
Branch:  u/ohanar/prime_pi → e3e33e62cc19bf93b5236e7d1177482bd52b8a9e 

Resolution:  → fixed 
Status:  positive_review → closed 
based on sage4.7