Ticket #7539 (new enhancement)

Opened 3 years ago

Last modified 16 months ago

primesieve spkg

Reported by: GeorgSWeber Owned by: rohana
Priority: major Milestone: sage-5.10
Component: number theory Keywords:
Cc: kevin.stueve, robertwb, was, victor, leif Work issues:
Report Upstream: N/A Reviewers:
Authors: rohana Merged in:
Dependencies: Stopgaps:

Description (last modified by rohana) (diff)

This ticket was split away from #7013:

The goal of this ticket is to create an spkg for primesieve ( http://primesieve.googlecode.com) which provides the functionality of primesieve is a shared library.

Install  primesieve-3.4.p2.spkg.

Apply  trac7539_initial.patch and either  trac7539_standard.patch xor  trac7539_optional.patch.

Attachments

Leif_attachment1.zip Download (17.0 KB) - added by kevin.stueve 3 years ago.
Provided by Leif Leonhardy in his first email
Silva.c.additional.diff Download (535 bytes) - added by kevin.stueve 3 years ago.
Provided by Leif Leonhardy in his second email
Leif_attachment3.zip Download (10.5 KB) - added by kevin.stueve 3 years ago.
Provided by Leif Leonhardy in his third email
Silva.c.orig Download (17.1 KB) - added by mvngu 3 years ago.
original version of Silva.c
01_Silva.patch Download (6.2 KB) - added by mvngu 3 years ago.
first patch of Leif_attachment1.zip, i.e. Silva.c.diff0
02_Silva.patch Download (4.4 KB) - added by mvngu 3 years ago.
Leif_attachment1.zip, patch Silva.c.diff: apply all changes in Silva.c.diff that are not in Silva.c.diff0
03_Silva.patch Download (520 bytes) - added by mvngu 3 years ago.
add changes from Silva.c.additional.diff
04_Silva.patch Download (10.3 KB) - added by mvngu 3 years ago.
add changes from Silva.c.bounds-restriction-removed.diff
Silva.c Download (25.7 KB) - added by mvngu 3 years ago.
version of Silva.c after applying the previous 4 patches

Change History

comment:1 Changed 3 years ago by kevin.stueve

It is appropriate to give credit to the sources I used in making the tables of offsets between prime_pi and a logarithmic integral approximation. I used an asymptotic approximation to the logarithmic integral provided by Fredrik Johansson and the tables of prime_pi from  http://www.primefan.ru/stuff/primes/table.html. And the idea of combining a table of values of prime_pi and sieving for fast computation of the prime_pi was provided by the nth prime page at  http://primes.utm.edu/nthprime/

The primefan tables are extensive, and are from a combined effort of several individuals, including Andrey V. Kulsha, Anatoly F. Selevich, and Tomás Oliveira e Silva. It is likely that the total computation time required to generate these tables was many processor years.

As processor time (for the developer), hard-drive storage (for the client), and Internet bandwidth (for the client) become cheaper and more available, I would like to see published tables of prime_pi to expand (and to see resources put toward such computation). It would be nice for the number of values in the prime_pi tables in Sage to increase every few years as the computation of the tables becomes easier for the developer and the download and storage of the tables becomes easier for the client. Maybe several table sizes could be offered. A standard table could be the default for most users, but for those who are studying the distribution of prime numbers and desire speed over extra storage space on their hard-drive (and for bragging rights for Sage), larger tables could be offered. A web-service could even be offered to query a very large table of values of the prime counting function (the additional sieving could be done either on the server or client). Of course, it would be very important to have redundancy in such large calculations (the generation of the prime_pi tables), as there have been incorrect values of the prime counting function published several times in the past.

Kevin Stueve

comment:2 Changed 3 years ago by kevin.stueve

Leif Leonhardy sent some very useful contributions over the past days. Following is our correspondence.


*Friday December 4, 2009: Hi Kevin,

I've made some changes to [your version of] TOS's prime_sieve.c, making it more portable and removing the 4-byte pointer constraint (i.e. fixing your "-m64" problem).

The original version also contains an overflow bug on large intervals (>=236). (Another overflow still occurs [very] near to 264, because the sum of [the muted] main_base and numbers_per_segment might [or will] exceed 64 bits.) For use with Sage, the modulo-64* limitation on the intervals' bounds should be removed, too (easy, but it didn't disturb me yet ;-) ). I also fixed your parameter handling and output of the result.

If you intend to build a library rather than a stand-alone program the memory management should be rewritten (and globals removed). Running many sieving threads on multi-core CPUs doesn't make much sense because memory (consumption and traffic) is the bottleneck; the sieving primes lists ("main_lists") can't be shared among tasks (TOS's optimization is a "sequential" one in that sense).

Note that TOS's implementation, as he stated, was created mainly for illustration purposes - "simple but serious" - and wasn't intended (and doesn't claim) to be "fully optimized" or optimal in all aspects.

The default values for sieve (segment) and bucket size were appropriate in 2002; for contemporary hardware, both should be increased. I got best results with the (currently) maximal bucket size of 64KB, and sieve segment sizes of 256KB and 1MB on an Intel Pentium 4 Prescott (1MB L2 cache) and Intel Core2 with 6MB L2 cache per core-pair (e.g. E8400, Q9x50), respectively. Also, native code ("-m64") runs faster on x86_64 platforms, even with u32 as the sieve basetype; using 64- rather than 32-bit words for the sieve has probably greater effect on platforms like SPARC64, I guess. (Compile with "-DSIEVE_BASETYPE_U64" to get that.)

I've attached your version of prime_sieve.c ("Silva.c.orig", 09/04/09), too, since I'm not sure I fetched the latest. The first diff contains a (working) subset of the changes I made, the second contains all changes. A ChangeLog? (and adaption of TOS's "#if 0" test code) is in the pipeline... :/

Have fun, -Leif


*My reply, December 4th 2009 Sounds great [...]

On a Macbook pro, I have found that dualthreaded (simply calling TOS twice for two adjacent intervals) is about 1.71 times faster. I hope that it might be possible to improve that. [...] Kevin Stueve


*From Leif, December 4th 2009 Hi again,

I've just found another little 64bit-pointer bug in get_memory(), the attached diff contains only that patch.

-Leif


*From Leif, December 5th, 2009 Hi Kevin,

I've fixed the modulo-64 (or modulo-128 for 64-bit sieve words) restriction on interval bounds (by enlarging the sieved interval and subtracting the "extra" primes found if necessary); upper_bound==lower_bound is allowed now, too. [Adapting the Python code is up to you ;-)]

The majority of "#ifdef SIEVE_BASETYPE_U64"s has also been removed in favour of conditional macros and constants defined at the beginning.

-Leif

P.S.: By "many sieving threads" (currently: processes) I meant more than 2 or 4; but even with only few instances the processor's cache(s) might begin thrashing and we might run out of physical RAM; the behaviour can depend on not only the specific hardware but compile-time parameters (and does of course depend on the interval's lower bound), too. [Even small intervals above 4e18 require ~0.8GB (per process), intervals near 264 approx. 1.6GB. In worst case the machine starts swapping and multiple processes run slower, taking more time than a single would have taken. If I could get rid of the megs of sieving primes, I'd use the hundreds of cores on modern GPUs. :-) ]

Changed 3 years ago by kevin.stueve

Provided by Leif Leonhardy in his first email

Changed 3 years ago by kevin.stueve

Provided by Leif Leonhardy in his second email

Changed 3 years ago by kevin.stueve

Provided by Leif Leonhardy in his third email

Changed 3 years ago by mvngu

original version of Silva.c

comment:3 Changed 3 years ago by mvngu

From IRC:

06:44 < mvngu> The attachments look like a mess to me.
06:45 < SageWWW> They are supposedly diffs
06:45 < mvngu> I downloaded the first attachment, uncompressed it, and can't 
               work out which patches to apply in which order.
06:49 < SageWWW> "The first diff contains a (working) subset of the changes I 
                 made, the second contains all changes."Leif
06:50 < SageWWW> That is in his first email.  And he also included the original
06:51 < mvngu> So Silva.c.orig is the original version. And Silva.c is the 
               newer version with all changes in the diff's?
06:51 < SageWWW> Perhaps.  It would be easier to tell if Leif added himself to 
                 the authors section of the updated code
06:53 < mvngu> The diff's don't look like standard diff formats. At least when 
               I apply them with the command "patch", they failed to apply.
06:56 < SageWWW> It is extra confusing that Andrew Ohana and Leif Leonhardy 
                 both made their own updated version of Kevin Stueve's updated 
                 version of Oliviera e Silva's code.
06:56 < mvngu> I think if you diff Silva.c.orig against Silva.c, you get the 
               differences in the newer version (which is Silva.c).

The attachments look confusing to me. So I have attached what I think is the sequence of patches to apply on top of the original C code version Silva.c.orig. Starting from Silva.c.orig, apply the three patches in this order:

  1. 01_Silva.patch
  2. 02_Silva.patch
  3. 03_Silva.patch

After applying the above three patches on top of Silva.c.orig, you get the file Silva.c. I hope my attachments would resolve some of the confusion.

Changed 3 years ago by mvngu

first patch of Leif_attachment1.zip, i.e. Silva.c.diff0

Changed 3 years ago by mvngu

Leif_attachment1.zip, patch Silva.c.diff: apply all changes in Silva.c.diff that are not in Silva.c.diff0

Changed 3 years ago by mvngu

add changes from Silva.c.additional.diff

Changed 3 years ago by mvngu

add changes from Silva.c.bounds-restriction-removed.diff

Changed 3 years ago by mvngu

version of Silva.c after applying the previous 4 patches

comment:4 Changed 3 years ago by mvngu

From IRC:

07:57 < SageWWW> mvngu, what do you think Leif meant by a working subset of 
                 changes?
08:02 < mvngu> SageWWW: I think it's removing the 4-byte pointer constraint. 
               The working changes in the first diff now also work for 8-byte 
               pointers.
08:03 < SageWWW> It is still confusing.  Leif provided 4 patches total.
08:03 < mvngu> I know. You could ask for clarification on the ticket. 
08:03 < mvngu> Do you have an account on the trac server?
08:04 < SageWWW> I think you should modify your comment on the trac server to 
                 specifically state which of Leif's files corresponds to which 
                 of yours
08:04 < SageWWW> kevin.stueve
08:04 < mvngu> Let me modify the patches I have attached. Hang on...

I have deleted the previous 3 patches. In their place I attached 4 patches which correspond to the diff's contained in the attachments kevin.stueve. Starting from Silva.c.orig, apply the 4 patches in this order:

  1. 01_Silva.patch
  2. 02_Silva.patch
  3. 03_Silva.patch
  4. 04_Silva.patch

After applying the above four patches on top of Silva.c.orig, you get the file Silva.c. I hope my attachments would resolve some of the confusion.

comment:5 Changed 3 years ago by mvngu

08:57 < mvngu> So starting from Silva.c.orig, I applied Silva.c.diff0 which is 
               the same as 01_Silva.patch.
08:59 < mvngu> Call the file with 01_Silva.patch applied "Silva.c".
08:59 < mvngu> I then do a diff of Silva.c against the version of Silva.c in 
               Leif_attachment1.zip.
09:00 < mvngu> That produce the patch file 02_Silva.patch.

comment:6 Changed 3 years ago by kevin.stueve

  • Cc victor added

comment:7 Changed 3 years ago by kevin.stueve

One of the aspects of TOS's prime_sieve.c is that it has an initialization phase for an interval. I'm not sure if the code already does this, but we should make sure to think about not repeating initialization more than needed. Leif, I think you understand TOS's prime_sieve.c code better than me, what do you think?
Kevin Stueve

comment:8 Changed 3 years ago by leif

  • Cc leif added

comment:9 Changed 3 years ago by leif

WARNING: Do not use the current version of "Silva.c" to count primes in intervals beyond approx. 18302910352e9 (assuming a sieve segment size of 2MB, i.e. _sieve_bits_log2_=24; with lower segment sizes, overflows occur later).

Unfortunately, these early overflows do not lead to run-time errors (e.g. segmentation faults) but unsuspiciously wrong results. (run-time) errors do occur much later due to other overflow conditions.?

Note also that you currently must not use 1 as the interval's upper bound or (little worse) 2 as its lower bound, because prime counts will be wrong by one (1 is treated as prime while 2 is not).

Nevertheless, Happy New Decade...

-Leif

comment:10 Changed 3 years ago by leif

I think I've now fixed all overflow conditions (I did find yet another one) such that it works up to 264-1 under all circumstances.

Going to clean up the code and then perhaps release a final stand-alone version...

There's currently some overflow checking overhead which could be reduced for "safe" intervals but I'll most probably not remove it before writing a library version (which will be optimized for other special cases, too).

Anyway, the improved version is faster than the original... :)

comment:11 follow-up: ↓ 16 Changed 3 years ago by leif

An initial portable version of LMOnew/findn runs on x86_64 in 64-bit native mode.

Now going to extend its domain further up to 264-1...

[It could be helpful to have confirmed values of pi(x) between 1844e16 and 264. Anyone?]

-Leif

comment:12 follow-up: ↓ 14 Changed 3 years ago by kevin.stueve

Thomas R. Nicely's table of pi(x) at  http://sage.math.washington.edu/home/kstueve/T_R_NICELY/ has a granularity of 1e10 up to 2e16. So you really only need values between 2e16 and 264.
Kevin Stueve

comment:13 Changed 3 years ago by kevin.stueve

Correction: 1e10 or better

comment:14 in reply to: ↑ 12 Changed 3 years ago by leif

Replying to kevin.stueve:

So you really only need values between 2e16 and 264.

Only? There was no decimal point missing in 1844... ;-)

comment:15 Changed 3 years ago by kevin.stueve

Sorry. Right now you have the values:

x pi(x)

18440 00000 00000 00000 : 425 50425 77541 37607
18446 74407 37095 51616 : 425 65628 40352 17743
18450 00000 00000 00000 : 425 72967 93423 72554

from  http://www.primefan.ru/stuff/primes/table.html and  http://www.ieeta.pt/~tos/bib/5.4.pdf.

You could sieve part of this interval for more values. You could contact one of the authors of one of the variants of combinatorial method asking for other values (Such as Gourdon, TOS, etc).

Another possibility is using a O(x1/2 log(x)) algorithm that computes the parity (even or odd) of the prime counting function, for a limited amount of verification of values of the prime counting function. This algorithm is described by Terence Tao and others at the polymath project at  http://michaelnielsen.org/polymath1/index.php?title=Prime_counting_function and by Henri Lifchitz at  http://web.archive.org/web/20070325064748/http://ourworld.compuserve.com/homepages/hlifchitz/Henri/us/ParPius.htm

Gourdon, an author of a variant of the combinatorial prime_pi method links to this page at  http://numbers.computation.free.fr/Constants/Primes/countingPrimes.html, however Gourdon's link to Henri Lifchitz's page is dead and you have to use the wayback machine. At Lifchitz's page is his paper on calculating the parity of prime_pi and a program that works up to (unfortunately only) 1e19, and seems to be a little slow.

Gourdon has a pix program released at  http://numbers.computation.free.fr/Constants/Primes/Pix/contributepixtable.html that he used for his distributed pix project. I'm not sure what range is allowed, how long computation takes, or if the code is open source. But it might work for you. There is also a table of 7702/9001 values of pi(x) computed in the range 1000e14 -> 10000e14 available from Gourdon's pix project at  http://numbers.computation.free.fr/Constants/Primes/Pix/resultstable.php. Not the range you were asking for, but still worth noting. Take this data and program with a grain of salt- although Gourdon is surely a better programmer and mathematician than myself, he had to halt his pix program when he found his code was off by one in the global checks for prime_pi(1023)

TOS will probably release more values of prime_pi up to 4e18 after his Golbach conjecture verification project is complete, but since these aren't in the range you need, I would recommend contacting Gourdon or TOS directly asking for values for verification.

Kevin Stueve

comment:16 in reply to: ↑ 11 Changed 3 years ago by leif

Replying to leif:

Now going to extend its domain further up to 264-1...

	pi(10000000000000000000)=234057667276344607	# ok
	pi(10000100000000000000)=234059953037027338     # ok (sieved)
	pi(10001000000000000000)=234080524854508201
	pi(10002000000000000000)=234103382381000184
	pi(10003000000000000000)=234126239852389526
	pi(10004000000000000000)=234149097272730775
	pi(10005000000000000000)=234171954644081388
	pi(10006000000000000000)=234194811963658747
	pi(10007000000000000000)=234217669226973010
	pi(10008000000000000000)=234240526435037954
	pi(10009000000000000000)=234263383594121363
	pi(10010000000000000000)=234286240703110411	# ok
	pi(18440000000000000000)=425504257754137607	# ok
	pi(18441000000000000000)=425526800039801658
	pi(18442000000000000000)=425549342293840481
	pi(18443000000000000000)=425571884521908383
	pi(18444000000000000000)=425594426720237587
	pi(18445000000000000000)=425616968892901351
	pi(18446000000000000000)=425639511036514638
	pi(18446100000000000000)=425641765249620069
	pi(18446200000000000000)=425644019461402605
	pi(18446300000000000000)=425646273676018091
	pi(18446400000000000000)=425648527886817884
	pi(18446500000000000000)=425650782098479434
	pi(18446600000000000000)=425653036308859193
	pi(18446700000000000000)=425655290520421050
	pi(18446744073709551615)=425656284035217743	# ok

Happy validating... ;-)

"You just need a computer with a pentium or equivalent chip, turning on windows 95, 98 or NT. ..." [Xavier Gourdon about his "fastpix11"]

(Still cleaning up the code.)

-Leif

comment:17 Changed 3 years ago by kevin.stueve

Important update: "WARNING! The data for x = 6.5402·1015 and for x = 6.6808·1015 was incorrect before February 21, 2010."  http://www.primefan.ru/stuff/primes/table.html

We need to check our data.

comment:18 Changed 17 months ago by rohana

  • Summary changed from primes.p0.spkg with "prime_sieve.c" functionality to primesieve spkg
  • Description modified (diff)
  • Authors changed from rohana, GeorgSWeber to rohana

Considering that Kevin decided to go with primesieve over Silva's "prime_sieve.c", I think that it is appropriate to use this ticket to discuss the primesieve spkg (since the functionality is simply a superset of "prime_sieve.c").

I've created an initial spkg, although for the moment implicitly assumes that gcc >= version 4.2.

comment:19 Changed 16 months ago by rohana

  • Description modified (diff)

I've worked on this a little, and have posted an updated spkg. Unfortunately there are some major issues with primesieve on big endian systems. I have added a patch that stops the vast majority of the segfaults, however, there is still the occasional segfault. Even without segfaulting, the library gives very incorrect values on big endian systems.

I've also posted a few patches, one that provides a cython wrapper, and a couple that sets up the package as either standard or optional. All of these were based off of 5.0.prealpha1, so they may have some fuzz factors to them (especially the later two).

Note: See TracTickets for help on using tickets.