Suppose I want to allocate an array of integers to store all the prime numbers less than some N
. I would then need an estimate for the array size, E(N)
. There is mathematical function that gives the exact number of primes below N, it's the Prime-counting function - pi(n)
. However, it looks impossible to define the function in terms of elementary functions.
There exist some approximations to the function, but all of them are asymptotic approximations, so they can be either above or below the true number of primes and cannot in general be used as the estimate E(N)
.
I've tried to use tabulated values of pi(n)
for certain n
like power-of-two and interpolate between them. However I noticed that the function pi(n)
is convex, so the interpolation between sparse table points may accidentally yield values of E(n)
below true pi(n)
that may result in buffer overflow.
I then decided to exploit the monotonic nature of pi(n)
and use the table values of pi(2^(n+1))
as an far upper estimate for E(2^n)
an interpolate between them this time.
I still feel not completely sure that for some 2^n < X < 2^(n+1)
an interpolation between pi(2^(n+1))
and pi(2^(n+2))
would be the safe upper estimate. Is it correct? How do I prove it?
Use an upper bound. There are a number to choose from, each more complicated but tighter. I call this prime_count_upper(n)
since you want a value guaranteed to be greater than or equal to the number of primes under n
. See Chebyshev, Rosser and Schoenfeld, Dusart 1999, Dusart 2010, Axler 2014, and Büthe 2015. R&S is simple and not terrible: π(x) <= x/(log(x)-3/2) for x >= 67
but Dusart gives better ones for larger values. Either way, no tables or original research needed.