How to generate the 1000th prime in python?
Are you sure you are checking for primes correctly? A typical solution is to have a separate "isPrime" function you know that works.
def isPrime(num):
i = 0
for factor in xrange(2, num):
if num%factor == 0:
return False
return True
(There are ways to make the above function more effective, such as only checking odds, and only numbers below the square root, etc.)
Then, to find the n'th prime, count all the primes until you have found it:
def nthPrime(n):
found = 0
guess = 1
while found < n:
guess = guess + 1
if isPrime(guess):
found = found + 1
return guess
A couple of problems are obvious. First, since you're starting at 11, you've already skipped over the first 5 primes, so count should start at 5.
More importantly, your prime detection algorithm just isn't going to work. You have to keep track of all the primes smaller than i for this kind of simplistic "sieve of Eratosthanes"-like prime detection. For example, your algorithm will think 11 * 13 = 143 is prime, but obviously it isn't.
PGsimple1 here is a correct implementatioin of what the prime detection you're trying to do here, but the other algorithms there are much faster.
Let's see.
count = 1
i = 3
while count != 1000:
if i%2 != 0:
for k in range(2,i):
if i%k == 0: # 'i' is _not_ a prime!
print(i) # ??
count += 1 # ??
break
i += 1 # should be one space to the left,
# for proper indentation
If i%k==0
, then i
is not a prime. If we detect that it's not a prime, we should (a) not print it out, (b) not increment the counter of found primes and (c) we indeed should break out from the for
loop - no need to test any more numbers.
Also, instead of testing i%2
, we can just increment by 2
, starting from 3
- they will all be odd then, by construction.
So, we now have
count = 1
i = 3
while count != 1000:
for k in range(2,i):
if i%k == 0:
break
else:
print(i)
count += 1
i += 2
The else
after for
gets executed if the for
loop was not broken out of prematurely.
It works, but it works too hard, so is much slower than necessary. It tests a number by all the numbers below it, but it's enough to test it just up to its square root. Why? Because if a number n == p*q
, with p
and q
between 1
and n
, then at least one of p
or q
will be not greater than the square root of n
: if they both were greater, their product would be greater than n
.
So the improved code is:
from math import sqrt
count = 1
i = 1
while count < 1000:
i += 2
for k in range(2, 1+int(sqrt(i+1))):
if i%k == 0:
break
else:
# print(i) ,
count += 1
# if count%20==0: print ""
print i
Just try running it with range(2,i)
(as in the previous code), and see how slow it gets. For 1000 primes it takes 1.16 secs, and for 2000 – 4.89 secs (3000 – 12.15 ses). But with the sqrt
it takes just 0.21 secs to produce 3000 primes, 0.84 secs for 10,000 and 2.44 secs for 20,000 (orders of growth of ~ n2.1...2.2
vs. ~ n1.5
).
The algorithm used above is known as trial division. There's one more improvement needed to make it an optimal trial division, i.e. testing by primes only. An example can be seen here, which runs about 3x faster, and at better empirical complexity of ~ n1.3
.
Then there's the sieve of Eratosthenes, which is quite faster (for 20,000 primes, 12x faster than "improved code" above, and much faster yet after that: its empirical order of growth is ~ n1.1
, for producing n
primes, measured up to n = 1,000,000 primes):
from math import log
count = 1 ; i = 1 ; D = {}
n = 100000 # 20k:0.20s
m = int(n*(log(n)+log(log(n)))) # 100k:1.15s 200k:2.36s-7.8M
while count < n: # 400k:5.26s-8.7M
i += 2 # 800k:11.21-7.8M
if i not in D: # 1mln:13.20-7.8M (n^1.1)
count += 1
k = i*i
if k > m: break # break, when all is already marked
while k <= m:
D[k] = 0
k += 2*i
while count < n:
i += 2
if i not in D: count += 1
if i >= m: print "invalid: top value estimate too small",i,m ; error
print i,m
The truly unbounded, incremental, "sliding" sieve of Eratosthenes is about 1.5x faster yet, in this range as tested here.