Search code examples
pythonperformancepi

How to increase the performance for estimating `Pi`in Python


I have written the following code in Python, in order to estimate the value of Pi. It is called Monte Carlo method. Obviously by increasing the number of samples the code becomes slower and I assume that the slowest part of the code is in the sampling part. How can I make it faster?

from __future__ import division
import numpy as np

a = 1
n = 1000000

s1 = np.random.uniform(0,a,n)
s2 = np.random.uniform(0,a,n)

ii=0
jj=0

for item in range(n):
    if ((s1[item])**2 + (s2[item])**2) < 1:
        ii = ii + 1

print float(ii*4/(n))

Do you suggest other (presumably faster) codes?


Solution

  • The bottleneck here is actually your for loop. Python for loops are relatively slow, so if you need to iterate over a million items, you can gain a lot of speed by avoiding them altogether. In this case, it's quite easy. Instead of this:

    for item in range(n):
        if ((s1[item])**2 + (s2[item])**2) < 1:
            ii = ii + 1
    

    do this:

    ii = ((s1 ** 2 + s2 ** 2) < 1).sum()
    

    This works because numpy has built-in support for optimized array operations. The looping occurs in c instead of python, so it's much faster. I did a quick test so you can see the difference:

    >>> def estimate_pi_loop(x, y):
    ...     total = 0
    ...     for i in xrange(len(x)):
    ...         if x[i] ** 2 + y[i] ** 2 < 1:
    ...             total += 1
    ...     return total * 4.0 / len(x)
    ... 
    >>> def estimate_pi_numpy(x, y):
    ...     return ((x ** 2 + y ** 2) < 1).sum()
    ... 
    >>> %timeit estimate_pi_loop(x, y)
    1 loops, best of 3: 3.33 s per loop
    >>> %timeit estimate_pi_numpy(x, y)
    100 loops, best of 3: 10.4 ms per loop
    

    Here are a few examples of the kinds of operations that are possible, just so you have a sense of how this works.

    Squaring an array:

    >>> a = numpy.arange(5)
    >>> a ** 2
    array([ 0,  1,  4,  9, 16])
    

    Adding arrays:

    >>> a + a
    array([0, 2, 4, 6, 8])
    

    Comparing arrays:

    >>> a > 2
    array([False, False, False,  True,  True], dtype=bool)
    

    Summing boolean values:

    >>> (a > 2).sum()
    2
    

    As you probably realize, there are faster ways to estimate Pi, but I will admit that I've always admired the simplicity and effectiveness of this method.