Search code examples
python-3.xmontecarlonon-uniform-distribution

How to adjust for non-uniform sampling (log-scale/polar) in Monte Carlo integration?


I am trying to perform Monte Carlo integration of a function, but sample non-uniformly. I need this method to work for both logarithmic scale and for integration in polar coordinates since I will then combine the two and use polar with log-scale sampling in radius.

I wrote a testing script that tries to

  • integrate a 2D gaussian in polar coordinates (which should equal to pi)

  • integrate y(x) = x in log-scale from 10**-2 to 10**7 (which should equal to ~0.5*10 ** 14)

For testing purposes, I complement the calculation with a uniform cartesian coordinate-based Monte Carlo that works. It is the non-uniformity of the sample that shifts my results.

import numpy as np


def function_to_integrate(x, y):
    return np.exp(-x**2 - y**2)


def polar_MC(polar):
    size = 100000
    integral = 0.
    integration_radius = 4.
    if polar:
        for _ in range(size):
            r = np.random.random()*integration_radius
            phi = np.random.random()*2.*np.pi
            x = r*np.cos(phi)
            y = r*np.sin(phi)
            jacobian_MC_polar = 1.
            integral += function_to_integrate(x, y) * jacobian_MC_polar
        integral = integral * np.pi * integration_radius**2 / size
    else:
        for _ in range(size):
            length = 2. * integration_radius
            x = np.random.random()*length - length/2.
            y = np.random.random()*length - length/2.
            integral += function_to_integrate(x, y)
        integral = integral * length**2 / size
    print('POLAR: True integral should be pi ', '; MC:', integral, polar)


def log_MC(log):
    size = 10000
    integral = 0.
    if log:
        for _ in range(size):
            x = np.random.uniform(-2, 7.)
            jacobian_MC_log = 1.
            integral += 10**x * jacobian_MC_log
    else:
        for _ in range(size):
            x = np.random.uniform(10**-2, 10**7)
            integral += x
    integral = integral*10**7 / size
    print('LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC:', integral/10**13, '* 10**13', log)


polar_MC(polar=True)
polar_MC(polar=False)

log_MC(log=True)
log_MC(log=False)

I am unable to get the correct result from the polar and log-scale Monte Carlo, how should I set the jacobian_MC in order for this to work? Or am I doing something else wrong?

I have tried using the standard Jacobians (r for polar and r*np.log(10) for logarithmic) and that did not help.

With the Jacobians set to 1, I am getting

POLAR: True integral should be pi  ; MC: 11.041032315593327 True
POLAR: True integral should be pi  ; MC: 3.108344559871783 False
LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC: 0.48366198481209793 * 10**13 True
LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC: 5.003437412553992 * 10**13 False

Increasing sampling does not help, the results is close to being converged.

What probability distribution should I divide the sampled points with?


Solution

  • You got both Jacobian and normalization parts wrong for polar integration

    Here is the correct code, Python 3.10, Win x64

    import numpy as np
    
    rng = np.random.default_rng()
    
    def integrand(x: np.float64, y: np.float64) -> np.float64:
        r = np.sqrt(x*x + y*y)
        jacobian = r
        return jacobian * np.exp(-r*r)
    
    def sample_xy(R: np.float64):
        r = R * rng.random()
        phi = 2.0*np.pi*rng.random()
    
        return r*np.cos(phi), r*np.sin(phi)
    
    N = 1000000
    R = 100.0
    
    s: np.float64 = 0.0
    
    for k in range(0, N):
        x,y = sample_xy(R)
    
        s += integrand(x, y)
    
    print(s/N * 2.0*np.pi*R)
    

    it consistently prints values around 3.14:

    3.155748795359562
    3.14192687470938
    3.161890183195259
    

    UPDATE

    This is not perimeter or area thing. This is how you make Probability Density Function (PDF).

    So you have f(r)

    f(r) = r e-r2

    and integrals

    I = S02 pi d phi S0R dr f(r)

    You want to use Monte Carlo. It means sampling phi and sampling r. In order to sample something (anything!) you HAVE TO HAVE PDF (positive, properly normalized to 1 etc).

    So lets start with integral over phi

    Iphi = S02 pi d phi.

    I'll multiply and divide it by 2 pi.

    Iphi = 2 pi S02 pi d phi/(2 pi).

    which makes proper PDF under the integration sign

    PDF(phi) = d phi/(2 pi)

    and proper sampling

    phi = 2 pi random()

    But what's left is 2 pi upfront which you have to carry out to the front of the integral. This is how it works - you make normalized to 1 PDF under the integral, and whatever constant is there you account for it later, after whole sampling routine. It is not area nor perimeter, it is PDF construction and normalization

    Same for radial integral

    IR = S0R dr = R S dr/R.

    PDF(r) = dr/R, so you could sample r = R random(), but you have to carry R upfront to final calculation step.