Search code examples
pythonnumpytransformsamplinginverse

How to correct error when sampling Cauchy distribution


I'm trying to sample values from a Cauchy distribution, using what is called (I think) inverse transform sampling. My program is as follows:

from numpy import*
from matplotlib.pyplot import*
from scipy.interpolate import interp1d

def f(x):
    return 1 / (pi*(1+x**2)) # Cauchy

x = linspace(-6,+6,100)
dx = x[1]-x[0]
cdf = cumsum(f(x))*dx

plt.plot(x,cdf)
plt.xlabel('X')
plt.ylabel('CDF')
plt.show()

from_cdf_to_x = interp1d(cdf, x)
cdf_new = np.random.uniform(0,1,100)
x_sampled = from_cdf_to_x(cdf_new) 

The plot looks fine, but I get the following error in the 2nd part which I don't understand: A value in x_new is above the interpolation range. What does it mean, in my case, and what would be the proper way to correct it? No matter how much I search, I don't see what is not correct in the program I made...


Solution

  • The range of your array cdf_new is from 0 to 1.0, but the Cauchy cumulative distribution asymptotically approaches 1 without reaching it, thus the interpolation doesn't cover the full range of the uniform distribution. You need to reduce the top range of cdf_new and/or increase the range of x when you generate the Cauchy cumulative distribution

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.interpolate import interp1d
    
    def f(x):
        return 1 / (np.pi*(1+x**2)) # Cauchy
    
    x = np.linspace(-100,+100,1000)
    dx = x[1]-x[0]
    cdf = np.cumsum(f(x))*dx
    
    plt.plot(x,cdf)
    plt.xlabel('X')
    plt.ylabel('CDF')
    plt.show()
    
    from_cdf_to_x = interp1d(cdf, x)
    cdf_new = np.random.uniform(0.05,0.95,10000)
    x_sampled = from_cdf_to_x(cdf_new)
    plt.hist(x_sampled, bins=20)
    plt.show()
    

    Here is the normal distribution that you were expecting:

    enter image description here