I have a complicated (non-standard) distribution function that I want to sample from, to generate simulated data points using the inverse-CDF technique. For the sake of this example I will consider a Gaussian distribution:
import numpy as np
from scipy import integrate

var = 100

def f(x, a):
    def g(y):
        return (1/np.sqrt(2*np.pi*var))*np.exp(-y**2/(2*var))
    b, err = integrate.quad(g, -np.inf, x)
    return b - a
I want to generate values of a in [0, 1), a = np.linspace(0, 1, 10000, endpoint=False), and use scipy.optimize.fsolve to solve for x for each a.
I have two questions:

1. How do I use fsolve for an array of values a?
2. fsolve takes an initial guess x0; how do I pick a good guess value?
Here's how you do it. I replaced 10000 with 10, as the full run takes a while. My initial guess is just 0, and for each subsequent value of a I reuse the previous solution as the guess, since it should be quite close to the next one. You can further bound this if you want, so the search stays strictly above the previous root.
As a side comment, this kind of sampling isn't really feasible for complicated distributions, as computing the CDF can be rather difficult. There are other sampling techniques that address this issue, such as Gibbs sampling, Metropolis-Hastings, etc.
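For instance, here is a minimal Metropolis-Hastings sketch. Its appeal is that it only needs the density up to a normalizing constant, with no CDF at all; the Gaussian target, the proposal step size of 5, and the burn-in length here are just illustrative choices, not part of the original question.

```python
import numpy as np

var = 100

def pdf(x):
    # Unnormalized target density; the normalizing constant cancels
    # in the Metropolis acceptance ratio, so it can be omitted.
    return np.exp(-x**2 / (2 * var))

rng = np.random.default_rng(0)
x = 0.0
samples = []
for _ in range(20000):
    prop = x + rng.normal(scale=5.0)   # symmetric random-walk proposal
    # Accept with probability min(1, pdf(prop)/pdf(x))
    if rng.random() < pdf(prop) / pdf(x):
        x = prop
    samples.append(x)

samples = np.array(samples[5000:])     # discard burn-in
print(samples.mean(), samples.std())   # should be roughly 0 and 10
```

Returning to the inverse-CDF approach from the question: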
import numpy as np
from scipy import integrate, optimize

var = 100

def f(x, a):
    def g(y):
        return (1/np.sqrt(2*np.pi*var))*np.exp(-y**2/(2*var))
    b, err = integrate.quad(g, -np.inf, x)
    return b - a

# Drop a = 0: the CDF only reaches 0 at -infinity, so there is no finite root.
a = np.linspace(0, 1, 10, endpoint=False)[1:]
x0 = 0
for a_ in a:
    # Nudge the guess slightly off the previous root; see the edit below.
    xi = optimize.fsolve(f, x0 + 0.01, args=(a_,))[0]
    print(xi)
    x0 = xi
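Since the stand-in target here is a Gaussian, the results can be sanity-checked against the exact inverse CDF, scipy.stats.norm.ppf, with the matching standard deviation sqrt(var) = 10:

```python
import numpy as np
from scipy import stats

var = 100

# Exact Gaussian quantiles for the same grid of a values,
# for comparison with the fsolve output above.
a = np.linspace(0, 1, 10, endpoint=False)[1:]
expected = stats.norm.ppf(a, loc=0, scale=np.sqrt(var))
print(expected)
```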
[EDIT] It seems to get stuck near 0; adding a small number to the guess fixes it. I'm not sure why, as I don't know how fsolve works internally.
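One way to sidestep the initial-guess issue entirely: the CDF is monotone, so each root can be bracketed and found with scipy.optimize.brentq, which needs an interval with a sign change instead of a guess. The bracket [-100, 100] below is an assumption that comfortably covers the quantiles for var = 100:

```python
import numpy as np
from scipy import integrate, optimize

var = 100

def f(x, a):
    def g(y):
        return (1/np.sqrt(2*np.pi*var))*np.exp(-y**2/(2*var))
    b, err = integrate.quad(g, -np.inf, x)
    return b - a

a = np.linspace(0, 1, 10, endpoint=False)[1:]
# f(-100, a_) < 0 and f(100, a_) > 0 for every a_ in (0, 1),
# so brentq is guaranteed to converge on each bracket.
xs = [optimize.brentq(f, -100, 100, args=(a_,)) for a_ in a]
print(xs)
```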