I am trying to do a simulation in Stata with a random sample of 10000 for (i) the variable X with pdf
f(x) = 2*x*exp(-x^2), X>0
, and (ii) Y=X^2
I worked out the cdf of F to be 1-exp(-x^2)
, so the inverse of F is sqrt(-ln(1-u).
I used the following code in Stata:
(1)
clear
set obs 10000
set seed 527665
gen u= runiform()
gen x= sqrt(-ln(1-u))
histogram x
summ x, detail
(mean 0.88, sd 0.46)
(2)
clear
set obs 10000
set seed 527665
gen u= runiform()
gen x= (sqrt(-ln(1-u)))^2
summ x, detail
(mean 0.99, sd 0.99)
(3)
clear
set obs 10000
set seed 527665
gen u= rexponential(1)
gen x= 2*u*exp(-(u^2))
summ x, detail
(mean 0.49, sd 0.28)
(4)
clear
set obs 10000
set seed 527665
gen v= runiform()
gen u=1/v
gen x= 2*u*exp(-(u^2))
histogram x
summ x, detail
(mean 0.22, sd 0.26)
My queries are: (i) (1) and (2) are based on the probability integral transformation, which I have come across but do not understand. If (1) and (2) are valid approaches, what is the intuition behind this, (ii) the output for (3) does not seem correct; I am not sure if I am applying the rexponential function correctly, and what is the scale parameter (there seems to be no explanation on this in stata help) (iii) the output for (4) also does not seem correct, and I was wondering why this approach is flawed.
Thanks
Well, what you worked out as distribution looks ok to me
If
PDF(x) = 2 x exp(-x2), x in [0...Infinity) then
CDF(x) = 1 - exp(-x2)
which means it is basically square root of exponentially distributed RV. Exponential distribution sampling is done
using -ln(1-u)
or -ln(u)
I don't have Stata, just looking at the code
(1) looks ok, you sample exponential and get square root of it
(2) looks like you're sampling square root of exponential and immediately square it back. You will get back exponential, I believe
(3) I don't know what it supposed to mean, exponent of squared exponentials? Should be
clear
set obs 10000
set seed 527665
gen u = rexponential(1)
gen x = sqrt(u)
summ x, detail
rexponential() is the same as -ln(1-runiform())
(4) Does not make sense. Exponent from squared uniform?
I quickly wrote simple Python code for illustration
import numpy as np
import matplotlib.pyplot as plt
x = np.random.random(100000) // uniform in [0...1)
xx = np.sqrt(-np.log(1.0-x)) // -log(1-x) is exponential, then square root
q = np.linspace(0.0, 3.0, 101)
z = 2.0*q*np.exp(-q*q)
n, bins, patches = plt.hist(xx, 50, density=True, facecolor='g', alpha=0.75)
plt.plot(q, z, 'r-')
plt.show()
with picture