I’m using the acceptance-rejection method for beta distribution with g(x) = 1, 0 ≤ x ≤ 1. The function is: f(x) = 100x^3(1-x)^2.
I want to create an algorithm to generate data from this density function.
How can I estimate P(0 ≤ X ≤ 0.8) with k = 1000 repetitions (n=1000)? How can I solve this in R?
I already have:
beta.rejection <- function(f, c, g, n) {
naccepts <- 0
result.sample <- rep(NA, n)
while (naccepts < n) {
y <- runif(1, 0, 0.8)
u <- runif(1, 0, 0.8)
if ( u <= f(y) / (c*g(y)) ) {
naccepts <- naccepts + 1
result.sample[n.accepts] = y
}
}
result.sample
}
f <- function(x) 100*(x^3)*(1-x)^2
g <- function(x) x/x
c <- 2
result <- beta.rejection(f, c, g, 10000)
for (i in 1:1000){ # k = 1000 reps
result[i] <- result[i] / n
}
print(mean(result))
You are close, but a few problems:
1) typo with naccepts
vs. n.accepts
2) If you aren't going to hard-wire in g
then you shouldn't hard-wire in runif()
as the function which generates random variables which are distributed according to g
. The function rejection
(why hard-wire in beta
?) should also be passed a function which is capable of generating the appropriate random variables.
3) u
should be drawn from [0,1]
not [0,0.8]
. The 0.8
should have no part in the generation of the values, just their interpretation.
4) c
need to be an upper bound for f(y)/g(y)
. 2 is too small. Why not take the derivative of f
to find it's max? 3.5 works. Also -- c
isn't a good name for a variable in R (because of the function c()
). Why not call it M
?
Making these changes yields:
rejection <- function(f, M, g, rg,n) {
naccepts <- 0
result.sample <- rep(NA, n)
while (naccepts < n) {
y <- rg(1)
u <- runif(1)
if ( u <= f(y) / (M*g(y)) ) {
naccepts <- naccepts + 1
result.sample[naccepts] = y
}
}
result.sample
}
f <- function(x) 100*(x^3)*(1-x)^2
g <- function(x) 1
rg <- runif
M <- 3.5 #upper bound on f (via basic calculus)
result <- rejection(f, M, g,rg, 10000)
print(length(result[result <= 0.8])/10000)
Typical output: 0.9016
You can also make a density histogram of result
and compare it to the theoretical beta distribution:
> hist(result,freq = FALSE)
> points(seq(0,1,0.01),dbeta(seq(0,1,0.01),4,3),type = "l")
The match is quite good: