I have the cdf:
F_X(x) = 0 for x <= 10
F_X(x) = (x - 10)^3 / 1000 for 10 < x < 20
F_X(x) = 1 for x >= 20
I need to generate a sample of 10,000 values from X. How can I do this in R?
I'm extremely new to R, so I would appreciate any help.
On the interval 10 < x < 20, your cdf can be written in R as:
cdf <- function(x) (x - 10)^3 / 1000
Which means we can plot it for the region [10, 20] like this:
x <- seq(10, 20, 0.1)
plot(x, cdf(x), type = "l")
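Note that the one-line cdf above only covers the middle branch, 10 < x < 20. If you also want the flat parts (0 at or below 10, 1 at or above 20), a minimal sketch is to clamp x to [10, 20] first. The name cdf_full is just an illustrative helper, not something from the question:

```r
# Hypothetical helper: the full piecewise cdf.
# Clamping x to [10, 20] gives 0 at or below 10 and 1 at or above 20.
cdf_full <- function(x) (pmin(pmax(x, 10), 20) - 10)^3 / 1000

cdf_full(c(5, 10, 15, 20, 25))   # 0, 0, 0.125, 1, 1
```

For the sampling itself the middle branch is all we need, because every generated value falls between 10 and 20.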
Effectively, we want to generate a sample from the uniform distribution between 0 and 1 and treat those numbers as values on the y axis. We then "read off" the corresponding points on the x axis to get a sample from X. To do this, we rearrange u = (x - 10)^3 / 1000 for x, which gives the inverse x = 10 + (1000 * u)^(1/3):
inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)
Which means our sample can be generated like this:
X <- inverse_cdf(runif(10000))
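If you want the same sample every time you run the script, you can set a seed first. Here is a small self-contained sketch with a few sanity checks; the seed value 42 is arbitrary and the variable u is only introduced so we can do the round-trip check:

```r
cdf <- function(x) (x - 10)^3 / 1000
inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)

set.seed(42)            # arbitrary seed, only so the results are repeatable
u <- runif(10000)       # uniform draws on (0, 1)
X <- inverse_cdf(u)     # push them through the inverse cdf

length(X)               # 10000
range(X)                # should fall within [10, 20]
all.equal(cdf(X), u)    # round trip: cdf(inverse_cdf(u)) should recover u
```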
Now we can plot the empirical cdf of this sample against the theoretical cdf and check that they match:
plot(ecdf(X))
lines(x, cdf(x), col = "red")
This shows us that the empirical cdf of X matches the theoretical cdf, indicating that X is indeed sampled from the correct distribution.
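If you would like a number to go with the visual check, one option (not required, just a sketch) is a one-sample Kolmogorov-Smirnov test using ks.test() from base R's stats package, which accepts a cdf function as its second argument. The seed is arbitrary:

```r
cdf <- function(x) (x - 10)^3 / 1000
inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)

set.seed(1)                       # arbitrary seed for repeatability
X <- inverse_cdf(runif(10000))

# Compare the sample with the theoretical cdf.
# All sampled values lie in [10, 20], so the middle branch is sufficient here.
ks <- ks.test(X, cdf)
ks$statistic                      # largest gap between empirical and theoretical cdf
ks$p.value
```

A small test statistic means the empirical cdf stays close to the theoretical one across the whole range.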
As a further demonstration, note that the pdf of X is the first derivative of the cdf. It is therefore 0 everywhere except between 10 and 20, where it is:
pdf <- function(x) 3*(x - 10)^2 / 1000
So if we plot this over a density histogram of X, we should get a close match:
hist(X, freq = FALSE)
x <- seq(10, 20, 0.1)
lines(x, pdf(x), col = "red")
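As one more optional check, you can confirm numerically that the pdf integrates to 1 and compare the theoretical mean with the sample mean. This is only a sketch using base R's integrate(); the seed is arbitrary:

```r
pdf <- function(x) 3 * (x - 10)^2 / 1000
inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)

set.seed(1)                                   # arbitrary seed for repeatability
X <- inverse_cdf(runif(10000))

# Total probability over [10, 20] should be approximately 1
total <- integrate(pdf, 10, 20)$value

# Theoretical mean: the integral of x * pdf(x) over [10, 20], which works out to 17.5
theo_mean <- integrate(function(x) x * pdf(x), 10, 20)$value

total
theo_mean
mean(X)                                       # should be close to theo_mean
```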