
How to get a sample of 10000 from cdf of a random variable in R?


I have the cdf:

F_X(x) = 0                  for x <= 10
       = (x - 10)^3 / 1000  for 10 < x < 20
       = 1                  for x >= 20

I need to generate a sample of 10,000 from X. How can I do so in R?

I'm extremely new to R, so I would appreciate any help.


Solution

  • On the interval 10 < x < 20, your cdf can be written in R as:

    cdf <- function(x) (x - 10)^3 / 1000
    

    Which means we can plot it for the region [10, 20] like this:

    x <- seq(10, 20, 0.1)
    plot(x, cdf(x), type = "l")
    

    What we want to do is generate a sample from the uniform distribution between 0 and 1, then treat these numbers as values on the y axis. We then "read off" the corresponding points on the x axis to get a sample from X (this is known as inverse transform sampling). To do this we rearrange the equation to find its inverse: setting u = (x - 10)^3 / 1000 and solving for x gives x = 10 + (1000 * u)^(1/3), which in R is:

    inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)
    

    Which means our sample can be generated like this:

    X <- inverse_cdf(runif(10000))
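
    Before plotting anything, a quick sanity check can be useful. The sketch below is only an illustration: the seed value and the names `round_trip_ok` and `in_range` are my own choices, not part of the question. It checks that the inverse really undoes the cdf and that every sampled value lands between 10 and 20:

    ```r
    # Arbitrary seed so the sample is reproducible (an assumption, not required).
    set.seed(42)

    cdf <- function(x) (x - 10)^3 / 1000
    inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)

    # Round trip: applying the cdf to the inverse should give back the input.
    u <- seq(0, 1, by = 0.05)
    round_trip_ok <- isTRUE(all.equal(cdf(inverse_cdf(u)), u))

    # Draw the sample and confirm every value lies in [10, 20].
    X <- inverse_cdf(runif(10000))
    in_range <- all(X >= 10 & X <= 20)

    print(round_trip_ok)
    print(in_range)
    ```

    Both checks should come back TRUE if the inverse was derived correctly.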
    

    Now we can plot the empirical cdf of this sample with the theoretical cdf and ensure they match:

    plot(ecdf(X))
    lines(x, cdf(x), col = "red")
    

    The empirical cdf of X should closely follow the theoretical curve, indicating that X is indeed sampled from the correct distribution.
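
    If you would rather have a number than eyeball a plot, one option is to measure the largest vertical gap between the two curves, or to run a Kolmogorov-Smirnov test with `ks.test`. This is a rough sketch under my own assumptions: `full_cdf` is a hypothetical helper that extends the cdf to the whole real line, and the seed and grid spacing are arbitrary.

    ```r
    set.seed(1)  # arbitrary seed, for reproducibility only

    inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)
    X <- inverse_cdf(runif(10000))

    # Piecewise cdf: 0 at or below 10, 1 at or above 20, cubic in between.
    full_cdf <- function(x) {
      ifelse(x <= 10, 0, ifelse(x >= 20, 1, (x - 10)^3 / 1000))
    }

    # Largest gap between empirical and theoretical cdf on a fine grid.
    grid <- seq(10, 20, by = 0.01)
    max_gap <- max(abs(ecdf(X)(grid) - full_cdf(grid)))

    # Kolmogorov-Smirnov test of the sample against the theoretical cdf.
    ks <- ks.test(X, full_cdf)

    print(max_gap)
    print(ks$p.value)
    ```

    A small `max_gap` and a p-value that is not tiny are both consistent with the sample coming from the intended distribution, although neither proves it.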

    As a further demonstration, note that the pdf of X is the first derivative of the cdf. It is therefore 0 everywhere except between 10 and 20, where it is:

    pdf <- function(x) 3*(x - 10)^2 / 1000
    

    So if we plot this over a density histogram of X we should get a close match:

    hist(X, freq = FALSE)
    x <- seq(10, 20, 0.1)
    lines(x, pdf(x), col = "red")
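
    The pdf also allows a numerical cross-check. As a sketch (the seed and the variable names `total_mass` and `theoretical_mean` are my own), you can use `integrate` to confirm that the density integrates to 1 and to compute the theoretical mean, which works out to 17.5, then compare that with the sample mean:

    ```r
    set.seed(1)  # arbitrary seed, for reproducibility only

    # Note: as in the answer above, this masks grDevices::pdf() in the session.
    pdf <- function(x) 3 * (x - 10)^2 / 1000
    inverse_cdf <- function(u) 10 + (1000 * u)^(1/3)
    X <- inverse_cdf(runif(10000))

    # The density should integrate to 1 over [10, 20].
    total_mass <- integrate(pdf, lower = 10, upper = 20)$value

    # Theoretical mean: integral of x * pdf(x) over [10, 20].
    theoretical_mean <- integrate(function(x) x * pdf(x),
                                  lower = 10, upper = 20)$value

    print(total_mass)
    print(theoretical_mean)
    print(mean(X))
    ```

    With 10,000 draws the sample mean should sit very close to the theoretical value.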
    
