
Understanding code about inverse transform sampling in R


I am trying to understand the following code for inverse transform sampling (discrete example):

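    # Draw one state (an index into p.vec) with probability p.vec[state].
    discrete.inv.transform.sample <- function(p.vec) {
      U <- runif(1)  # one Uniform(0, 1) draw
      # U at or below the first probability maps to state 1
      if (U <= p.vec[1]) {
        return(1)
      }
      # otherwise return the state whose cumulative-probability interval contains U
      for (state in 2:length(p.vec)) {
        if (sum(p.vec[1:(state - 1)]) < U && U <= sum(p.vec[1:state])) {
          return(state)
        }
      }
    }

    num.samples  <- 1000
    p.vec        <- c(0.1, 0.4, 0.2, 0.3)
    names(p.vec) <- 1:4
    samples      <- numeric(num.samples)
    # fill samples one draw at a time
    for (i in seq_len(num.samples)) {
      samples[i] <- discrete.inv.transform.sample(p.vec)
    }
    barplot(p.vec, main = 'True Probability Mass Function')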

My first question is about the function discrete.inv.transform.sample(p.vec): in the first part, return(1), where does this value 1 go when it is returned?

And in the second part, return(state), where is this state allocated?

  • What is the line names(p.vec) <- 1:4 for?

  • What does seq_len mean?

  • Why is samples[i] not used anymore in the code?

I think there should be a standalone line that uses samples.

Could someone please explain?

Thank you in advance


Solution

  • It looks like you need to do some basic research on R and programming in general. Here are short answers to your simple questions, but please read on afterward for some broader advice.

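    • Where does the value 1 go when it is returned? To wherever the function's result is assigned. Here that is samples[i], for whichever i that branch is reached.
    • Where is this state allocated? The loop variable state is created by the line for(state in 2:length(p.vec)). Like the 1 above, the value passed to return(state) ends up in samples[i].
    • What is the line names(p.vec) <- 1:4 for? Good question. names(x) <- value just assigns names to the elements of an object, and I'm not sure why in your context it's useful to have names that are equal to the vector indices, though I could imagine it to be so in some contexts. One visible effect here is that barplot() uses the names as bar labels.
    • What does seq_len mean? seq_len(x) creates an integer vector with all the numbers from 1 to x inclusive (and an empty vector when x is 0). See help("seq_len").
    • Why is samples[i] not used anymore in the code? Because indexing by i is only useful inside the for loop, which fills samples one element at a time. The code as posted does nothing with the completed samples vector afterward; barplot() plots only p.vec.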
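The short answers above can be checked directly at the R console. The following is a minimal sketch: it repeats the sampler from the question so that it runs on its own, and the name one.draw and the seed value are illustrative assumptions, not part of the original code.

```r
# Sampler copied from the question so this block is self-contained.
discrete.inv.transform.sample <- function(p.vec) {
  U <- runif(1)
  if (U <= p.vec[1]) {
    return(1)
  }
  for (state in 2:length(p.vec)) {
    if (sum(p.vec[1:(state - 1)]) < U && U <= sum(p.vec[1:state])) {
      return(state)
    }
  }
}

p.vec <- c(0.1, 0.4, 0.2, 0.3)

# seq_len(n) counts 1, 2, ..., n and is empty for n = 0,
# whereas 1:0 counts down and yields two elements.
print(seq_len(4))
print(length(seq_len(0)))
print(1:0)

# names(x) <- value attaches labels to the elements; they are stored as character.
names(p.vec) <- 1:4
print(names(p.vec))

# return() hands its value back to the caller, so the value lands in
# whatever the call is assigned to (samples[i] in the question).
set.seed(1)  # arbitrary seed (assumption), only to make the run repeatable
one.draw <- discrete.inv.transform.sample(p.vec)
print(one.draw)
```

Whichever return() statement is reached, the caller sees only the returned number; the variables U and state are local to the function and disappear once it finishes.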

    All this points to a bigger problem, though: you don't yet understand the basics of R. We all started out there, but it means you need to read some introductory material and work through some basic tutorials. RStudio provides some learning resources.
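As a follow-up to the question about samples: one plausible use for the filled vector is to compare the observed frequencies with p.vec. The sketch below is an assumption about what such a step could look like, not part of the original code. It also swaps the question's loop for a vectorised stand-in built on cumsum() and findInterval(), which applies the same interval rule to all draws at once; the name empirical and the seed are illustrative.

```r
set.seed(42)  # arbitrary seed (assumption), only to make the run repeatable

p.vec        <- c(0.1, 0.4, 0.2, 0.3)
names(p.vec) <- 1:4
num.samples  <- 1000

# Vectorised stand-in for the loop in the question: cumsum(p.vec) gives the
# interval boundaries that the loop recomputes with sum(), and findInterval()
# with left.open = TRUE locates the interval (lower, upper] containing each U.
U       <- runif(num.samples)
samples <- findInterval(U, cumsum(p.vec), left.open = TRUE) + 1

# Relative frequency of each state; factor() keeps states that never occur.
empirical <- as.numeric(table(factor(samples, levels = 1:4))) / num.samples

# Side-by-side comparison of the true and the observed probabilities.
print(round(rbind(true = p.vec, empirical = empirical), 3))

barplot(rbind(p.vec, empirical), beside = TRUE,
        legend.text = c("true", "empirical"),
        main = "True vs. empirical probability mass function")
```

With 1000 draws the empirical bars should sit close to the true ones, which is a quick sanity check that the sampler reproduces p.vec.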