r random sample

Generate random numbers in each row between 1 and a particular value in a column


I have been stuck on this for quite a while and would really like to know how to achieve it. I have a data frame and want to add a column of random numbers, where each row's value lies between 1 and the number in that row's Amount column. How can I do this? This is what I have so far:

dataframe$newColumn <- sample(1:30, nrow(dataframe), replace = TRUE)

but instead of going from 1 to 30, each row should use the number in its Amount column as the maximum.


Solution

  • In base R, you can use vapply() to iterate over dataframe$Amount, calling sample() once for each value:

    dataframe$newColumn <- vapply(dataframe$Amount, sample, integer(1), size = 1)
    

    This is equivalent to calling

    sample(dataframe$Amount[i], size = 1)
    

    for each row i in dataframe. Note that if the first argument to sample() is a single number n (with n >= 1), sample() draws from 1:n.

    sample(5)
    ## [1] 5 2 4 1 3
    

    We pass integer(1) as the FUN.VALUE argument of vapply() to declare that each iteration returns a single integer. You can achieve the same result with map_int() from the purrr package:

    dataframe$newColumn <- purrr::map_int(dataframe$Amount, sample, size = 1)
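
To see the base R approach end to end, here is a minimal sketch on a made-up data frame. Only the Amount and newColumn names come from the question; the id column, the values, and the seed are assumptions for illustration, and Amount is assumed to hold whole numbers of at least 1.

```r
# Hypothetical example data; only the Amount column name comes from the question.
set.seed(42)  # fix the seed so the draws are reproducible
dataframe <- data.frame(id = 1:5, Amount = c(3L, 10L, 1L, 30L, 7L))

# One draw per row: sample(Amount[i], size = 1) picks a value from 1:Amount[i]
dataframe$newColumn <- vapply(dataframe$Amount, sample, integer(1), size = 1)

# Sanity check: every draw lies between 1 and that row's Amount
all(dataframe$newColumn >= 1 & dataframe$newColumn <= dataframe$Amount)
```

The final line should evaluate to TRUE whatever the seed, and a row whose Amount is 1 can only ever receive 1.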