Tags: r, dataframe, list, normal-distribution, sapply

How to add a new column of normally distributed data based on the existing parameter columns in a data frame in R?


I want to generate a new column called 'val' in R, where every cell holds a list of normally distributed values generated from the parameters in the existing columns. However, my code returns the following error: Error in $<-.data.frame(*tmp*, val, value = c(0.771570544918682, 0.203569004424465, : replacement has 6 rows, data has 3. I suspect I have used the wrong function.

Any hint or tip would be appreciated. Thanks.

df <- data.frame(n1 = c(5, 10, 20), 
n2 = c(10, 10, 20), 
m1 = c(0.01, 1.2, 1.1), 
m2 = c(0.1,0.2,0.5), 
sd1 = c(1,1,1), 
sd2 = c(2,1,2))

df$val <- sapply(df, function(x) 
c(rnorm(df$n1, df$m1, df$sd1), rnorm(df$n2, df$m2, df$sd2)))

Solution

  • One approach is to use purrr::pmap:

    library(dplyr)
    library(purrr)
    
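    # pmap() passes one row's values at a time, so each cell of val holds
    # a vector of n1 + n2 draws (val is a list column).
    df %>% 
      mutate(val = pmap(list(n1, n2, m1, m2, sd1, sd2),
                        function(n1, n2, m1, m2, sd1, sd2) c(rnorm(n1, m1, sd1), rnorm(n2, m2, sd2))))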
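
As for why the original attempt fails: sapply(df, ...) iterates over the six columns of df rather than its three rows. In addition, when rnorm() is given a vector for n, it uses the length of that vector as the number of draws, so each call returns 3 + 3 = 6 values. The result is a 6 x 6 matrix, which cannot be assigned to a 3-row data frame. If you would rather not load extra packages, a base R sketch of the same row-wise idea could use Map(). This is only one possible alternative to the pmap() approach above, and set.seed() is included here purely as an assumption to make the draws reproducible:

```r
# Example data copied from the question.
df <- data.frame(n1 = c(5, 10, 20),
                 n2 = c(10, 10, 20),
                 m1 = c(0.01, 1.2, 1.1),
                 m2 = c(0.1, 0.2, 0.5),
                 sd1 = c(1, 1, 1),
                 sd2 = c(2, 1, 2))

set.seed(1)  # assumption: only here so repeated runs give the same draws

# Map() walks the six columns in parallel, one row at a time, and always
# returns a list, so each row gets its own vector of n1 + n2 draws.
df$val <- Map(function(n1, n2, m1, m2, sd1, sd2) {
  c(rnorm(n1, m1, sd1), rnorm(n2, m2, sd2))
}, df$n1, df$n2, df$m1, df$m2, df$sd1, df$sd2)

# Number of values stored in each cell of the list column.
lengths(df$val)
# [1] 15 20 40
```

Because val is a list column, individual cells are reached with double brackets, for example df$val[[1]] for the 15 values of the first row.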