Search code examples
rdplyrmutate

How to create new random column variables based on column values in R


I am trying to run a simulation, but would like to find an efficient approach to do this.

Sample data:

month <- 1:12
a <- rep(10, 12)
dat1 <- data.frame(month, a) 
sim_dat <- do.call(rbind, replicate(50, dat1, simplify = FALSE)) %>% 
  mutate(sim_index = rep(1:50, each = nrow(dat1)))

What I want to do is create a new variable for each month and generate a random value for that month. My current approach is to use dplyr, see below:

sim_dat1 <- sim_dat %>% 
  group_by(sim_index) %>% 
  mutate(mnth1 = ifelse(month == 1, a + rnorm(n()), NA),
         mnth2 = ifelse(month == 2, a + rnorm(n()), NA),
         mnth3 = ifelse(month == 3, a + rnorm(n()), NA),
         mnth4 = ifelse(month == 4, a + rnorm(n()), NA),
         mnth5 = ifelse(month == 5, a + rnorm(n()), NA),
         mnth6 = ifelse(month == 6, a + rnorm(n()), NA),
         mnth7 = ifelse(month == 7, a + rnorm(n()), NA),
         mnth8 = ifelse(month == 8, a + rnorm(n()), NA),
         mnth9 = ifelse(month == 9, a + rnorm(n()), NA),
         mnth10 = ifelse(month == 10, a + rnorm(n()), NA),
         mnth11 = ifelse(month == 11, a + rnorm(n()), NA),
         mnth12 = ifelse(month == 12, a + rnorm(n()), NA))

It works but it is inefficient. Any suggestions on how to improve this code?


Solution

  • library(dplyr); library(tidyr)
    sim_dat %>%
      mutate(col = paste0("mnth", month), num = a + rnorm(n())) %>%
      pivot_wider(names_from = col, values_from = num)
    

    Result

    # A tibble: 600 × 15
       month     a sim_index mnth1 mnth2 mnth3 mnth4 mnth5 mnth6 mnth7 mnth8 mnth9 mnth10 mnth11 mnth12
       <int> <dbl>     <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
     1     1    10         1  9.46 NA     NA    NA   NA    NA     NA    NA    NA    NA        NA     NA
     2     2    10         1 NA     9.35  NA    NA   NA    NA     NA    NA    NA    NA        NA     NA
     3     3    10         1 NA    NA     10.5  NA   NA    NA     NA    NA    NA    NA        NA     NA
     4     4    10         1 NA    NA     NA    10.9 NA    NA     NA    NA    NA    NA        NA     NA
     5     5    10         1 NA    NA     NA    NA    8.87 NA     NA    NA    NA    NA        NA     NA
     6     6    10         1 NA    NA     NA    NA   NA     8.12  NA    NA    NA    NA        NA     NA
     7     7    10         1 NA    NA     NA    NA   NA    NA     10.4  NA    NA    NA        NA     NA
     8     8    10         1 NA    NA     NA    NA   NA    NA     NA    10.4  NA    NA        NA     NA
     9     9    10         1 NA    NA     NA    NA   NA    NA     NA    NA    12.7  NA        NA     NA
    10    10    10         1 NA    NA     NA    NA   NA    NA     NA    NA    NA     9.12     NA     NA
    # ℹ 590 more rows