I have created a randomised, multivariate dataset similar to the below:
library(JWileymisc)
library(MASS)
library(dplyr)
V <- matrix(c(1,0.2,0.7,
0.2,1,0.5,
0.7,0.5,1)
,3,3)
sigma <- c(60,30,45)
mu <- c(25,10,15)
Sigma <-cor2cov(V,sigma)
data <-data.frame(mvrnorm(n=5,mu,Sigma,3,3))
data <- rename(data,outcome=X1,time=X2,exposure=X3)
data$exposure <- if_else(data$exposure>15,2,1)
I then want to use this randomised dataset as a template for creating multiple simulated datasets. Is there an easy way to do this using a loop? So far I have tried the following:
NSIM <- 10 #Number of data sets to simulate
set.seed(3465)
simulated_data <- rep(0, NSIM)
for (m in 1:NSIM) {
simulated_data[m] <- data.frame(mvrnorm(n=5,mu,Sigma,3,3))
}
However, this doesn't give me what I'm looking for, and I'm struggling to apply the rename/if_else steps from above to each simulated dataset. Any help would be most appreciated!
Here I am using purrr::map_dfr to bind all the data frames produced by the simulation into one (they remain identifiable and splittable by SN). You can then perform the common operations, such as rename and mutate(exposure, ...), once on the merged data frame. Eventually, you can split it back into separate data sets using e.g. group_by(SN) followed by group_split().
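library(MASS)   # provides mvrnorm()
library(purrr)
library(dplyr)
# mu and Sigma are as defined in the question
NSIM <- 10 # Number of data sets to simulate
set.seed(3465)
map_dfr(1:NSIM,
        ~data.frame(mvrnorm(n=5,mu,Sigma,3,3)),
        .id = "SN") |>   # SN records which simulation each row came from
  rename(outcome=X1,time=X2,exposure=X3) |>
  mutate(exposure = if_else(exposure>15,2,1)) |>
  sample_n(10)   # print 10 random rows of the combined data frame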
##>    SN     outcome        time exposure
##> 1   2 -35.3326059   3.2585097        1
##> 2   9  25.7304365   7.7347147        2
##> 3   7  68.3215424  -1.9466048        1
##> 4   2  77.4440558  61.0617621        2
##> 5   6 -46.3029760 -11.2115067        1
##> 6   9  92.7289232  52.0273595        2
##> 7  10   0.3859393  53.2966179        2
##> 8  10 -17.2009480 -29.6117604        1
##> 9   5  91.5425904  48.2142412        2
##> 10  3  73.6991481   0.8617149        2
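The final splitting step mentioned above could look roughly like the sketch below. It makes a few assumptions that are not in the original code. Sigma is rebuilt with diag(sigma) %*% V %*% diag(sigma), which I assume is equivalent to cor2cov(V, sigma), so that the block runs without JWileymisc. The trailing 3, 3 arguments to mvrnorm are left out, because they would be matched positionally to tol and empirical. SN is converted to integer so that the groups sort numerically rather than as text. The names sims and sim_list are only illustrative.

```r
library(MASS)   # mvrnorm()
library(dplyr)
library(purrr)

# Inputs from the question, rebuilt here so the sketch is self-contained.
V <- matrix(c(1, 0.2, 0.7,
              0.2, 1, 0.5,
              0.7, 0.5, 1), 3, 3)
sigma <- c(60, 30, 45)
mu <- c(25, 10, 15)
# Assumption: same result as JWileymisc::cor2cov(V, sigma).
Sigma <- diag(sigma) %*% V %*% diag(sigma)

NSIM <- 10  # number of data sets to simulate
set.seed(3465)

# One long data frame; SN identifies the simulation each row belongs to.
sims <- map_dfr(1:NSIM,
                ~ data.frame(mvrnorm(n = 5, mu = mu, Sigma = Sigma)),
                .id = "SN") |>
  rename(outcome = X1, time = X2, exposure = X3) |>
  mutate(SN = as.integer(SN),                       # numeric group order
         exposure = if_else(exposure > 15, 2, 1))

# Split back into a list with one data frame per simulation.
sim_list <- sims |>
  group_by(SN) |>
  group_split()

length(sim_list)  # number of simulated data sets
sim_list[[1]]     # the first simulated data set
```

Each element of sim_list can then be passed to a model-fitting function with map() or lapply(). If you prefer a named list, split(sims, sims$SN) in base R is an alternative to group_split().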