Search code examples
rsimulationprobabilityrepeat

R: repeat experiment for 1000 times, saving variable in every repeat


I have simulation in which i create 2 groups of 100 patients in each group. Dataframe contain patients, type of intervention (R for reference and T for test) and recovery status (1 or 0). Then I need to make an experiment: make confidence interval for probabilities difference. I need to repeat experiment for 1000 times (every experiment creat a new sample group of patients) and saving results (df_prop2) in every repeat. At the end I should get a dataframe (df_prop2) with 1000 rows (1000 obs. of 7 variables). I dont understand why using my code I'm getting dataframe with 1 row and 1007 columns. Please help

This is my code:

library(dplyr)
library(tidyr)

set.seed(123)
p_R <- 0.1 #true probability for reference
p_T <- 0.2 #true probabiluty for Test
sample_size <- 100 #I have 100 patients in 2 groups

n_rep <- 1000 #number of repeats

df_all_repeats <- data.frame(
  n_exp = rep(1:n_rep, each = 2*sample_size),
  arm = rep(c('R', 'T'), each = sample_size),
  patient_ID = rep(1:sample_size, n_rep))

df_all_repeats <- df_all_repeats %>%
  group_by(n_exp) %>%
  mutate(recovery_status = c(sample(c(1,0), sample_size, replace = TRUE, prob = c(p_R, 1 - p_R)),
                      sample(c(1,0), sample_size, replace = TRUE, prob = c(p_T, 1 - p_T))))

#Experiment
df_prop2 <- df_all_repeats %>% 
  group_by(arm) %>% 
  group_by(n_exp) %>% 
  dplyr::summarise(x = sum(recovery_status),
                   n = n()) %>% 
  ungroup() %>% 
  dplyr::summarise(X = list(x), N = list(n)) %>% 
  rowwise() %>% 
  mutate(tst = list(broom::tidy(prop.test(X, N)))) %>% 
  unnest(tst) 
  
print(df_prop2)


Solution

  • Here's a solution which also simplifies the data generation:

    library(dplyr)
    library(tidyr)
    library(broom)
    
    p_R <- 0.1 # true probability for reference
    p_T <- 0.2 # true probabiluty for Test
    sample_size <- 100 # number of patients
    n_rep <- 1000 # number of repeats
    
    set.seed(123)
    df_all_repeats <- expand.grid(
      patient_id = 1:sample_size,
      arm = c("R", "T"),
      n_exp = 1:n_rep,
      stringsAsFactors = FALSE) %>% 
      as_tibble() %>% ## data is sorted n_exp, arm, patient_id
      mutate(recovery_status = c(sample(1:0, sample_size, replace = TRUE,
                                        prob = c(p_R, 1 - p_R)),
                                 sample(1:0, sample_size, replace = TRUE, 
                                        prob = c(p_T, 1 - p_T))),
             .by = n_exp)
    
    df_all_repeats %>%
      group_by(n_exp, arm) %>%
      summarise(x = sum(recovery_status), 
                n = n()) %>%
      summarise(broom::tidy(prop.test(x, n)))
    
    # # A tibble: 1,000 × 10
    #    n_exp estimate1 estimate2 statistic  p.value parameter conf.low conf.high method                                                           alternative
    #    <int>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>    <dbl>     <dbl> <chr>                                                            <chr>      
    #  1     1      0.07      0.19    5.35   0.0207           1   -0.222   -0.0183 2-sample test for equality of proportions with continuity corre… two.sided  
    #  2     2      0.1       0.19    2.58   0.108            1   -0.197    0.0168 2-sample test for equality of proportions with continuity corre… two.sided  
    #  3     3      0.09      0.27    9.79   0.00175          1   -0.294   -0.0665 2-sample test for equality of proportions with continuity corre… two.sided  
    #  4     4      0.08      0.21    5.81   0.0160           1   -0.236   -0.0241 2-sample test for equality of proportions with continuity corre… two.sided  
    #  5     5      0.07      0.21    7.02   0.00807          1   -0.244   -0.0358 2-sample test for equality of proportions with continuity corre… two.sided  
    #  6     6      0.03      0.18   10.4    0.00124          1   -0.242   -0.0576 2-sample test for equality of proportions with continuity corre… two.sided  
    #  7     7      0.08      0.16    2.32   0.128            1   -0.179    0.0194 2-sample test for equality of proportions with continuity corre… two.sided  
    #  8     8      0.12      0.15    0.171  0.679            1   -0.135    0.0746 2-sample test for equality of proportions with continuity corre… two.sided  
    #  9     9      0.08      0.28   12.2    0.000471         1   -0.313   -0.0872 2-sample test for equality of proportions with continuity corre… two.sided  
    # 10    10      0.11      0.13    0.0473 0.828            1   -0.120    0.0800 2-sample test for equality of proportions with continuity corre… two.sided  
    # # ℹ 990 more rows
    # # ℹ Use `print(n = ...)` to see more rows