Search code examples
rif-statementstatistics-bootstrapglmmr-formula

Adding an ifelse statement to a bootstrap function to set times model fails to converge equal to NA


rather complex question. I am running a GLMM on an experiment with repeated measurements on individuals. Due to relatively low N, and to increase transparency on the model itself, I designed a bootstrap formula to sample by individual with replacement equal to the total number of individuals included. This is the formula:

hormonedf <- as.data.table(hormonedf)

sexsteroidbootstrap <- function(df, n) {
  samp <- sample(unique(df$ID), n, replace = TRUE)
  setkey(df, "ID")
  df_sample <- df[J(samp), allow.cartesian = TRUE]
  model <- glmer(formula = Behaviors ~ 1 + Age + Treatment + Age:Treatment + (1|ID), data = df_sample, family=poisson)
  summary <- summary(model)
  Intercept <- summary$coefficients[1,1]
  Age_Coef <- summary$coefficients[2,1]
  EE_Coef <- summary$coefficeints[3,1]
  KT_Coef <- summary$coefficeints[4,1]
  EEAge_Coef <- summary$coefficeints[5,1]
  KTAge_Coef <- summary$coefficeints[6,1]
  pAge <- summary$coefficients[2,4]
  pEE <- summary$coefficients[3,4]
  pKT <- summary$coefficients[4,4]
  pEEAge <- summary$coefficients[5,4]
  pKTAge <- summary$coefficients[6,4]
  coefs <- rbind(Intercept, Age_Coef, EE_Coef, KT_Coef, EEAge_Coef, KTAge_Coef, pAge, pEE, pKT, pEEAge, pKTAge)
  return(coefs)
}

#actual bootstrap 
bootstrapresults <- as.data.frame(replicate(1000, sexsteroidbootstrap(hormonedf, 29)))
bootstrapresults <- t(bootstrapresults)
bootstrapresults <- as.data.frame(bootstrapresults)

However, there were certain days across the experiment where I could not include data from certain days from a particular individual. This isn't an issue across the full dataset, but in a bootstrap, there are a small number of random times where the IDs sampled just happen to be the incomplete ones, leading to a small number of 'failed to converge' errors...typically about 10-15 per set of 1000 replications. This is fine, but when I go to save my coefficients, due to the warnings the formula cannot save the output for the entire bootstrap.

My desire: to include an ifelse statement (or any other solution) within the formula to either skip samples where I get that error, or set all of those coefficients equal to NA. Any help would be appreciated!

The data:

    structure(list(Treatment = c("EE", "EE", "EE", "EE", "EE", "EE", 
"EE", "EE", "KT", "KT", "KT", "KT", "DMSO", "DMSO", "DMSO", "DMSO", 
"DMSO", "KT", "KT", "EE", "EE", "DMSO", "DMSO", "KT", "KT", "KT", 
"KT", "KT", "KT", "KT", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", 
"DMSO", "DMSO", "EE", "EE", "EE", "EE", "EE", "EE", "EE", "DMSO", 
"DMSO", "DMSO", "DMSO", "DMSO", "KT", "KT", "KT", "KT", "KT", 
"KT", "KT", "KT", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", 
"DMSO", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", "EE", "EE", "EE", 
"EE", "EE", "EE", "EE", "KT", "KT", "KT", "KT", "KT", "KT", "EE", 
"EE", "EE", "EE", "EE", "EE", "EE", "KT", "KT", "KT", "KT", "KT", 
"KT", "KT", "EE", "EE", "EE", "EE", "EE", "EE", "DMSO", "DMSO", 
"DMSO", "DMSO", "DMSO", "DMSO", "EE", "EE", "EE", "EE", "EE", 
"EE", "EE", "KT", "KT", "KT", "KT", "KT", "KT", "KT", "EE", "EE", 
"EE", "EE", "EE", "EE", "EE", "KT", "KT", "KT", "KT", "KT", "KT", 
"KT", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", "DMSO", 
"EE", "EE", "EE", "EE", "EE", "EE", "EE"), Age = c(22, 24, 30, 
15, 17, 17, 20, 24, 17, 20, 24, 27, 15, 17, 20, 24, 27, 20, 20, 
17, 20, 17, 20, 15, 17, 20, 22, 24, 27, 30, 15, 17, 20, 22, 24, 
27, 30, 15, 17, 20, 22, 24, 27, 30, 15, 17, 22, 27, 30, 17, 15, 
17, 20, 22, 24, 27, 30, 17, 20, 22, 27, 30, 15, 17, 20, 22, 24, 
27, 30, 15, 17, 20, 22, 24, 27, 30, 15, 17, 22, 24, 27, 30, 15, 
17, 20, 22, 24, 27, 30, 15, 17, 20, 22, 24, 27, 30, 15, 17, 20, 
22, 24, 27, 15, 17, 20, 22, 24, 27, 15, 17, 20, 22, 24, 27, 30, 
15, 17, 20, 22, 24, 27, 30, 15, 17, 20, 22, 24, 27, 30, 15, 17, 
20, 22, 24, 27, 30, 15, 17, 20, 22, 24, 27, 30, 15, 17, 20, 22, 
24, 27, 30), Behaviors = c(30, 85, 191, 3, 0, 2, 48, 86, 1, 34, 
50, 0, 0, 7, 76, 89, 20, 47, 59, 0, 81, 2, 13, 0, 4, 169, 28, 
17, 8, 224, 1, 1, 136, 95, 24, 7, 130, 0, 1, 147, 44, 174, 169, 
197, 0, 2, 9, 10, 103, 0, 0, 0, 32, 93, 115, 323, 225, 39, 12, 
17, 1, 109, 1, 0, 0, 36, 13, 68, 19, 0, 0, 1, 11, 37, 0, 5, 0, 
0, 4, 123, 3, 117, 0, 15, 19, 3, 4, 118, 96, 0, 0, 12, 1, 2, 
228, 203, 2, 1, 3, 3, 0, 84, 0, 0, 4, 2, 2, 248, 3, 2, NA, NA, 
NA, 21, 116, 1, 1, 50, 160, 73, 21, 74, NA, 22, 27, 217, 7, 80, 
124, 0, 1, 41, 68, 143, 161, 29, 1, 0, 117, 211, 1, NA, NA, 2, 
55, 171, 263, 10, 10, 96), ID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 
3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 7L, 8L, 8L, 9L, 9L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 
14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 
17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 18L, 
18L, 19L, 19L, 19L, 19L, 19L, 19L, 20L, 20L, 20L, 20L, 20L, 20L, 
20L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 
22L, 23L, 23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 24L, 
24L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 26L, 26L, 26L, 26L, 26L, 
26L, 26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 28L, 
28L, 28L, 28L, 29L, 29L, 29L, 29L, 29L, 29L, 29L)), row.names = c(NA, 
-150L), class = c("tbl_df", "tbl", "data.frame"))

Solution

  • Consider tryCatch to return NULL on errors which will fall out in outer as.data.frame calls:

    sexsteroidbootstrap <- function(df, n) {
      samp <- sample(unique(df$ID), n, replace = TRUE)
      setkey(df, "ID")
      df_sample <- df[J(samp), allow.cartesian = TRUE]
      
      tryCatch(
        {
          model <- glmer(
            formula = Behaviors ~ 1 + Age + Treatment + Age:Treatment + (1|ID), 
            data = df_sample, 
            family = poisson
          )
          summary <- summary(model)
          
          rbind(
            Intercept = summary$coefficients[1,1],
            Age_Coef = summary$coefficients[2,1],
            EE_Coef = summary$coefficeints[3,1],
            KT_Coef = summary$coefficeints[4,1],
            EEAge_Coef = summary$coefficeints[5,1],
            KTAge_Coef = summary$coefficeints[6,1],
            pAge = summary$coefficients[2,4],
            pEE = summary$coefficients[3,4],
            pKT = summary$coefficients[4,4],
            pEEAge = summary$coefficients[5,4],
            pKTAge = summary$coefficients[6,4]
          )
        }, error = \(e) {
          print(e)
          return(NULL)
        }
      )
    }