
Reframing output of confidence intervals to combine mean, upper and lower values into one cell


I use the code below to calculate the mean and the lower and upper confidence limits of multiple variables at once.

library(gmodels) # provides ci()
library(dplyr)   # provides %>%, group_by(), summarize(), across()
dfci <- df %>% 
  group_by(group) %>% 
  summarize(across(everything(),
    # plain arithmetic mean (a trim of 0.5 or more would make mean() return the median)
    .fns = list(mean = ~ mean(.x, na.rm = TRUE),
                ci = ~ ci(.x, confidence = 0.95, na.rm = TRUE))))
#dfci <- dfci[-(13:16),] # remove additional rows
write.csv(dfci, file = "dfci.csv")

Sample data:

Group   A_pre   A_post   B_pre   B_post
0       20      21       20      23
1       30      10       19      11
2       10      53       30      34
1       22      32       25      20
2       34      40       32      30
0       30      50       NA      40
0       39      40       19      20
1       40      NA       20      20
2       50      10       20      10
0       34      23       30      10
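
For anyone who wants to reproduce this, the sample data can be entered roughly as follows. One assumption: the grouping column is written in lowercase as `group` here so that it matches `group_by(group)` in the code, whereas the table header shows `Group`.

```r
# Sample data from the table above.
# Assumption: the grouping column is named `group` (lowercase) to match the code.
df <- data.frame(
  group  = c(0, 1, 2, 1, 2, 0, 0, 1, 2, 0),
  A_pre  = c(20, 30, 10, 22, 34, 30, 39, 40, 50, 34),
  A_post = c(21, 10, 53, 32, 40, 50, 40, NA, 10, 23),
  B_pre  = c(20, 19, 30, 25, 32, NA, 19, 20, 20, 30),
  B_post = c(23, 11, 34, 20, 30, 40, 20, 20, 10, 10)
)

str(df)  # 10 observations of 5 variables, with one NA in A_post and one in B_pre
```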

Since I have more than 50 "pre" and "post" variables (over 100 columns in total), is it possible to combine the three values (mean, lower CI and upper CI) into one cell per variable, so that I don't have to combine them all manually?

I tried pivoting to long format after the CI calculations, but it doesn't work:


library(reshape2)

dfci <- df %>%
  group_by(group) %>%
  summarize(across(everything(),
                   .fns = list(mean = ~ mean(.x, na.rm = TRUE),
                               ci = ~ ci(.x, confidence = 0.95, na.rm = TRUE))))

dfci <- melt(dfci, id.vars = "group")
dfci <- dcast(dfci, group + variable ~ variable)

write.csv(dfci, file = "dfi.csv", row.names = FALSE)

Solution

  • Unfortunately, the earlier answers did not work because they repeated the same CI throughout.

    This code does the job:

    library(dplyr)
    
    dfci <- df %>%
      group_by(group) %>%
      summarise(across(everything(), list(
        # plain arithmetic mean (a trim of 0.5 or more would make mean() return the median)
        mean = ~ mean(.x, na.rm = TRUE),
        ci = ~ { # own CI function: returns one "[lower, upper]" string per column
          n <- sum(!is.na(.x))                    # number of non-missing observations
          se <- sqrt(var(.x, na.rm = TRUE) / n)   # standard error of the mean
          mean_val <- mean(.x, na.rm = TRUE)
          margin <- qt(0.975, df = n - 1) * se    # 95% t interval uses n - 1 degrees of freedom
          paste0("[", round(mean_val - margin, 2), ", ", round(mean_val + margin, 2), "]")
        }
      ), .names = "{.col}_{.fn}")) %>%
      ungroup()
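
The code above keeps the mean in its own column next to the bracketed interval. If the mean and both limits should share a single cell, as the question asks, a helper along these lines might work. This is only a sketch: `mean_ci_cell` is a hypothetical name (not part of dplyr or gmodels), the interval is a t-based interval with n - 1 degrees of freedom, and the data frame repeats the sample data with the grouping column assumed to be called `group`.

```r
library(dplyr)

# Hypothetical helper: format one numeric vector as "mean [lower, upper]".
mean_ci_cell <- function(x, level = 0.95, digits = 2) {
  x <- x[!is.na(x)]                      # drop missing values
  n <- length(x)
  if (n < 2) return(NA_character_)       # no interval from fewer than two values
  m <- mean(x)
  margin <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  paste0(fmt(m), " [", fmt(m - margin), ", ", fmt(m + margin), "]")
}

# Sample data from the question (grouping column assumed to be `group`).
df <- data.frame(
  group  = c(0, 1, 2, 1, 2, 0, 0, 1, 2, 0),
  A_pre  = c(20, 30, 10, 22, 34, 30, 39, 40, 50, 34),
  A_post = c(21, 10, 53, 32, 40, 50, 40, NA, 10, 23),
  B_pre  = c(20, 19, 30, 25, 32, NA, 19, 20, 20, 30),
  B_post = c(23, 11, 34, 20, 30, 40, 20, 20, 10, 10)
)

# One row per group, one character cell per variable.
dfci <- df %>%
  group_by(group) %>%
  summarise(across(everything(), mean_ci_cell), .groups = "drop")

print(dfci)
```

Because every cell is already a single string, the result can go straight to `write.csv(dfci, file = "dfci.csv", row.names = FALSE)` with no manual combining, however many "pre" and "post" columns there are.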