Search code examples
rdplyrsummarize

Using summarize_all with colMeans and colVar to create pivoted table in R


I want to use summarize_all on the following data and create my desired output, but I was curious how to do this the tidy way using some combination of mutate and summarize I think? Any help appreciated!!

dummy <- tibble(
  a = 1:10,
  b = 100:109,
  c = 1000:1009
)

Desired Output

tibble(
  Mean = colMeans(dummy[1:3]),
  Variance = colVars(as.matrix(dummy[1:3])),
  CV = Variance/Mean
)
    Mean Variance      CV
   <dbl>    <dbl>   <dbl>
1    5.5     9.17 1.67   
2  104.      9.17 0.0877 
3 1004.      9.17 0.00913

Solution

  • It would be easier to reshape to 'long' format and then do it once after grouping by 'name'

    library(dplyr)
    library(tidyr)
    pivot_longer(dummy, cols = everything()) %>% 
      group_by(name) %>% 
      summarise(Mean = mean(value), Variance = var(value), CV = Variance/Mean) %>%
      select(-name)
    # A tibble: 3 x 3
    #    Mean Variance      CV
    #   <dbl>    <dbl>   <dbl>
    #1    5.5     9.17 1.67   
    #2  104.      9.17 0.0877 
    #3 1004.      9.17 0.00913
    

    Or either use summarise_all or summarise/across, but the output would be a single row, then do the reshaping

    dummy %>% 
       summarise(across(everything(), list(Mean = mean,
             Variance = var, CV = ~ mean(.)/var(.)))) %>%
       pivot_longer(everything()) %>% 
       separate(name, into = c('name', 'name2')) %>% 
       pivot_wider(names_from = name2, values_from = value)