I would like to summarise data as mean and sd, or as median with lower and upper quartiles, depending on whether the variables are normally distributed or not.
Using mtcars as an example, this is how I am doing it one variable at a time:
sum <- mtcars %>%
  group_by(am) %>%
  summarise(MPG = paste0(mean(qsec), " (", sd(qsec), ")"))
I'd like to do something like this:
norm = c("qsec", "drat", "hp", "mpg")
sum= mtcars%>%
group_by(am)%>%
summarise(across(where(. %in% norm), . = paste0(mean(.,na.rm = T), " (", sd(.,na.rm=T) , ")") )
)
and add the relevant line for median and quartiles.
I would also be happy with a for loop solution, perhaps followed by rbind.
I suppose you want to do something like this:
library("dplyr")

norm <- c("qsec", "drat", "hp", "mpg")

my_summary <- mtcars |>
  group_by(am) |>
  summarise(
    # Columns listed in norm: mean and standard deviation
    across(
      all_of(norm),
      ~ paste0(mean(.x, na.rm = TRUE), "(sd=", sd(.x, na.rm = TRUE), ")")
    ),
    # All remaining columns: median with lower and upper quartile
    across(
      !all_of(norm),
      ~ paste0(
        median(.x, na.rm = TRUE), "(",
        quantile(.x, 1/4, na.rm = TRUE), " - ",
        quantile(.x, 3/4, na.rm = TRUE), ")"
      )
    )
  )
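You can simply use all_of to select the columns you want from norm, or negate it with ! to select all the remaining columns. The grouping column am is left out of both selections automatically.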
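If the unrounded numbers make the table hard to read, the same pattern can be wrapped in two small formatting helpers. This is only a sketch: mean_sd, median_iqr, and the digits argument are hypothetical names introduced here for illustration, not part of dplyr, and the two-decimal default is an assumption.

```r
library(dplyr)

norm <- c("qsec", "drat", "hp", "mpg")

# Hypothetical helper (not part of dplyr): "mean (sd)" with fixed decimals.
mean_sd <- function(x, digits = 2) {
  m <- formatC(mean(x, na.rm = TRUE), format = "f", digits = digits)
  s <- formatC(sd(x, na.rm = TRUE), format = "f", digits = digits)
  paste0(m, " (", s, ")")
}

# Hypothetical helper: "median (Q1 - Q3)" with fixed decimals.
median_iqr <- function(x, digits = 2) {
  # names = FALSE drops the "25%" style names from the result
  q <- quantile(x, c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  q <- formatC(q, format = "f", digits = digits)
  paste0(q[2], " (", q[1], " - ", q[3], ")")
}

my_summary <- mtcars |>
  group_by(am) |>
  summarise(
    across(all_of(norm), mean_sd),      # treated as normally distributed
    across(!all_of(norm), median_iqr)   # everything else
  )

print(my_summary)
```

Passing the helpers by name keeps the summarise() call short, and the number of decimals can be changed in one place, for example with across(all_of(norm), \(x) mean_sd(x, digits = 1)).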