Search code examples
rsummary

How do I only report selected summary statistics in a table that lists variables as rows using R?


I have a dataset and I need to create a simple table with the number of observations, means, and standard deviations of all the variables (columns). I can't find a way to get only the required 3 summary statistics. Everything I tried keeps giving me min, max, median, 1st and 3rd quartiles, etc. The table should look something like this (with a title):

Table 1: Table Title
_______________________________________
Variables   Observations  Mean  Std.Dev
_______________________________________
Age            30          24      2
...             .          .       .
...             .          .       .
_______________________________________

The summary () does not work because it gives too many other summary statistics. I have done this:

sapply(dataset, function(x) list(means=mean(x,na.rm=TRUE), sds=sd(x,na.rm=TRUE)))

But how do I form the table from this? And is there a better way to do this than using "sapply"?


Solution

  • sapply does return the values that you want but it is not properly structured.

    Using mtcars data as an example :

    #Get the required statistics and convert the data into dataframe
    summ_data <- data.frame(t(sapply(mtcars, function(x) 
                         list(means = mean(x,na.rm=TRUE), sds = sd(x,na.rm=TRUE)))))
    #Change rownames to new column
    summ_data$variables <- rownames(summ_data)
    #Remove rownames
    rownames(summ_data) <- NULL
    #Make variable column as 1st column
    cbind(summ_data[ncol(summ_data)], summ_data[-ncol(summ_data)])
    

    Another way would be using dplyr functions :

    library(dplyr)
    
    mtcars %>%
      summarise(across(.fns = list(means = mean, sds = sd), 
                       .names = '{col}_{fn}')) %>%
      tidyr::pivot_longer(cols = everything(), 
                   names_to = c('variable', '.value'), 
                   names_sep = '_')
    
    # A tibble: 11 x 3
    #   variable   means     sds
    #   <chr>      <dbl>   <dbl>
    # 1 mpg       20.1     6.03 
    # 2 cyl        6.19    1.79 
    # 3 disp     231.    124.   
    # 4 hp       147.     68.6  
    # 5 drat       3.60    0.535
    # 6 wt         3.22    0.978
    # 7 qsec      17.8     1.79 
    # 8 vs         0.438   0.504
    # 9 am         0.406   0.499
    #10 gear       3.69    0.738
    #11 carb       2.81    1.62