Get the frequency percentages for multiple variables in a dataset


I am writing a function that produces a frequency table of percentages using prop.table, and I want to apply it to several categorical variables in a dataset.

I am using datasets::mtcars for this example. I would like the function to group by the binary variable "am", so that the output is stratified by am == 1 and am == 0.

Is there a way to add a group_by step to the following code?

summary(mtcars)

apply(mtcars[c("cyl", "gear", "carb")], 2,
      \(x) prop.table(table(x, useNA = "always")) * 100)

Solution

  • You can use by() in base R:

    by(mtcars, mtcars$am, function(mt_am) 
      sapply(mt_am[c("cyl", "gear", "carb")], 
            function(mt_col) prop.table(table(mt_col, useNA = "always"))*100))
    
    #> mtcars$am: 0
    #> $cyl
    #> mt_col
    #>        4        6        8     <NA> 
    #> 15.78947 21.05263 63.15789  0.00000 
    #> 
    #> $gear
    #> mt_col
    #>        3        4     <NA> 
    #> 78.94737 21.05263  0.00000 
    #> 
    #> $carb
    #> mt_col
    #>        1        2        3        4     <NA> 
    #> 15.78947 31.57895 15.78947 36.84211  0.00000 
    #> 
    #> ------------------------------------------------------------ 
    #> mtcars$am: 1
    #> $cyl
    #> mt_col
    #>        4        6        8     <NA> 
    #> 61.53846 23.07692 15.38462  0.00000 
    #> 
    #> $gear
    #> mt_col
    #>        4        5     <NA> 
    #> 61.53846 38.46154  0.00000 
    #> 
    #> $carb
    #> mt_col
    #>         1         2         4         6         8      <NA> 
    #> 30.769231 30.769231 23.076923  7.692308  7.692308  0.000000
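
As a possible alternative to by(), the same stratified percentages can be read off a two-way table: cross-tabulate each variable against am and ask prop.table() for row proportions (margin = 1). This is only a sketch in base R; the names vars and pct_by_am are made up for illustration, and useNA = "ifany" is an assumption that shows an NA column only when missing values are actually present.

```r
# Columns to tabulate; the same three used in the question
vars <- c("cyl", "gear", "carb")

# For each variable, cross-tabulate against am and convert every row
# (one row per am value) into percentages that sum to 100
pct_by_am <- lapply(mtcars[vars], function(col) {
  tab <- table(am = mtcars$am, value = col, useNA = "ifany")
  prop.table(tab, margin = 1) * 100
})

# One table per variable, with am == 0 and am == 1 as rows
round(pct_by_am$cyl, 2)
```

Because each variable is tabulated once across both groups, the am == 0 and am == 1 rows share the same set of categories, with 0 where a category does not occur in a group. That makes the two groups easier to compare side by side than the separate per-group tables returned by by().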