Trying to get frequency counts and percent by group for each column of a data frame in R


I have data that look like this:

   pat# sex race    group   bmi
    1   F   Black   1   4
    2   M   Asian   2   8
    3   M   Asian   3   19
    4   M   Asian   1   35
    5   F   Black   2   12
    6   F   Black   3   33
    7   M   White   1   2
    8   F   Black   2   35
    9   M   Asian   3   6
    10  F   Black   1   13
    11  F   Black   2   18
    12  F   Asian   3   1
    13  M   White   1   36
    14  F   Asian   2   25
    15  M   White   3   6
    16  M   White   1   20
    17  F   Black   2   3
    18  M   Asian   3   23
    19  F   Black   1   26
    20  F   Asian   2   13
    21  M   White   3   21
    22  M   White   1   16
    23  F   Black   2   29
    24  F   Black   3   19
    25  M   Asian   1   17
    26  M   Asian   2   22
    27  F   Black   3   26

I would like to get the frequency of each variable and percent by group of each variable, like this:

        n           1   2   3
sex M   frequency   %   %   %
    F   frequency   %   %   %

next variable:

                n          1    2   3
race    White   frequency   %   %   %
        Asian   frequency   %   %   %
        Black   frequency   %   %   %

There are a lot of variables, so I would rather not list each one. I've tried selecting the columns by index (df[2:30]) with xtabs() and with the dplyr package, but I can't get it to work. It doesn't matter which package or function is used, but I would like the solution to be flexible enough for future data with different column names and different dimensions. Any advice is greatly appreciated!


Solution

  • I was able to do this using the table() function and the tigerstats package. The main problem I was having was that R treats a dataset imported from SAS differently than one read from a CSV file. Night and day!
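
The answer does not show its code, so the following is only a minimal base-R sketch of one way to get this kind of output with table() and prop.table(). It is not necessarily what the poster ran. The helper name freq_by_group is hypothetical, the data frame holds just the first six rows of the question's data, and "percent by group" is assumed to mean column percentages (the share of each level within each group). Only the categorical columns are included, on the assumption that an ID column and a continuous column such as bmi would be left out.

```r
# First six rows of the question's data, categorical columns only.
df <- data.frame(
  sex   = c("F", "M", "M", "M", "F", "F"),
  race  = c("Black", "Asian", "Asian", "Asian", "Black", "Black"),
  group = c(1, 2, 3, 1, 2, 3)
)

# Hypothetical helper: overall frequency of each level of `var`,
# plus the percent each level makes up within each group.
freq_by_group <- function(data, var, group = "group") {
  counts <- table(data[[var]], data[[group]])
  # margin = 2 gives column percentages; use margin = 1 for row percentages.
  pct <- round(100 * prop.table(counts, margin = 2), 1)
  cbind(n = rowSums(counts), pct)
}

# Apply the helper to every column except the grouping column,
# so no variable has to be listed by name.
vars <- setdiff(names(df), "group")
result <- lapply(setNames(vars, vars), function(v) freq_by_group(df, v))
result
```

Because the variable names are taken from names(df), the same call should carry over to a data frame with different columns, provided the grouping column name is passed in. The tigerstats functions rowPerc() and colPerc() can be applied to an xtabs() table for a similar result.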