I have data that look like this:
pat# sex race group bmi
1 F Black 1 4
2 M Asian 2 8
3 M Asian 3 19
4 M Asian 1 35
5 F Black 2 12
6 F Black 3 33
7 M White 1 2
8 F Black 2 35
9 M Asian 3 6
10 F Black 1 13
11 F Black 2 18
12 F Asian 3 1
13 M White 1 36
14 F Asian 2 25
15 M White 3 6
16 M White 1 20
17 F Black 2 3
18 M Asian 3 23
19 F Black 1 26
20 F Asian 2 13
21 M White 3 21
22 M White 1 16
23 F Black 2 29
24 F Black 3 19
25 M Asian 1 17
26 M Asian 2 22
27 F Black 3 26
I would like to get the frequency of each variable's levels and the percent by group for each variable, like this:
              n          1  2  3
sex   M       frequency  %  %  %
      F       frequency  %  %  %

next variable:

              n          1  2  3
race  White   frequency  %  %  %
      Asian   frequency  %  %  %
      Black   frequency  %  %  %
There are a lot of variables, so I would rather not list each one. I've tried using R's vector indexing (df[2:30]) with xtabs() and the dplyr package, but I can't get it to work. It doesn't matter which package or function is used, but I would like the solution to be flexible enough for future data that use different column names and have different dimensions. Any advice is greatly appreciated!!
I was able to do this using the table() function and the tigerstats package. The main problem I was having was that R treats a SAS dataset differently from a CSV dataset. Night and day!
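For anyone landing here later, this is a rough base-R sketch of the table() idea, not the exact code I ended up with. It uses only table() and prop.table() rather than tigerstats. The helper name freq_by_group is made up for illustration, the toy data are just the first nine rows from the question, and I am assuming "percent by group" means column percentages (each level's share within a group). It also assumes the categorical columns are stored as character or factor; if an imported dataset stores them as some other type, the type test would need adjusting.

```r
# Toy data: the first nine rows from the question.
df <- data.frame(
  pat   = 1:9,
  sex   = c("F", "M", "M", "M", "F", "F", "M", "F", "M"),
  race  = c("Black", "Asian", "Asian", "Asian", "Black",
            "Black", "White", "Black", "Asian"),
  group = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  bmi   = c(4, 8, 19, 35, 12, 33, 2, 35, 6)
)

# Hypothetical helper: total count per level plus within-group percentages.
freq_by_group <- function(data, var, group = "group") {
  counts <- table(data[[var]], data[[group]])
  # margin = 2 gives column percentages; use margin = 1 for row percentages.
  pct <- round(100 * prop.table(counts, margin = 2), 1)
  cbind(n = rowSums(counts), pct)
}

# Select categorical columns by type, so no column names are hard-coded.
is_cat <- vapply(df, function(x) is.character(x) || is.factor(x), logical(1))
vars <- setdiff(names(df)[is_cat], "group")

# One frequency/percent matrix per categorical variable.
result <- lapply(setNames(vars, vars), freq_by_group, data = df)
result
```

Because the columns are picked by type, the same call should work on data with different column names or a different number of columns, as long as the grouping column is passed through the group argument.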