I am trying to create a table with preset dimensions and have R fill in the counts and percentages. This is for an R-markdown report.
Here is the code for my sample data.
#This is the most realistic data I could produce.
Maj <- rep("Major A", times=50)
set.seed(24601)
Race <- sample(c("Asian","Black", "Am Indian","Hawiian" ,"Hispanic","White","Two or More Races","Not Reported"),
prob=c(.01,.1,.01,.01,.02,.80,.05,.01),size=50, replace = T)
Sex <- sample(c("Female","Male"), prob=c(.98,.02),size=50,replace=T)
Enroll_MajorA <- cbind(Maj,Sex,Race)
I need the table to calculate a count and a percent whether or not a given Race and Sex combination exists in the data set. Here is what I need it to look like.
I have tried computing each value for the table individually and R-markdown gave me a "memory error". I have tried creating a count and percent table and combining them together but it does not give all of the Race/Sex combinations I need for the report. I am not sure where to go from here. Please Help!
You may use aggregate
. You can keep the matrix untouched, since you can use as.data.frame
, which coerces automatically into countable factors. The NROW
(capitals) doesn't discriminate between matrices and vectors.
m.agg <- do.call(data.frame,
aggregate(. ~ Sex + Race, as.data.frame(Enroll_MajorA), function(x)
c(count=as.integer(NROW(x)), share=NROW(x) / NROW(Enroll_MajorA))))
To get the complete set, we may merge with an expand.grid
, which we may want to clean up a little.
res <- merge(as.data.frame(m.agg), expand.grid(Sex=c("Female", "Male"),
Race=relevant.races), all=TRUE) # `relevant.races` below
res[, 3:4][is.na(res[, 3:4])] <- 0 # transform `NA` into 0 to be nice
res[order(res[, "Race"]), ] # order output
# Sex Race Maj.count Maj.share
# 1 Female Black 2 0.04
# 10 Male Black 0 0.00
# 2 Female Hawiian 1 0.02
# 3 Female Hispanic 1 0.02
# 11 Male Hispanic 0 0.00
# 4 Female Two or More Races 2 0.04
# 12 Male Two or More Races 0 0.00
# 5 Female White 44 0.88
# 13 Male White 0 0.00
# 6 Female Asian 0 0.00
# 14 Male Asian 0 0.00
# 7 Female Am. Indian 0 0.00
# 15 Male Am. Indian 0 0.00
# 8 Female Hawaiian 0 0.00
# 16 Male Hawaiian 0 0.00
# 9 Female Not Reported 0 0.00
# 17 Male Not Reported 0 0.00
relevant.races <- c("Asian","Black", "Am. Indian", "Hawaiian" , "Hispanic", "White",
"Two or More Races", "Not Reported")
Enroll_MajorA <- structure(c("Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "White", "White",
"White", "Hawiian", "White", "White", "White", "White", "White",
"White", "White", "White", "White", "Two or More Races", "White",
"White", "White", "White", "White", "White", "White", "Hispanic",
"White", "White", "White", "White", "White", "White", "Two or More Races",
"White", "White", "White", "White", "White", "White", "White",
"White", "Black", "White", "White", "Black", "White", "White",
"White", "White", "White", "White", "White", "White", "White"
), .Dim = c(50L, 3L), .Dimnames = list(NULL, c("Maj", "Sex",
"Race")))