I need some help creating a matrix. I have a large dataset with multiple groups, and each group is split into cases and non-cases.
For example:
Group | Cases | Noncases |
---|---|---|
GroupA | 4 | 7 |
GroupB | 9 | 4 |
GroupC | 10 | 3 |
I want to create a matrix which will compare one group to the sum of the other groups.
For instance:
Disease Category | GroupA | NotGroupA |
---|---|---|
Case | 4 | 19 |
Noncase | 7 | 7 |
The goal is to set up a matrix that will allow me to run a chi-square test and/or a Fisher's exact test (depending on sample size).
I have tried the following code to extract values from my data frame into a matrix:
GroupA <- as.table(matrix(c(df[1,3], df[1,4], (sum(df$group_cases)-df$group_cases[1])), (sum(df$Noncases)-df$Noncases[1])), nrow=2, ncol=2,
dimnames=list(Group= c("A", "Other"),
Case = c(1, 0)))
However, I get the following warning:
Warning message:
In matrix(c(df[1, 3], df[1, 4], (sum(df$group_cases) - :
data length [3] is not a sub-multiple or multiple of the number of rows [329]
It outputs a 329-row list instead of a 2-by-2 matrix.
Because I have many groups, I want R to calculate the values for me when constructing the matrix. I don't want to calculate the "NotGroup_" column separately, as that leaves room for human error.
How would you all recommend constructing this matrix, and is it possible to have R calculate the sums of columns/subtract values while creating a matrix?
Thank you for your help!
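On why the original attempt produces that warning: the parenthesis that closes `c(` falls after the third value, so `c()` holds only three numbers. The fourth expression is then passed to `matrix()` as `nrow` (evidently 329 for the real data), and `nrow=2, ncol=2, dimnames=...` end up as arguments to `as.table()` instead. A corrected version of that call might look like the sketch below. The data frame here is a hypothetical stand-in with assumed column names `group_cases` and `Noncases`, since the real `df` is not shown, and `byrow = TRUE` is added so the values line up with the `Group`/`Case` dimnames.

```r
# Hypothetical stand-in for the asker's data frame (column names assumed)
df <- data.frame(Group = c("A", "B", "C"),
                 group_cases = c(4, 9, 10),
                 Noncases = c(7, 4, 3))

# All four values now sit inside c(), and nrow/ncol/dimnames go to matrix()
GroupA <- as.table(matrix(
  c(df$group_cases[1], df$Noncases[1],
    sum(df$group_cases) - df$group_cases[1],
    sum(df$Noncases) - df$Noncases[1]),
  nrow = 2, ncol = 2,
  byrow = TRUE,  # fill row by row: first row is group A, second row is the rest
  dimnames = list(Group = c("A", "Other"),
                  Case = c(1, 0))
))
GroupA
```

This still hard-codes the row index, so it has to be repeated for every group; wrapping the logic in a function avoids that.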
Set up example:
dd <- data.frame(Group = LETTERS[1:3], Cases = c(4, 9, 10),
                 Noncases = c(7, 4, 3))
Function:
mktab <- function(focal, data) {
  ## subset rows according to whether $Group == focal or not
  ## subset cols according to "Cases"/"Noncases"
  ## sum() the not-focal elements
  matrix(c(data[data$Group == focal, "Cases"],
           sum(data[data$Group != focal, "Cases"]),
           data[data$Group == focal, "Noncases"],
           sum(data[data$Group != focal, "Noncases"])
           ),
         nrow = 2,
         byrow = TRUE,
         dimnames = list(c("Case", "Noncase"),
                         c(focal, paste0("not_", focal)))
         )
}
mktab("A", dd)
Results:
        A not_A
Case    4    19
Noncase 7     7
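Since the question mentions many groups and a choice between the chi-square and Fisher's exact tests, one possible extension is to build the table for every group and pick the test from the expected counts. This is only a sketch: `test_group` and `min_expected` are hypothetical names, and the cutoff of 5 for expected counts is a common rule of thumb rather than anything stated in the question.

```r
dd <- data.frame(Group = LETTERS[1:3], Cases = c(4, 9, 10),
                 Noncases = c(7, 4, 3))

# Same table builder as above, repeated so this block runs on its own
mktab <- function(focal, data) {
  matrix(c(data[data$Group == focal, "Cases"],
           sum(data[data$Group != focal, "Cases"]),
           data[data$Group == focal, "Noncases"],
           sum(data[data$Group != focal, "Noncases"])),
         nrow = 2, byrow = TRUE,
         dimnames = list(c("Case", "Noncase"),
                         c(focal, paste0("not_", focal))))
}

# Hypothetical helper: choose the test from the expected cell counts
test_group <- function(focal, data, min_expected = 5) {
  tab <- mktab(focal, data)
  # Expected counts under independence: (row total * column total) / grand total
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (all(expected >= min_expected)) {
    chisq.test(tab)
  } else {
    fisher.test(tab)
  }
}

# as.character() guards against Group being a factor in older R versions
groups <- as.character(dd$Group)
names(groups) <- groups

tabs <- lapply(groups, mktab, data = dd)          # one 2x2 table per group
results <- lapply(groups, test_group, data = dd)  # one test result per group
p_values <- sapply(results, function(res) res$p.value)
p_values
```

Because one test is run per group, the resulting p-values may need a multiple-comparison adjustment, for example with `p.adjust(p_values)`.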