I have a problem in R that I can't solve myself.
I have a data frame that looks like this, but with more variables and cases:
ID Var1 Var2 Var3 Var4
1 1 0 1 1
2 0 0 0 0
3 1 1 1 1
4 1 1 0 1
5 1 0 1 0
I would like to have, similar to a correlation matrix, a matrix that shows how often each pair of variables has the same value, for example the value "1". For the df above, the resulting matrix should look like this:
     Var1 Var2 Var3 Var4
Var1         2    3    3
Var2              1    2
Var3                   2
Var4
Perhaps you can help. Thank you in advance.
First create an evaluation matrix that tests for your value, here 1.
e <- d[-1] == 1 ## value to test
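The line above assumes the question's data is stored in a data frame called d. A minimal sketch of that setup, with the values copied from the question's table (the construction itself is an assumption about how the data is held):

```r
# Data frame rebuilt from the question's table (assumed layout)
d <- data.frame(
  ID   = 1:5,
  Var1 = c(1, 0, 1, 1, 1),
  Var2 = c(0, 0, 1, 1, 0),
  Var3 = c(1, 0, 1, 0, 1),
  Var4 = c(1, 0, 1, 1, 0)
)

# Drop the ID column and test every cell against the value of interest;
# comparing a data frame with a scalar yields a logical matrix
e <- d[-1] == 1
dim(e)
colnames(e)
```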
Then use outer to compare the columns crosswise with a FUNction that sums how often there are two TRUEs summing up to 2. From the result you apparently want to remove the lower.tri including the diagonal.
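# Count the rows in which columns i and j are both TRUE
FUN <- Vectorize(function(i, j) sum(e[, i] + e[, j] == 2))
(res <- t(outer(1:ncol(e), 1:ncol(e), FUN)))
# Blank out the lower triangle and the diagonal
res[lower.tri(res, diag=TRUE)] <- NA
res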
# [,1] [,2] [,3] [,4]
# [1,] NA 2 3 3
# [2,] NA NA 1 2
# [3,] NA NA NA 2
# [4,] NA NA NA NA
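As a possible alternative, not part of the approach above, the same counts can probably be obtained with a matrix product: for a 0/1 matrix, crossprod() computes t(e) %*% e, whose entry (i, j) is the number of rows where columns i and j are both 1. The sketch below rebuilds the question's data under the assumed name d and keeps the variable names as dimnames; res2 is a name introduced here for illustration.

```r
# Data frame rebuilt from the question's table (assumed layout)
d <- data.frame(
  ID   = 1:5,
  Var1 = c(1, 0, 1, 1, 1),
  Var2 = c(0, 0, 1, 1, 0),
  Var3 = c(1, 0, 1, 0, 1),
  Var4 = c(1, 0, 1, 1, 0)
)

# Logical test for the value of interest, converted to numeric 0/1
e <- (d[-1] == 1) + 0

# Entry (i, j) counts the rows where columns i and j are both 1
res2 <- crossprod(e)

# Blank out the lower triangle and the diagonal, as in the answer above
res2[lower.tri(res2, diag = TRUE)] <- NA
res2
```

Because crossprod() carries the column names over to both dimensions, the result is labelled Var1 to Var4 rather than [1,] to [4,].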