I have a data frame like this. The id of each line is unique, and the type defines the group of the id.
id type
a a1
b a1
c a2
d a3
e a4
f a4
I want to make a matrix like the one below. The value is 1 if the two ids belong to the same type, and 0 otherwise.
a b c d e f
a 1 1 0 0 0 0
b 1 1 0 0 0 0
c 0 0 1 0 0 0
d 0 0 0 1 0 0
e 0 0 0 0 1 1
f 0 0 0 0 1 1
The data frame is large (over 70 thousand lines), and I do not know how to do this efficiently in R. Any suggestions would be appreciated.
Here is a base R solution. You can use the following code:
M <- crossprod(t(table(df)))
or
M <- crossprod(table(rev(df)))
which gives
> M
id
id a b c d e f
a 1 1 0 0 0 0
b 1 1 0 0 0 0
c 0 0 1 0 0 0
d 0 0 0 1 0 0
e 0 0 0 0 1 1
f 0 0 0 0 1 1
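To see why this works: table(df) is a 0/1 indicator matrix with one row per id and one column per type, so multiplying it by its own transpose gives 1 exactly where two ids share a type. Below is a small sketch of that product written out explicitly. It uses an inline copy of the example data so it runs on its own; the names ex, X and M2 are only illustrative.

```r
# Inline copy of the example data (same values as the question)
ex <- data.frame(id   = c("a", "b", "c", "d", "e", "f"),
                 type = c("a1", "a1", "a2", "a3", "a4", "a4"))

# Indicator matrix: one row per id, one column per type
X <- table(ex)

# Entry (i, j) of X %*% t(X) is 1 when ids i and j have the same type
M2 <- X %*% t(X)
```

This is the same product that crossprod(t(table(df))) computes, just without the shortcut.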
DATA
df <- structure(list(id = c("a", "b", "c", "d", "e", "f"), type = c("a1",
"a1", "a2", "a3", "a4", "a4")), class = "data.frame", row.names = c(NA,
-6L))
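One caveat for the real data: with over 70 thousand ids, the dense result has at least 70,000 x 70,000 = 4.9 billion cells, which may not fit in memory. A possible alternative, sketched here under the assumption that the Matrix package is available (it is a recommended package in standard R installations), is to build the same indicator matrix in sparse form. The names ex, X and M_sparse are illustrative, and this is not part of the base R solution above.

```r
library(Matrix)  # assumption: Matrix is installed

# Inline copy of the example data so the snippet runs on its own
ex <- data.frame(id   = c("a", "b", "c", "d", "e", "f"),
                 type = c("a1", "a1", "a2", "a3", "a4", "a4"))

# Integer codes: row index per id, column index per type
f <- factor(ex$type)
i <- seq_len(nrow(ex))
j <- as.integer(f)

# Sparse id-by-type indicator matrix (only the 1s are stored)
X <- sparseMatrix(i = i, j = j, x = 1,
                  dims = c(nrow(ex), nlevels(f)),
                  dimnames = list(ex$id, levels(f)))

# Sparse id-by-id matrix: 1 where two ids share a type
M_sparse <- tcrossprod(X)
```

The sparse result only stores the nonzero cells, so its size grows with the sum of squared group sizes rather than with the square of the number of ids. If a few types contain most of the ids, it can still be large.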