I want to extend the solution from this post, where @AnandaMahto gave a very elegant solution to my problem.
For this new function, I'd like several occurrences of the same species in the same house to count as only one observation. One house with two cats and one rat does not create two observations between cat and rat, but only one (as shown below).
In this example, there are two rats in house number 4. As already said, I do not want to count two observations between rat and cat and between spider and rat, but only one observation between rat and cat and one observation between spider and rat.
houses = c(1,1,2,3,4,4,4,4,5,6,5)
animals = c('cat','dog','cat','dog','rat', 'cat', 'spider', 'rat', 'cat', 'cat', 'rat')
@AnandaMahto's solution would return this:
    dog rat spider
cat   1   3      1
dog       0      0
rat              2
But I would like to get this:
    dog rat spider
cat   1   2      1
dog       0      0
rat              1
Make all values > 0 from table equal to 1 before using crossprod:
(table(houses, animals) > 0) *1
#        animals
# houses cat dog rat spider
#   1      1   1   0      0
#   2      1   0   0      0
#   3      0   1   0      0
#   4      1   0   1      1
#   5      1   0   1      0
#   6      1   0   0      0
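As a cross-check, the same 0/1 presence table can also be built by dropping repeated house/animal pairs before tabulating. This is only a sketch of an alternative route, not part of the approach above; the names pairs and presence are made up for illustration.

```r
houses  <- c(1, 1, 2, 3, 4, 4, 4, 4, 5, 6, 5)
animals <- c('cat', 'dog', 'cat', 'dog', 'rat', 'cat', 'spider', 'rat', 'cat', 'cat', 'rat')

# Keep each (house, animal) combination only once (hypothetical helper name)
pairs <- unique(data.frame(houses, animals))

# Every remaining pair is unique, so all counts are 0 or 1
presence <- table(pairs)

# Same values as the thresholded table
all(presence == (table(houses, animals) > 0) * 1)
```

Thresholding the table is shorter; deduplicating first may read more naturally when the data already live in a data frame.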
out <- crossprod((table(houses, animals) > 0) *1)
out[lower.tri(out, diag=TRUE)] <- NA
as.table(out)
#          animals
# animals  cat dog rat spider
#   cat          1   2      1
#   dog              0      0
#   rat                     1
#   spider
To get to the desired output, since we know that the first column and the last row will be empty, and since you already figured out on your own that as.table takes care of not printing the NA values, continuing from above, you can do:
out <- as.table(out[-nrow(out), -1])
out
#         animals
# animals dog rat spider
#   cat     1   2      1
#   dog         0      0
#   rat                1
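Since the question asks for a function, the steps above could be wrapped up roughly as follows. This is a minimal sketch: the name count_cooccurrence is hypothetical, and it assumes at least two distinct species in the input. The drop = FALSE argument is an addition that keeps the result a matrix when only two species are present.

```r
# Hypothetical wrapper around the steps shown above
count_cooccurrence <- function(houses, animals) {
  # 1 if the species occurs in the house at all, 0 otherwise
  presence <- (table(houses, animals) > 0) * 1
  # Number of houses shared by each pair of species
  out <- crossprod(presence)
  # Blank out the diagonal and the mirrored lower triangle
  out[lower.tri(out, diag = TRUE)] <- NA
  # Drop the all-NA last row and first column
  as.table(out[-nrow(out), -1, drop = FALSE])
}

houses  <- c(1, 1, 2, 3, 4, 4, 4, 4, 5, 6, 5)
animals <- c('cat', 'dog', 'cat', 'dog', 'rat', 'cat', 'spider', 'rat', 'cat', 'cat', 'rat')
res <- count_cooccurrence(houses, animals)
res
```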