I am working on a Q-matrix problem and need to calculate Hamming distances, but I am having trouble comparing different rows (as vectors) of data frame A to the same row of data frame B. Specifically, df2 is a 6x3 data frame and df1 is a 3x3 data frame.
df1 <- data.frame(Q1= c(1,0,1), Q2= c(0,0,0), Q3= c(1,1,0))
df2 <- data.frame(A =c(1,1,0,1,1,0), B =c(0,0,0,1,1,0), C =c(0,1,0,1,1,1))
I need to compare all 6 rows of df2, as vectors, with each row of df1 in turn and calculate the Hamming distance over all 6 vector-to-vector pairs, with each row of df1 yielding a single value (so the output should be only 3 Hamming distance values). For instance, the 1st row of df1, which is (1,0,1), should be paired with all 6 rows of df2, namely (1,0,0), (1,0,1), (0,0,0), (1,1,1), (1,1,1), (0,0,1), and produce 1 Hamming distance value. Applying the same logic to the other two rows should give 3 Hamming distance values in total.
I have tried a few times with a double for() loop, but with no success. In fact, I don't even know how to extract a row from a data frame as a vector that can be used in calculations with another vector. Can anyone please help?
This should work:
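for (i in seq_len(nrow(df1))) {
  # Compare row i of df1 (flattened to a vector) with every row of df2 and count all mismatches
  print(sum(apply(df2, 1, function(x) x != unlist(df1[i, ]))))
}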
[1] 6
[1] 8
[1] 8
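On the side question of getting a row out as a vector: a single row of a data frame is still a data frame, and unlist() is one way to flatten it. Below is a minimal sketch of a loop-free variant, assuming the data are converted to matrices first; hamming_total is a hypothetical helper name used only for illustration, not part of base R.

```r
df1 <- data.frame(Q1 = c(1, 0, 1), Q2 = c(0, 0, 0), Q3 = c(1, 1, 0))
df2 <- data.frame(A = c(1, 1, 0, 1, 1, 0), B = c(0, 0, 0, 1, 1, 0), C = c(0, 1, 0, 1, 1, 1))

# df1[1, ] is a one-row data frame; unlist() turns it into a plain numeric vector.
row1 <- unlist(df1[1, ], use.names = FALSE)

# Matrices are easier to compare element-wise than data frames.
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# Per-pair distances for the first row of df1: each column of t(m2) is one
# row of df2, and row1 is recycled down every column.
per_pair <- colSums(t(m2) != row1)
print(per_pair)  # 1 0 2 1 1 1

# Hypothetical helper: total number of mismatches between v and all rows of m.
hamming_total <- function(v, m) {
  sum(t(m) != v)
}

# One total per row of df1.
totals <- apply(m1, 1, hamming_total, m = m2)
print(totals)  # 6 8 8
```

The totals match the loop output above. per_pair shows the six individual distances, in case those are needed before summing.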