r, loops, matrix, matrix-multiplication, hamming-distance

Loop over different rows of dataframe A and apply them to the same individual rows in dataframe B


I am working on a Q-matrix problem and need to calculate Hamming distances, but I am having trouble comparing different rows (as vectors) from dataframe A against the same row in dataframe B.

Specifically, df2 is a 6x3 dataframe and df1 is a 3x3 dataframe.

    df1 <- data.frame(Q1= c(1,0,1), Q2= c(0,0,0), Q3= c(1,1,0))
    df2 <- data.frame(A =c(1,1,0,1,1,0), B =c(0,0,0,1,1,0), C =c(0,1,0,1,1,1))

I need to compare all 6 rows of df2, as vectors, against each row of df1 in turn, in order to calculate the Hamming distance over all 6 vector-to-vector pairs (it should output 3 Hamming distance values only). For instance, the 1st row of df1, which is (1,0,1), should be paired with all 6 rows of df2, namely (1,0,0), (1,0,1), (0,0,0), (1,1,1), (1,1,1), (0,0,1), and produce 1 Hamming distance value. Following the same logic for the other two rows, the whole process should give 3 Hamming distance values.

I have tried a double for() loop a few times, but with no success. In fact, I don't even know how to extract a row from a dataframe as a vector that can be used in calculations with other vectors. Can anyone please help?


Solution

  • This should work:

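    # For each row of df1, count mismatches against every row of df2 and sum them
    for (i in seq_len(nrow(df1))) {
      target <- unlist(df1[i, ])  # row i of df1 as a plain vector
      print(sum(apply(df2, 1, function(x) x != target)))
    }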
    
    [1] 6
    [1] 8
    [1] 8
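
As a possible alternative to the loop, here is a minimal sketch that converts both dataframes to matrices first, so that a row is an ordinary vector. It assumes that the value wanted for each row of df1 is the sum of the six pairwise Hamming distances, as in the loop above. The names `m1`, `m2`, `row1`, `pair_dists` and `total_hamming` are illustrative only and do not come from the question.

```r
df1 <- data.frame(Q1 = c(1, 0, 1), Q2 = c(0, 0, 0), Q3 = c(1, 1, 0))
df2 <- data.frame(A = c(1, 1, 0, 1, 1, 0), B = c(0, 0, 0, 1, 1, 0), C = c(0, 1, 0, 1, 1, 1))

# Plain numeric matrices: indexing a row now returns a vector, not a dataframe
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# One row of df1 as an unnamed vector: 1 0 1
row1 <- unname(m1[1, ])

# t(m2) is 3 x 6, so each column is one row of df2. Comparing it with a
# length-3 vector recycles that vector down every column, and colSums()
# then counts the mismatches per pair.
# pair_dists is 6 x 3: entry [j, i] is the distance between df2 row j and df1 row i.
pair_dists <- apply(m1, 1, function(r) colSums(t(m2) != r))

# One summed value per row of df1
total_hamming <- colSums(pair_dists)
print(total_hamming)
```

With the data from the question this should reproduce the totals 6, 8 and 8, and `pair_dists` keeps the individual distances in case the per-pair values are needed rather than their sum.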