How to use grep or any other method to compare two data frames with different numbers of rows and get the matches and mismatches?


Below are my two data frames:

ABData1 <- data.frame(id=c(11,12,13,14,15),
                      a = c(1,2,3,4,5))
ABData2 <- data.frame(id=c(11,12,13,14),
                      b = c(1,4,3,4))

How can I compare these two data frames to find the matching and mismatching rows?

The comparison should be row-wise: if a in the 1st row of ABData1 equals b in the 1st row of ABData2, show the row as a match, otherwise show it as a mismatch; then move on to the 2nd row, and so on.

I have tried the code below, which works fine for a single data frame, but it throws an error when the two data frames have different numbers of rows.

ABData <- data.frame(a = c(1,2,2,1,1),
                     b = c(1,2,1,1,2))

match <- ABData %>% rowwise() %>% filter(grepl(a, b, fixed = TRUE))

mismatch <- ABData %>% rowwise() %>% filter(!grepl(a, b))

I am expecting the output below.

Expected match Output:

id    a     expected    b
11    1     1           1
13    3     3           3
14    4     4           4

Expected mismatch output:

id    a     expected    b
12    2     2           4
15    NA    NA          5

Thanks in advance.


Solution

  • You can pad the shorter column with NA so that both have the same length:

    ABData1 <- data.frame(a = c(1,2,3,4,5))
    ABData2 <- data.frame(b = c(1,4,3,4))
    
    # Pad the shorter of the two vectors with NA so both have equal length
    equLength <- function(x, y) {
      if (length(x) > length(y)) length(y) <- length(x) else length(x) <- length(y)
      data.frame(a = x, b = y)
    }
    
    ABData <- equLength(ABData1$a, ABData2$b)
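
    The padding comes from assigning to `length()`: growing a vector this way fills the new positions with `NA`. A minimal illustration (the vector name `b` is used only for demonstration):

    ```r
    b <- c(1, 4, 3, 4)
    length(b) <- 5   # extend to five elements; the new slot is filled with NA
    print(b)
    ```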
    

    ... and then use your working code for a single data frame:

    library("dplyr")
    # grepl() returns FALSE when b is NA, so a padded row lands in the mismatch result
    resultMatch <- ABData %>% rowwise() %>% filter(grepl(a, b, fixed = TRUE))
    resultMismatch <- ABData %>% rowwise() %>% filter(!grepl(a, b, fixed = TRUE))
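
    One caveat: `grepl()` looks for a substring rather than testing equality, so it can report a match for values that differ. The following is only a sketch of a direct comparison with `==`, assuming the padded `ABData` from above (rebuilt here so the block runs by itself); `substring_hit`, `equal_hit` and `is_equal` are illustrative names:

    ```r
    # grepl() treats its first argument as a pattern to find inside the second
    substring_hit <- grepl(1, 11, fixed = TRUE)   # TRUE: "1" occurs in "11"
    equal_hit <- 1 == 11                          # FALSE

    ABData <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 4, 3, 4, NA))

    # Treat a missing value on either side as a mismatch
    is_equal <- !is.na(ABData$a) & !is.na(ABData$b) & ABData$a == ABData$b
    resultMatch <- ABData[is_equal, ]
    resultMismatch <- ABData[!is_equal, ]
    ```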
    

    For the extended question:

    library("dplyr")
    
    ABData1 <- data.frame(id=c(11,12,13,14,15),  a = c(1,2,3,4,5))
    ABData2 <- data.frame(id=c(11,12,13,14),  b = c(1,4,3,4))
    
    equLength <- function(x, y) {
      if (length(x)>length(y)) length(y) <- length(x) else length(x) <- length(y)
      data.frame(a=x, b=y)
    }
    
    # Append the padded column to whichever data frame has more rows
    if (nrow(ABData1) > nrow(ABData2)) {
      ABData <- data.frame(ABData1, b = equLength(ABData1$a, ABData2$b)$b)
    } else {
      ABData <- data.frame(ABData2, a = equLength(ABData1$a, ABData2$b)$a)
    }

    resultMatch <- ABData %>% rowwise() %>% filter(grepl(a, b, fixed = TRUE))
    resultMismatch <- ABData %>% rowwise() %>% filter(!grepl(a, b, fixed = TRUE))
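
    The code above pairs rows by position. If the rows should instead be paired by `id`, a possible alternative (not part of the approach above, just a sketch using base R's `merge()`) is a full outer join followed by an equality check; `is_equal` is an illustrative name:

    ```r
    ABData1 <- data.frame(id = c(11, 12, 13, 14, 15), a = c(1, 2, 3, 4, 5))
    ABData2 <- data.frame(id = c(11, 12, 13, 14), b = c(1, 4, 3, 4))

    # Full outer join on id: an id missing from one side gets NA in that column
    ABData <- merge(ABData1, ABData2, by = "id", all = TRUE)

    # A row matches only if both values are present and equal
    is_equal <- !is.na(ABData$a) & !is.na(ABData$b) & ABData$a == ABData$b
    resultMatch <- ABData[is_equal, ]
    resultMismatch <- ABData[!is_equal, ]
    ```

    With the sample data, ids 11, 13 and 14 end up in `resultMatch`, and ids 12 and 15 in `resultMismatch`.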