r, set-difference

Find the difference between two column elements in R


How can I put the elements that differ between factor_Nov and factor_Jan into a new column called diff?

    df <- data.frame(id = c("1", "2", "3"),
                     factor_Nov = c("A|B|C", "E", "F|H|G"),
                     factor_Jan = c("B|H|E", "E", "X|Y|Z"))

The output should be:

    df <- data.frame(id = c("1", "2", "3"),
                     factor_Nov = c("A|B|C", "E", "F|H|G"),
                     factor_Jan = c("B|H|E", "E", "X|Y|Z"),
                     diff = c("A|C|H|E", NA, "X|Y|Z|F|H|G"))

I tried setdiff, but that wasn't working.


Solution

  • One option is to split both columns on the | delimiter with strsplit, then use Map to keep, row by row, the elements that are not in the intersection of the two sets (the symmetric difference), and paste them back together with collapse = "|".

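    # Split each column on the literal "|" (fixed = TRUE disables regex),
    # then take the symmetric difference of each pair of element sets
    df$diff <- unlist(Map(
      function(x, y) paste(setdiff(union(x, y), intersect(x, y)), collapse = "|"),
      strsplit(as.character(df$factor_Nov), "|", fixed = TRUE),
      strsplit(as.character(df$factor_Jan), "|", fixed = TRUE)))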
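
One thing to watch with this approach: when the two sets are identical, as in row 2, paste on an empty vector returns "" rather than the NA shown in the desired output. A minimal sketch of one way to handle that is below. The helper name sym_diff is hypothetical (it is not a base R function), and stringsAsFactors = FALSE is set only so the example does not depend on the R version.

```r
# Rebuild the example data from the question
df <- data.frame(id = c("1", "2", "3"),
                 factor_Nov = c("A|B|C", "E", "F|H|G"),
                 factor_Jan = c("B|H|E", "E", "X|Y|Z"),
                 stringsAsFactors = FALSE)

# sym_diff is a hypothetical helper name, not part of base R.
# It returns the elements found in exactly one of x and y,
# collapsed with "|", or NA when the two sets are identical.
sym_diff <- function(x, y) {
  d <- setdiff(union(x, y), intersect(x, y))
  if (length(d) == 0) NA_character_ else paste(d, collapse = "|")
}

# Split each column on the literal "|" and apply the helper row by row
df$diff <- unlist(Map(sym_diff,
                      strsplit(df$factor_Nov, "|", fixed = TRUE),
                      strsplit(df$factor_Jan, "|", fixed = TRUE)))

print(df)
```

The elements come out in the order produced by union(x, y), so factor_Nov elements are listed before factor_Jan elements. Row 3 therefore reads "F|H|G|X|Y|Z", which contains the same elements as the question's "X|Y|Z|F|H|G" but in a different order.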