
How to remove common elements in a column across 5 data frames in R


I have 5 data frames:

a <- data.frame(ID = c("1", "2", "3", "4", "5"), peak = c("peak1", "peak2", "peak3", "peak4", "peak10"))
b <- data.frame(ID = c("1", "2", "3", "4"), peak = c("peak1","peak3", "peak20", "peak21"))
c <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak5", "peak3"))
d <- data.frame(ID = c("1", "2", "3", "4", "5", "6"),peak = c("peak1", "peak3", "peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("1", "2", "3"), peak = c("peak1", "peak3",  "peak9"))

I would like to remove the peaks that are common to all of the data frames. The desired output is:

a <- data.frame(ID = c("2", "4", "5"), peak = c("peak2", "peak4", "peak10"))
b <- data.frame(ID = c("3", "4"), peak = c("peak20", "peak21"))
c <- data.frame(ID = c("2"), peak = c("peak5"))
d <- data.frame(ID = c("3", "4", "5", "6"), peak = c("peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("3"), peak = c("peak9"))

I know how to compare two data frames with a[!(a$peak %in% b$peak), ], but I'm struggling to extend this to 5.


Solution

  • Put the data frames in a list, find the peaks shared by all of them, and filter each data frame:

    #Put the data in a list
    list_df <- dplyr::lst(a, b, c, d, e)
    #Get the common peak value
    common_peak <- Reduce(intersect, lapply(list_df, `[[`, 'peak'))
    common_peak
    #[1] "peak1" "peak3"
    
    #Remove the common peak value from all the dataframes
    result <- lapply(list_df, function(x) subset(x, !peak %in% common_peak))
    result
    
    #$a
    #  ID   peak
    #2  2  peak2
    #4  4  peak4
    #5  5 peak10
    
    #$b
    #  ID   peak
    #3  3 peak20
    #4  4 peak21
    
    #$c
    #  ID  peak
    #2  2 peak5
    
    #$d
    #  ID   peak
    #3  3  peak7
    #4  4  peak8
    #5  5 peak11
    #6  6 peak12
    
    #$e
    #  ID  peak
    #3  3 peak9
    
    #Update all the individual dataframes
    list2env(result, .GlobalEnv)
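
If you would rather not depend on dplyr, the same idea can be sketched in base R. This is only a minimal variant of the approach above, assuming the data frames from the question; the one substitution is a plain named list() in place of dplyr::lst(), with the names written out by hand.

```r
# Sample data copied from the question
a <- data.frame(ID = c("1", "2", "3", "4", "5"),
                peak = c("peak1", "peak2", "peak3", "peak4", "peak10"))
b <- data.frame(ID = c("1", "2", "3", "4"),
                peak = c("peak1", "peak3", "peak20", "peak21"))
c <- data.frame(ID = c("1", "2", "3"),
                peak = c("peak1", "peak5", "peak3"))
d <- data.frame(ID = c("1", "2", "3", "4", "5", "6"),
                peak = c("peak1", "peak3", "peak7", "peak8", "peak11", "peak12"))
e <- data.frame(ID = c("1", "2", "3"),
                peak = c("peak1", "peak3", "peak9"))

# Named list built by hand (stand-in for dplyr::lst, which names elements automatically)
list_df <- list(a = a, b = b, c = c, d = d, e = e)

# Peaks that occur in every data frame: intersect the peak columns pairwise
common_peak <- Reduce(intersect, lapply(list_df, `[[`, "peak"))
common_peak
# [1] "peak1" "peak3"

# Drop the rows with a shared peak; drop = FALSE keeps each result a data frame
result <- lapply(list_df, function(x) x[!x$peak %in% common_peak, , drop = FALSE])
```

As in the answer above, list2env(result, .GlobalEnv) would then overwrite a, b, c, d and e with the filtered versions, so skip that step if you still need the originals.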