Search code examples
rmultiple-columnssummarize

How to combine scattered values into one row using R


I have a dataset like this and i want the desired dataset as below.

    dat <- read.table(text="Id  Bug    Drug1 Drug2 
A   Staph     NA  S
A   Staph  S  NA
A   E.coli    NA  S
A   E.coli  S  NA", header=TRUE)


dat.desired <- read.table(text="Id  Bug    Drug1 Drug2 
A   Staph     S  S
A   E.coli    S  S", header=TRUE)

Solution

  • The following should work using the dplyr library:

    library(dplyr)
    
    dat %>% 
      group_by(Id, Bug) %>% 
      summarise(across(Drug1:Drug2, ~.x[!is.na(.x)][1]))
    

    Output:

    #  Id    Bug    Drug1 Drug2
    # <chr> <chr>  <chr> <chr>
    # 1 A     E.coli S     S    
    # 2 A     Staph  S     S