Search code examples
rdataframeconsolidation

How do I consolidate a dataframe with duplicate names in one column with different results in another column in R?


I have a large R data.frame where I have duplicate chemical names with different "Result" and "Use" values in a second column. I want to consolidate these so I have a single row per chemical, and all "Result" and "Use" values are comma separated in a single cell in each of their respective columns.

example df:

df1 <- data.frame(Name=c("reservatol","reservatol","reservatol","DPG"),
                  Result = c("naturally occurring", "antagonist", "synthetic", "rubber"),
                  Use = c("Pharma", "Pharma", "Drugs and Medication", "Tires"))
  Name              Result                     Use
1 reservatol        naturally occurring        Pharma
2 reservatol        antagonist                 Pharma
3 reservatol        synthetic                  Drugs and Medication
4 DPG               rubber                     Tires

Is there a way to make that look like this?

  Name           Result                                         Use
1 reservatol     naturally occurring, antagonist, synthetic     Pharma, Drugs and Medication
2 DPG            rubber                                         Tires

I tried using group_by("Name") %>% mutate(Use = paste0(Result, collapse=",") in hopes of concatenating Uses, but it didn't seem to do anything.


Solution

  • You were almost there - use summarise instead of mutate:

    library(dplyr)
    
    df1 %>%
      group_by(Name) %>%
      summarise(Result = paste(Result, collapse = ", "), Use = paste(Use, collapse = ", "))