
R dplyr: pivot wider with duplicate ids and generated variable names


How can I go from

    df <- data.frame(id = c("A", "B", "B"), res = c("one", "two", "three"))
    df

to

    df.output <- data.frame(id = c("A", "B"), res1 = c("one", "two"), res2 = c(NA, "three"))
    df.output

with dplyr?

I do not know a priori the number of duplicates in id (in this example B has 2 occurrences), so the number of resX variables in the output data frame has to be generated on the fly.


Solution

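  • The key step is to number the rows within each id with dplyr (group_by() plus row_number()), then let tidyr::pivot_wider() spread those numbers into columns. Because the column names are built from that counter, the resX variables are generated on the fly, one per occurrence in the largest group.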

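    library(dplyr)
    library(tidyr)

    df %>%
      group_by(id) %>%
      # number the rows within each id: 1, 2, ...
      mutate(
        no = row_number()
      ) %>%
      ungroup() %>%
      pivot_wider(
        # name id_cols explicitly; recent tidyr versions warn when it is passed by position
        id_cols = id,
        names_from = no,       # one output column per within-id row number
        names_prefix = "res",  # gives res1, res2, ...
        values_from = res
      )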
    #> # A tibble: 2 × 3
    #>   id    res1  res2 
    #>   <chr> <chr> <chr>
    #> 1 A     one   <NA> 
    #> 2 B     two   three
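
To check that the number of resX columns really follows the data, here is a small sketch with a made-up input (`df2` and `wide` are names invented for this illustration, not part of the question) in which B occurs three times. The same pipeline should then produce three columns without any change to the code.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: "B" now occurs three times, "C" once
df2 <- data.frame(
  id  = c("A", "B", "B", "B", "C"),
  res = c("one", "two", "three", "four", "five")
)

wide <- df2 %>%
  group_by(id) %>%
  mutate(no = row_number()) %>%  # 1, 2, 3, ... within each id
  ungroup() %>%
  pivot_wider(
    id_cols = id,
    names_from = no,
    names_prefix = "res",
    values_from = res
  )

names(wide)
#> [1] "id"   "res1" "res2" "res3"
```

Ids with fewer occurrences than the largest group (A and C here) get NA in the columns they do not fill.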