Search code examples
rdataframereshapetidyrspread

spread() where non-unique values are put into a new column


I have some data that looks like this (code for input at the end):

#>           artist          album year  source                     id
#> 1        Beatles  Sgt. Pepper's 1967  amazon             B0025KVLTM
#> 2        Beatles  Sgt. Pepper's 1967 spotify 6QaVfG1pHYl1z15ZxkvVDW
#> 3        Beatles  Sgt. Pepper's 1967  amazon             B06WGVMLJY
#> 4 Rolling Stones Sticky Fingers 1971 spotify 29m6DinzdaD0OPqWKGyMdz

I would like to fix the 'id' column (which includes ids from multiple sources, as shown in the 'source' column.

This should be a straightfoward spread(), but the complication is that sometimes we have a duplicate id from the exact same source: see row 1 and row 3 above.

Is there an easy way to do the spread() and put the duplicate id in a new column?

My desired outcome would be:


#>           artist          album year  source  amazon_id amazon_id_2
#> 1        Beatles  Sgt. Pepper's 1967  amazon B0025KVLTM  B06WGVMLJY
#> 2 Rolling Stones Sticky Fingers 1971 spotify       <NA>        <NA>
#>                  spotify
#> 1 6QaVfG1pHYl1z15ZxkvVDW
#> 2 29m6DinzdaD0OPqWKGyMdz

The code below is to input the sample data:

df <- data.frame(stringsAsFactors=FALSE,
      artist = c("Beatles", "Beatles", "Beatles", "Rolling Stones"),
       album = c("Sgt. Pepper's", "Sgt. Pepper's", "Sgt. Pepper's",
                 "Sticky Fingers"),
        year = c(1967, 1967, 1967, 1971),
      source = c("amazon", "spotify", "amazon", "spotify"),
          id = c("B0025KVLTM", "6QaVfG1pHYl1z15ZxkvVDW", "B06WGVMLJY",
                 "29m6DinzdaD0OPqWKGyMdz")
)
df

Solution

  • Here is one approach.

    df %>% 
      group_by(artist,source) %>%  
      mutate(rownum = row_number()) %>% 
      unite(source, source, rownum, sep="_") %>% 
      spread(source,id)
    
    # A tibble: 2 x 6
    # Groups:   artist [2]
      artist         album           year amazon_1   amazon_2   spotify_1             
      <chr>          <chr>          <dbl> <chr>      <chr>      <chr>                 
    1 Beatles        Sgt. Pepper's   1967 B0025KVLTM B06WGVMLJY 6QaVfG1pHYl1z15ZxkvVDW
    2 Rolling Stones Sticky Fingers  1971 NA         NA         29m6DinzdaD0OPqWKGyMdz