Spreading data that is grouped by ID but having different observations

I have this data:

drugData <- data.frame(caseID=c(9, 9, 10, 11, 12, 12, 12, 12, 13, 45, 45, 225),
            Drug=c("Cocaine", "Cocaine", "DPT", "LSD", "Cocaine", "LSD", "Heroin","Heroin", "LSD", "DPT", "DPT", "Heroin"),
             County=c("A", "A", "B", "C", "D", "D", "D","D", "E", "F", "F", "G"),
             Date=c(2009, 2009, 2009, 2009, 2011, 2011, 2011, 2011, 2010, 2010, 2010, 2005))

"CaseID" rows make up a single case, which may have observations of all the same drug, or different types of drugs. I want this data to look like the following:

CaseID  Drug.1     Drug.2    Drug. 3   Drug.4    County   Date
9       Cocaine    Cocaine   NA        NA        A        2009
10      DPT        LSD       NA        NA        B        2009
11      LSD        NA        NA        NA        C        2009
12      Cocaine    LSD       Heroin    Heroin    D        2011
13      LSD        NA        NA        NA        E        2010
45      DPT        DPT       NA        NA        F        2010
225     Heroin     NA        NA        NA        G        2005

I've tried using dplyr spread function but can't seem to quite get this to work.

Solution

We can pivot to wide format after creating a sequence column based on 'caseID'

library(dplyr)
library(tidyr)
library(stringr)
library(data.table)
drugData %>%
   mutate(nm = str_c('Drug', rowid(caseID))) %>% 
   pivot_wider(names_from = nm, values_from = Drug) 
#A tibble: 7 x 7
#  caseID County  Date Drug1   Drug2   Drug3  Drug4 
#   <dbl> <fct>  <dbl> <fct>   <fct>   <fct>  <fct> 
#1      9 A       2009 Cocaine Cocaine <NA>   <NA>  
#2     10 B       2009 DPT     <NA>    <NA>   <NA>  
#3     11 C       2009 LSD     <NA>    <NA>   <NA>  
#4     12 D       2011 Cocaine LSD     Heroin Heroin
#5     13 E       2010 LSD     <NA>    <NA>   <NA>  
#6     45 F       2010 DPT     DPT     <NA>   <NA>  
#7    225 G       2005 Heroin  <NA>    <NA>   <NA>

Or with spread (spread is deprecated in place of pivot_wider

drugData %>%
   mutate(nm = str_c('Drug', rowid(caseID))) %>% 
   spread(nm, Drug)

Or using data.table

dcast(setDT(drugData), caseID + County  + Date ~
         paste0('Drug', rowid(caseID)), value.var = 'Drug')
#   caseID County Date   Drug1   Drug2  Drug3  Drug4
#1:      9      A 2009 Cocaine Cocaine   <NA>   <NA>
#2:     10      B 2009     DPT    <NA>   <NA>   <NA>
#3:     11      C 2009     LSD    <NA>   <NA>   <NA>
#4:     12      D 2011 Cocaine     LSD Heroin Heroin
#5:     13      E 2010     LSD    <NA>   <NA>   <NA>
#6:     45      F 2010     DPT     DPT   <NA>   <NA>
#7:    225      G 2005  Heroin    <NA>   <NA>   <NA>