How to use reshape2's dcast to select only one of several observations(values)

I have the following dataset

> dataset2
   ID ATCcode       date
1   1   N06AA 2001-01-01
2   1   N06AB 2001-04-01
3   1   N06AB 2001-03-01
4   1   N06AB 2001-02-01
5   1   N06AC 2001-01-01
6   2   N06AA 2001-01-01
7   2   N06AA 2001-02-01
8   2   N06AA 2001-03-01
9   3   N06AB 2001-01-01
10  4   N06AA 2001-02-01
11  4   N06AB 2001-03-01

It's in long format and I'd like it to be in wide format. However, I only want the earliest date for each ATCcode - and not any of the later dates. Thus I'd like to end up here:

> datasetLong
  ID      N06AA      N06AB      N06AC
1  1 2001-01-01 2001-02-01 2001-01-01
2  2 2001-01-01       <NA>       <NA>
3  3       <NA> 2001-01-01       <NA>
4  4 2001-02-01 2001-03-01       <NA>

(This is just a sample of the real dataset, it has more variation in the values and more observations than this).

I've managed to cast the dataset somewhat, but not in the manner which I want to:

dataset3 <- reshape2::dcast(dataset2, ID ~ ATCcode)

gives me the length of each vector/list:

> dataset3
  ID N06AA N06AB N06AC
1  1     1     3     1
2  2     3     0     0
3  3     0     1     0
4  4     1     1     0

Instead of the length, I'd like just one value, and that value should be the smallest value (or, the earliest date).

I've found a similar question asked on StackOverflow, but I was unable to use that in any way without getting various errors. I have not used melt in the above attempt, is that maybe necessary? Any help is appreciated.

Solution

This answer uses tidyverse methods.

One way would be would be to select minimum date from each ID and ATCcode and convert the data to wide format.

library(dplyr)

df %>%
  mutate(date = as.Date(date)) %>%
  group_by(ID, ATCcode) %>%
  slice(which.min(date)) %>%
  tidyr::pivot_wider(names_from = ATCcode, values_from = date)

#     ID N06AA      N06AB      N06AC     
#  <int> <date>     <date>     <date>    
#1     1 2001-01-01 2001-02-01 2001-01-01
#2     2 2001-01-01 NA         NA        
#3     3 NA         2001-01-01 NA        
#4     4 2001-02-01 2001-03-01 NA

data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 
4L), ATCcode = structure(c(1L, 2L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 
1L, 2L), .Label = c("N06AA", "N06AB", "N06AC"), class = "factor"), 
date = structure(c(1L, 4L, 3L, 2L, 1L, 1L, 2L, 3L, 1L, 2L, 
3L), .Label = c("2001-01-01", "2001-02-01", "2001-03-01", 
"2001-04-01"), class = "factor")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))