
Recode dates to study day within subject


I have data in which subjects completed multiple ratings per day over 6-7 days. The number of ratings per day varies. The data set includes subject ID, date, and the ratings. I would like to create a new variable that recodes the dates for each subject into a "study day": 1 for the first day of ratings, 2 for the second day of ratings, etc.

For example, I would like to take this:

id  Date    Rating
1   10/20/2018  2
1   10/20/2018  3
1   10/20/2018  5
1   10/21/2018  1
1   10/21/2018  7
1   10/21/2018  9
1   10/22/2018  4
1   10/22/2018  5
1   10/22/2018  9
2   11/15/2018  1
2   11/15/2018  3
2   11/15/2018  4
2   11/16/2018  3
2   11/16/2018  1
2   11/17/2018  0
2   11/17/2018  2
2   11/17/2018  9

And end up with this:

id  Day Date    Rating
1   1   10/20/2018  2
1   1   10/20/2018  3
1   1   10/20/2018  5
1   2   10/21/2018  1
1   2   10/21/2018  7
1   2   10/21/2018  9
1   3   10/22/2018  4
1   3   10/22/2018  5
1   3   10/22/2018  9
2   1   11/15/2018  1
2   1   11/15/2018  3
2   1   11/15/2018  4
2   2   11/16/2018  3
2   2   11/16/2018  1
2   3   11/17/2018  0
2   3   11/17/2018  2
2   3   11/17/2018  9

I was going to look into setting up some kind of loop, but I thought it would be worth asking if there is a more efficient way to pull this off. Are there any functions that would allow me to automate this sort of thing? Thanks very much for any suggestions.


Solution

  • Since you want the count to reset for each id, this question is a bit different.

    Using only base R, we can split Date by id and then, within each id, number the distinct dates in order of first appearance.

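    # match(x, unique(x)) gives each date its position among the distinct dates of that id.
    # lapply() always returns a list (sapply() may simplify to a matrix when all ids have
    # the same number of rows). This assumes the rows of df are sorted by id, as in the example.
    df$Day <- unlist(lapply(split(df$Date, df$id), function(x) match(x, unique(x))), use.names = FALSE)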
    
    
    df
    #   id       Date Rating Day
    #1   1 10/20/2018      2   1
    #2   1 10/20/2018      3   1
    #3   1 10/20/2018      5   1
    #4   1 10/21/2018      1   2
    #5   1 10/21/2018      7   2
    #6   1 10/21/2018      9   2
    #7   1 10/22/2018      4   3
    #8   1 10/22/2018      5   3
    #9   1 10/22/2018      9   3
    #10  2 11/15/2018      1   1
    #11  2 11/15/2018      3   1
    #12  2 11/15/2018      4   1
    #13  2 11/16/2018      3   2
    #14  2 11/16/2018      1   2
    #15  2 11/17/2018      0   3
    #16  2 11/17/2018      2   3
    #17  2 11/17/2018      9   3
    

    I don't know how I missed this, but thanks to @thelatemail, who pointed out that this is basically the same as

    library(dplyr)
    df %>%
      group_by(id) %>%
      mutate(Day = match(Date, unique(Date)))
    

    and

    df$Day <- as.numeric(with(df, ave(Date, id, FUN = function(x) match(x, unique(x)))))
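
The approaches above number the dates in order of first appearance, which matches chronological order only if each subject's rows are already sorted by date. If that cannot be guaranteed, a minimal sketch is to parse the dates first and rank the distinct dates within each id. It assumes the column names from the question and a month/day/year format; `date_num` is just an illustrative helper name.

```r
# Sample data from the question.
df <- data.frame(
  id = c(rep(1, 9), rep(2, 8)),
  Date = c(rep(c("10/20/2018", "10/21/2018", "10/22/2018"), each = 3),
           rep("11/15/2018", 3), rep("11/16/2018", 2), rep("11/17/2018", 3)),
  Rating = c(2, 3, 5, 1, 7, 9, 4, 5, 9, 1, 3, 4, 3, 1, 0, 2, 9),
  stringsAsFactors = FALSE
)

# Parse the strings (assumed to be month/day/year) so that dates compare
# chronologically rather than as text; as.integer() gives days since 1970-01-01.
date_num <- as.integer(as.Date(df$Date, format = "%m/%d/%Y"))

# Within each id, give every date its position among that id's sorted distinct dates.
# ave() returns the result in the original row order, so rows need not be sorted.
df$Day <- ave(date_num, df$id, FUN = function(x) match(x, sort(unique(x))))

df
```

Like the versions above, this counts only days on which ratings exist, so a skipped calendar day does not leave a gap in the numbering. If elapsed calendar days are wanted instead, `date_num - ave(date_num, df$id, FUN = min) + 1L` would be one way to get them.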