I am trying to scale data from multiple subjects onto the same time-scale. The current data files have 3 months of data for each subject, but the time-stamps for each event for each subject reflect different begin-end dates.
df$ID <- c(1, 1, 1, 1, 1, 2, 2, 2, 2)
df$Time <- c(2:34:00, 2:55:13, 5:23:23, 7:23:04, 9:18:18, 3:22:12, 4:23:02; 5:23:22, 9:30:02)
df$Date <- c(7/13/16, 7/13/16, 7/13/16, 7/14/16, 7/14/16, 1/02/14, 1/02/14, 1/03/14, 1/05/14)
df$widgets <-(4, 6, 9, 18, 3, 3, 7, 9, 12)
I want to change the df to have a common time scale so that I have a date index that allows me to keep the same format like below:
df$ScaleDate <- c(1,1,1,2,2,1,1,2,4) #time scale is within-ID
First, for reference, here's the data I used:
df <- data.frame(ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
Time = c("2:34:00", "2:55:13", "5:23:23", "7:23:04", "9:18:18", "3:22:12", "4:23:02", "5:23:22", "9:30:02"),
Date = c("7/13/16", "7/13/16", "7/13/16", "7/14/16", "7/14/16", "1/02/14", "1/02/14", "1/03/14", "1/05/14"),
widgets = c(4, 6, 9, 18, 3, 3, 7, 9, 12))
We can use used dplyr
syntax (not mandatory, but looks very legible and simple) with lubridate
functions and perform this operation very simply:
library(dplyr)
library(lubridate)
df %>%
mutate(Date = mdy(Date)) %>%
group_by(ID) %>%
mutate(ScaleDate = as.numeric((Date - Date[1]) + 1))
mdy
converts the values to Date objects. We can then do operations with them. If the dates are equal, it will result in "0 days". Hence, we add 1 and covert it to numeric to get the indexes.