I have an event log in csv format and want to specify cells with certain condition in the table. The table looks like below.
Case.ID | Activity | Timestamp | Resource
----------------------------------------------
0 |Take order| 00:12:04 | Waiter
----------------------------------------------
0 |Take order| 00:18:02 |
----------------------------------------------
1 |Bring food| 00:47:23 | Cook helper
----------------------------------------------
1 |Bring food| 00:52:41 |
Start of the activity has value in Resource column, but end of the activity has none in there.
I want to make a column duration, which is the difference from end timestamp to start timestamp but not sure how to deal with it.
Using the data frame given by @timfaber do:
aggregate(x = list(duration = as.POSIXct(df$Timestamp,format = "%H:%M:%S")),
by = list(Case.ID = df$Case.ID),
FUN = diff)
This gives:
Case.ID duration
1 0 5.966667
2 1 5.300000