Search code examples
rcsvevent-log

How to specify cells in csv file with certain conditions in R


I have an event log in csv format and want to specify cells with certain condition in the table. The table looks like below.

Case.ID | Activity | Timestamp | Resource
----------------------------------------------
   0    |Take order| 00:12:04  |  Waiter
----------------------------------------------
   0    |Take order| 00:18:02  |
----------------------------------------------
   1    |Bring food| 00:47:23  | Cook helper
----------------------------------------------
   1    |Bring food| 00:52:41  |

Start of the activity has value in Resource column, but end of the activity has none in there.

I want to make a column duration, which is the difference from end timestamp to start timestamp but not sure how to deal with it.


Solution

  • Using the data frame given by @timfaber do:

    aggregate(x = list(duration = as.POSIXct(df$Timestamp,format = "%H:%M:%S")),
              by = list(Case.ID = df$Case.ID),
              FUN = diff)
    

    This gives:

      Case.ID  duration
    1       0 5.966667 
    2       1 5.300000