How to count the number of unique labels belonging to one particular column with respect to timestamp by interval of x minutes?

My dataset is like this:

Dataset

Let me explain my data frame. I have two column named "timeStamp" and "label"."label" column contains the unique number of values occurred with respect to "timeStamp" column.

I am able to find the number of occurrences of unique variables in label column throughout the entire time using aggregate and count function available in R.

But now I want to count the number of occurrences of unique variables in label column with respect to timestamp by an interval of 2 minutes.

To be precise, this is what I am looking for in my output:

OUTPUT

You can find the data frame here using dput in R.

x <- data.frame(timeStamp = c("20:12:14","20:12:14","20:13:02","20:13:02","20:13:55","20:13:55","20:14:14","20:14:14","20:14:25","20:14:26","20:14:26","20:14:26","20:15:26","20:15:28","20:15:36","20:15:37","20:16:41","20:16:49","20:17:20","20:17:21"), label = c("003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","001_T09_Submit Payment","001_T09_Submit Payment","001_T09_Submit Payment","001_T09_Submit Payment","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login"
))
dput(x)

Solution

Here is a tidyverse solution:

# Create 2 min breakpoints by which we group times
hm <- function(x) as.POSIXct(x, format = "%H:%M")
breaks <- seq(min(hm(x$timeStamp)), max(hm(x$timeStamp)) + 120, by = '2 min');

library(tidyverse);
x %>%
    mutate(
        timeStamp = cut(hm(timeStamp), breaks = breaks)) %>%
    count(timeStamp, label) %>%
    spread(label, n)
## A tibble: 3 x 4
#  timeStamp           `001_T09_Submit Pa… `002_T05_SearchPat… `003_T04_Ward Lo…
#  <fct>                             <int>               <int>             <int>
#1 2018-04-13 20:12:00                  NA                   2                 4
#2 2018-04-13 20:14:00                   4                   4                 2
#3 2018-04-13 20:16:00                  NA                   2                 2

Explanation: We create 2 min breakpoints by which we cut the hour+minute component of timeStamp; then count by 2 min-grouped times and label, and spread from long to wide.

Sample data

x <- data.frame(
    timeStamp = c("20:12:14","20:12:14","20:13:02","20:13:02","20:13:55","20:13:55","20:14:14","20:14:14","20:14:25","20:14:26","20:14:26","20:14:26","20:15:26","20:15:28","20:15:36","20:15:37","20:16:41","20:16:49","20:17:20","20:17:21"), 
    label = c("003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","001_T09_Submit Payment","001_T09_Submit Payment","001_T09_Submit Payment","001_T09_Submit Payment","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login","002_T05_SearchPatient","002_T05_SearchPatient","003_T04_Ward Login","003_T04_Ward Login" ))