R: Calculating the Number of Phone Calls Every Hour

I have a dataset that looks like the following:

Call No.  Arrival Time  Call Length (in hrs) ...
================================================
1         0.01          0.061 
2         0.08          0.05
3         0.10                                   (Busy/Unanswered)
4         0.15          0.42
...
10        1.03          0.36
11        1.09          0.72
...

I want to count the number of phone calls each hour (e.g. number of successful phone calls from arrival times [0, 1), [1, 2), [2, 3), etc.

There are some empty values in the call length column, indicating that the phone line was busy, so the call went unanswered. I basically want to count the nonempty occurrences of the call length and group them by hour by summing them. How can I do this using dataframe operations in R?

Solution

Chiming in with a base R solution

dat <- data.frame(`Call No.` = c(1,2,3,4,10,11), 
           `Arrival Time` = c(0.01,0.08,0.10,0.15,1.03,1.09),
           `Call Length (in hrs)` = c(0.61, 0.05, NA, 0.42, 0.36, 0.72), 
           check.names = F) # to keep the spaces
# filter out NAs
dat2 <- dat[complete.cases(dat),]
# add an hour variable
dat2$hour <- floor(dat2$`Arrival Time`)

# for fun, create a function that takes in a df
count_and_sum <- function(df){
  return(data.frame(hour = df$hour[1], # assumes we will pass it dfs with 1 hour only
             answered_calls =   nrow(df),
             total_call_time_hrs = sum(df$`Call Length (in hrs)`)))
}

# use split to separate the data into a list of data.frames by hour 
# added a step but might be better to do in one row for memory
splitted <- split(dat2, dat2$hour, drop= T) 
# use sapply to apply our function to each element of the splitted list
# and transpose to make the output the right orientation
t(sapply(splitted, count_and_sum))
#  hour answered_calls total_call_time_hrs
#0 0    3              1.08               
#1 1    2              1.08