Finding the mean time (independent of date) that an event occurs in R

I have a list of date-times corresponding to events that occurred over multiple days, and I am hoping to find the mean time that different categories of events occurred, independent of date (i.e., the mean time for events falling into category A was 11:04:42). I have a data frame that looks similar to the following:

df <- data.frame(category = c("A", "A", "B", "A", "C", "C", "B", "D", "A", "D", "D", "C"),
                    times = ymd_hms(c("2021-09-12 21:34:22", "2021-09-13 15:42:37",
                            "2021-09-16 22:36:50", "2021-09-24 09:41:00",
                            "2021-09-20 12:14:30", "2021-09-15 16:40:39",
                            "2021-09-15 09:16:39", "2021-09-14 15:50:47",
                            "2021-09-24 18:10:00", "2021-09-21 17:30:00",
                            "2021-09-14 17:43:53", "2021-09-23 19:00:00")))

I would like to find the mean time for all events in category A, but when I call something like mean(times), the output is a date and a time, whereas I would just like a time, independent of what day each event occurred.

As an example, I have tried summarizing the data frame, like so:

summary_times <- df %>%
  group_by(category) %>%
  summarize(avg_time = mean(times))

The result is "2021-09-18 06:20:06 UTC", which is not what I would like—I'm interested in generalizing to any given day, so I'm hoping for a time that does not take the dates of the individual events into account.

I have also tried taking the individual means of the hours, minutes, and seconds, and then taking the means of those individually, but I have not been successful with that, either. My first attempt looked like this:

summary_times <- df %>% 
  group_by(category) %>%
  summarize(avg_time = paste(mean(hour(times)), ":",
                             mean(minute(times)), ":",
                             mean(second(times))))

This gave me a "time" (just as a character object, which is fine with me; this is just being displayed in a table), but each of the hour, minute, and seconds had decimal remainders. This problem led me to try this next iteration:

summary_times <- df %>% 
  group_by(category) %>%
  summarize(avg_time = paste(sum(hour(times)) %/% n(), ":",
                             sum(minute(times)) %/% n() + (sum(hour(median_datetime)) %% n())*60, ":",
                             sum(second(times)) %/% n() + (sum(minute(median_datetime)) %% n())*60))

I no longer got decimal remainders on each component of the time; however, some of the components were greater than they could possibly be (e.g., a time of "15:247:130").

Any assistance in how to find this mean time in the day of events—either by pointing in the direction of a function that can perform this task, or by investigating the taking-the-mean-of-the-individual-components option—would be greatly appreciated!

Solution

An option is to convert to ITime and then get the mean

library(data.table)
library(dplyr)
df %>%
    group_by(category) %>%
    summarise(avg_time = mean(as.ITime(times)))

-output

# A tibble: 4 × 2
  category avg_time
  <chr>    <ITime> 
1 A        16:16:59
2 B        15:56:44
3 C        15:58:23
4 D        17:01:33

Or another option is to change the 'date' part to a standardized single date, and then take the mean, format to return only the 'time' part

df %>% 
   group_by(category) %>% 
   summarise(times = format(mean(as.POSIXct(format(times, 
          '2021-09-01 %H:%M:%S'))), '%H:%M:%S'))
# A tibble: 4 × 2
  category times   
  <chr>    <chr>   
1 A        16:16:59
2 B        15:56:44
3 C        15:58:23
4 D        17:01:33

Or do this in base R

transform(aggregate(times ~ category, 
     data = transform(df, 
          times= as.POSIXct(format(times, '2021-09-01 %H:%M:%S'))), mean),
     times = format(times, '%H:%M:%S'))

-output

 category    times
1        A 16:16:59
2        B 15:56:44
3        C 15:58:23
4        D 17:01:33