Averaging data at specific hours across days in R

Update(2021/5/13, 11:55AM): The data is not reproducible. Thank you @G. Grothendieck for giving me a comment.

(This is my first post in Stackoverflow, so please let me know if there's any basic mistakes I made in this question)

Hi,

I have a hourly-measured data of air quality over multiple days in R, and I would like to calculate the average air quality at specific time periods.

Here's a subset of my reproducible data. It is in xts format.

# Make a structure of data
dput(Air_sample[1:6,1:1])

# Create a data from the structure above.
Air <- structure(
 c(2.6, 2, 2.2, 2.2, 1.6, 1.2),
 class = c("xts", "zoo"),
 index = structure(
  c(
   1078959600,
   1078963200,
   1079046000,
   1079049600,
   1079132400,
   1079136000
  ),
  tzone = "",
  tclass = c("POSIXct",
             "POSIXt")
 ),
 .Dim = c(6L, 1L),
 .Dimnames = list(NULL, c("True.CO")))

> Air
                    True.CO
2004-03-10 18:00:00     2.6
2004-03-10 19:00:00     2.0
2004-03-11 18:00:00     2.2
2004-03-11 19:00:00     2.2
2004-03-12 18:00:00     1.6
2004-03-12 19:00:00     1.2

I want to calculate average CO at specific time (ex. 6PM) from multiple days. So the result would be something like below.

Air_average <- data.frame("Time" = c("18:00","19:00"), "Average CO" = c(2.1333,1.8))

> Air_average
   Time Average.CO
1 18:00     2.1333
2 19:00     1.8000

I tried different function via Googling such as "period.apply", "subset", "window", etc. But none of them seems to work.

Is there any way to do this?

Thank you.

Solution

You can use something like dplyr to do grouping operation, and lubridate to deal with dates. lubridate has the hour function which return only the hours. I first convert your data into a data frame:

library(lubridate)
library(dplyr)
library(xts)

Air <- data.frame(Air) %>% 
  add_rownames(var = "time")

  time                True.CO
  <chr>                 <dbl>
1 2004-03-11 00:00:00     2.6
2 2004-03-11 01:00:00     2  
3 2004-03-12 00:00:00     2.2
4 2004-03-12 01:00:00     2.2
5 2004-03-13 00:00:00     1.6
6 2004-03-13 01:00:00     1.2

Because of my timezone, the hours are not the same than yours, but the code will be the same.

Air %>%
  group_by(hour(time))%>%
  summarise(mean(True.CO))

# A tibble: 2 x 2
  `hour(time)` `mean(True.CO)`
         <int>           <dbl>
1            0            2.13
2            1            1.8