How to convert 60 seconds data to 1 minute data? in R

my data structure is here:

str(data_df)
'data.frame':   62630 obs. of  16 variables:
 $ No            : int  0 1 2 3 4 5 6 7 8 9 ...
 $ UTC           : POSIXct, format: "2022-06-09 00:08:51" "2022-06-09 00:08:52" ...
 $ Temperature[C]: num  20 20 20 20 20.1 ...
 $ Humidity[%]   : num  57.4 56.7 56 55.3 54.7 ...
 $ TVOC[ppb]     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ eCO2[ppm]     : int  400 400 400 400 400 400 400 400 400 400 ...
 $ Raw_H2        : int  12306 12345 12374 12390 12403 12419 12432 12439 12448 12453 ...
 $ Raw_Ethanol   : int  18520 18651 18764 18849 18921 18998 19058 19114 19155 19195 ...
 $ Pressure[hPa] : num  940 940 940 940 940 ...
 $ PM1.0         : num  0 0 0 0 0 0 0 0 0 0.9 ...
 $ PM2.5         : num  0 0 0 0 0 0 0 0 0 3.78 ...
 $ NC0.5         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ NC1.0         : num  0 0 0 0 0 ...
 $ NC2.5         : num  0 0 0 0 0 0 0 0 0 2.78 ...
 $ CNT           : int  0 1 2 3 4 5 6 7 8 9 ...
 $ Fire_Alarm    : int  0 0 0 0 0 0 0 0 0 0 ...

The records observed by seconds in 6/8, 9, 10, 13.

I want to make 60 records(60 seconds) to 1 record(1 minute)

for example) 2022-06-09 00:08:00 ~ 2022-06-09 00:08:59 to 1 record with

making 60 records to one mean record for every variable

like mean(Temperature), mean(Humidity)... and so on.

What should I do? It is not fully continuous records.

Solution

Create an auxiliary variable for the minutes with format and aggregate by it. Here is a dplyr solution.

# make test data
set.seed(2022)
start <- Sys.time() - lubridate::days(2)
end <- Sys.time()
UTC <- seq(start, end, by = "1 sec")
data_df <- data.frame(
  No = seq_along(UTC) - 1L,
  UTC,
  A = rnorm(length(UTC)),
  B = runif(length(UTC)),
  X = sample(4, length(UTC), TRUE)
)

suppressPackageStartupMessages(library(dplyr))

data_df %>%
  mutate(UTC_mins = format(UTC, "%Y-%m-%d %H-%M")) %>%
  group_by(UTC_mins) %>%
  summarise(across(A:X, ~ mean(.x, na.rm = TRUE)))
#> # A tibble: 2,881 × 4
#>    UTC_mins               A     B     X
#>    <chr>              <dbl> <dbl> <dbl>
#>  1 2022-10-24 18-15 -0.113  0.482  2.04
#>  2 2022-10-24 18-16  0.166  0.510  2.58
#>  3 2022-10-24 18-17  0.118  0.553  2.6 
#>  4 2022-10-24 18-18 -0.364  0.438  2.43
#>  5 2022-10-24 18-19  0.0428 0.554  2.65
#>  6 2022-10-24 18-20 -0.183  0.492  2.47
#>  7 2022-10-24 18-21 -0.0670 0.487  2.47
#>  8 2022-10-24 18-22  0.0950 0.537  2.65
#>  9 2022-10-24 18-23 -0.0489 0.541  2.22
#> 10 2022-10-24 18-24  0.0241 0.492  2.65
#> # … with 2,871 more rows

^{Created on 2022-10-26 with reprex v2.0.2}

And a base R solution.

UTC_mins <- format(data_df$UTC, "%Y-%m-%d %H-%M")
agg <- aggregate(. ~ UTC_mins, data_df[-(1:2)], mean, na.rm = TRUE)
rm(UTC_mins)

head(agg)
#>           UTC_mins           A         B        X
#> 1 2022-10-24 18-19 -0.05475728 0.4931293 2.095238
#> 2 2022-10-24 18-20  0.24564784 0.5239101 2.700000
#> 3 2022-10-24 18-21 -0.19414037 0.5169572 2.500000
#> 4 2022-10-24 18-22 -0.16612046 0.4793936 2.566667
#> 5 2022-10-24 18-23  0.03426193 0.5246618 2.616667
#> 6 2022-10-24 18-24 -0.19578060 0.5014733 2.416667

^{Created on 2022-10-26 with reprex v2.0.2}