I am trying to calculate the mean of every 3 observations in my data frame, which contains 10-minute data that I want to average down to half-hourly values. My data looks like this:
Date Value
2017-09-20 09:19:59 96.510
2017-09-20 09:30:00 113.290
2017-09-20 09:40:00 128.370
2017-09-20 09:50:00 128.620
2017-09-20 10:00:00 94.080
2017-09-20 10:10:00 208.150
2017-09-20 10:20:00 178.820
2017-09-20 10:30:00 208.440
2017-09-20 10:40:00 285.490
2017-09-20 10:49:59 305.020
I first tried calculating the means with the function rollapply from the zoo package (library(zoo)) in the following way:
means <- rollapply(df, by=3, 3, FUN=mean)
However, I got 50 warnings saying:
In mean.default(data[posns], ...) : argument is not numeric or logical: returning NA
I checked the classes: Value is numeric and Date is a factor. I then tried to convert Date to a date class with
`df$Date <- as.Date(df, format = "%Y-%m-%d %H:%m:%s")` and
`df$Date <- strptime(time,"%Y-%m-%d %H:%M:%S",tz="GMT")`, but neither worked.
I also tried to calculate the means with aggregate, but that did not work either:
library(chron)
aggregate(chron(times=Date) ~ Value, data=df, FUN=mean)
and I got:
Error in convert.times(times., fmt) : format h:m:s may be incorrect
In addition: Warning message:
In convert.times(times., fmt) : NAs introduced by coercion
I am desperate at this point, and I am sorry for asking here. Maybe there is something wrong with my data, since it was originally an xlsx file and I converted the odd Excel times into dates in R. I also wonder whether the problem is that some of the timestamps end in :59 seconds. I can post my entire data set online if that's helpful. Many thanks!
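The main issue is that you are passing the whole data frame to rollapply instead of a single column or vector, so mean is also applied to the non-numeric Date column, which is most likely what triggers the warnings. If I understand your goal correctly, the following should do the job: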
library(dplyr)
library(zoo)
df %>%
# compute rolling means with a window width of 3
mutate(means = rollmeanr(Value, k = 3, fill = NA)) %>%
# decrease the frequency in accordance with the window width
filter(seq_len(nrow(df)) %% 3 == 0) # or alternatively, slice(seq(3, nrow(df), 3))
# # A tibble: 3 x 3
# Date Value means
# <dttm> <dbl> <dbl>
# 1 2017-09-20 09:40:00 128. 113.
# 2 2017-09-20 10:10:00 208. 144.
# 3 2017-09-20 10:40:00 285. 224.
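If you would rather stay with rollapply, here is a minimal sketch of the same idea applied to the Value column only. It assumes the ten sample values from the question; the names value and half_hour_means are placeholders I made up for illustration.

```r
library(zoo)

# Values copied from the sample data in the question
value <- c(96.51, 113.29, 128.37, 128.62, 94.08,
           208.15, 178.82, 208.44, 285.49, 305.02)

# width = 3: average three observations at a time
# by = 3: move three positions between windows, so the windows do not overlap
half_hour_means <- rollapply(value, width = 3, by = 3, FUN = mean)
half_hour_means
```

With ten observations there are only three complete windows, so the tenth value is left out of the result, just as in the dplyr version above.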
Data:
df <- structure(list(Date = structure(c(1505917199, 1505917800, 1505918400,
1505919000, 1505919600, 1505920200, 1505920800, 1505921400, 1505922000,
1505922599), class = c("POSIXct", "POSIXt"), tzone = ""), Value = c(96.51,
113.29, 128.37, 128.62, 94.08, 208.15, 178.82, 208.44, 285.49,
305.02)), row.names = c(NA, -10L), class = c("tbl_df", "tbl",
"data.frame"))