I am quite new to R, and I am trying to find a way to average continuous data into a specific period of time.
My data is a month recording of several parameters with 1s time steps
The table via read.csv
has a date and time in one column and several other columns with values.
TimeStamp UTC Pitch Roll Heave(m)
05-02-13 6:45 0 0 0
05-02-13 6:46 0.75 -0.34 0.01
05-02-13 6:47 0.81 -0.32 0
05-02-13 6:48 0.79 -0.37 0
05-02-13 6:49 0.73 -0.08 -0.02
So I want to average the data in specific intervals: 20 min for example in a way that the average for hour 7:00, takes all the points from hour 6:41 to 7:00 and returns the average in this interval and so on for the entire dataset. The time interval will look like this :
TimeStamp
05-02-13 19:00 462
05-02-13 19:20 332
05-02-13 19:40 15
05-02-13 20:00 10
05-02-13 20:20 42
Here is a reproducible dataset similar to your own.
meteorological <- data.frame(
TimeStamp = rep.int("05-02-13", 1440),
UTC = paste(
rep(formatC(0:23, width = 2, flag = "0"), each = 60),
rep(formatC(0:59, width = 2, flag = "0"), times = 24),
sep = ":"
),
Pitch = runif(1440),
Roll = rnorm(1440),
Heave = rnorm(1440)
)
The first thing that you need to do is to combine the first two columns to create a single (POSIXct
) date-time column.
library(lubridate)
meteorological$DateTime <- with(
meteorological,
dmy_hm(paste(TimeStamp, UTC))
)
Then set up a sequence of break points for your different time groupings.
breaks <- seq(ymd("2013-02-05"), ymd("2013-02-06"), "20 mins")
Finally, you can calculate the summary statistics for each group. There are many ways to do this. ddply
from the plyr
package is a good choice.
library(plyr)
ddply(
meteorological,
.(cut(DateTime, breaks)),
summarise,
MeanPitch = mean(Pitch),
MeanRoll = mean(Roll),
MeanHeave = mean(Heave)
)