
Derive daily mean from high frequency measures

I have 15-minute observations of air temperature (Temp) that I would like to reduce to daily means. I have been trying to use the timeAverage function from the openair package, because it lets the user set a minimum data-capture threshold that must be met before a daily mean is calculated. However, I keep getting the same error message saying that the variable date cannot be found.

Here is an example of my data frame:

> head(dat)
              Date Temp
1: 2001-01-01 0:00   NA     
2: 2001-01-01 0:15 -1.4
3: 2001-01-01 0:30 -1.1
4: 2001-01-01 0:45 -1.1
5: 2001-01-01 1:00 -0.9
6: 2001-01-01 1:15 -0.5

Here is the code I have been using:

dailyAVG <- timeAverage(mydata = dat,
                        avg.time = "day",
                        data.thresh = 75,
                        statistic = "mean",
               = "2001-01-01 0:00")

which produces this error message:

Can't find the variable(s) date 
Error in checkPrep(mydata, vars, type = "default", remove.calm = FALSE,  : 


  • I found a hacky way to get around the error message. In MS Excel, I renamed the Date column to date (lower case, which is the name the error message says is missing) and reformatted that column as yyyy-mm-dd hh:mm. I then read the .csv file into R and made the following adjustments.

    The first correction converts the date column into a class that openair accepts:

    dat$date <- as.POSIXct(dat$date, tz = "", format = "%Y-%m-%d %H:%M")

    After fixing the date format, I ran into another issue with the temperature (Temp) column. For some reason, R had read the values as class character when they should be numeric. This was fixed with

    dat$Temp <- as.numeric(dat$Temp)
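
    The rename and both conversions can probably also be done entirely in R, without the round trip through Excel. The following is a minimal base R sketch, not the exact steps used above: the toy rows are copied from the question's head(dat), the Temp column is built as character on purpose to mimic the problem, and tz = "UTC" is an assumption chosen here to keep the example independent of the local time zone.

    ```r
    # Toy rows copied from the question's head(dat); Temp is deliberately character.
    dat <- data.frame(
      Date = c("2001-01-01 0:00", "2001-01-01 0:15", "2001-01-01 0:30"),
      Temp = c(NA, "-1.4", "-1.1"),
      stringsAsFactors = FALSE
    )

    # Rename Date to date, the column name the error message says is missing.
    names(dat)[names(dat) == "Date"] <- "date"

    # Parse the text timestamps into POSIXct; tz = "UTC" is an assumption made here.
    dat$date <- as.POSIXct(dat$date, tz = "UTC", format = "%Y-%m-%d %H:%M")

    # Convert Temp from character to numeric; existing NA values stay NA.
    dat$Temp <- as.numeric(dat$Temp)

    # Inspect the resulting column classes.
    str(dat)
    ```

    If Temp contains entries that are not numbers, as.numeric turns them into NA and issues a warning, which may be worth checking before averaging.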

    After making these corrections, timeAverage worked with the following code:

    dailyAVG <- timeAverage(mydata = dat,
                            avg.time = "day",
                            data.thresh = 75)
    > dailyAVG
    # A tibble: 2 x 2
      date                 Temp
      <dttm>              <dbl>
    1 2001-01-01 00:00:00  3.01
    2 2001-01-02 00:00:00  1.85
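
    To illustrate what a 75% threshold means for 15-minute data, here is a rough base R sketch of the same idea. It is not openair's implementation: daily_mean_thresh is a hypothetical helper written for this example, it assumes 96 expected observations per day, and the toy data are made up. A day is reported as NA when fewer than 75% of its expected values are present.

    ```r
    # Hypothetical helper (not part of openair): daily mean with a data-capture threshold.
    # thresh is a percentage; per_day is the expected number of observations per day.
    daily_mean_thresh <- function(dat, thresh = 75, per_day = 96) {
      day <- as.Date(dat$date, tz = "UTC")
      n_valid <- tapply(!is.na(dat$Temp), day, sum)       # non-missing values per day
      means <- tapply(dat$Temp, day, mean, na.rm = TRUE)  # mean of available values
      means[n_valid / per_day * 100 < thresh] <- NA       # too little data: no mean
      data.frame(date = as.Date(names(means)), Temp = as.numeric(means))
    }

    # Made-up data: two days of 15-minute values, constant within each day.
    times <- seq(as.POSIXct("2001-01-01 00:00", tz = "UTC"),
                 by = "15 min", length.out = 192)
    toy <- data.frame(date = times, Temp = rep(c(1, 3), each = 96))
    toy$Temp[1:10] <- NA    # day 1: 86 of 96 values present (about 90%)
    toy$Temp[97:126] <- NA  # day 2: 66 of 96 values present (about 69%)

    res <- daily_mean_thresh(toy)
    print(res)
    ```

    In this toy case the first day keeps its mean and the second day comes back as NA, because its data capture falls below 75%.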