
Calculate daily maximum, minimum and mean from weather data recorded at 15-minute intervals


Our weather station records temperature at 15-minute intervals. I'd like to calculate the daily minimum, daily maximum and daily mean temperatures. How can I do this in R? I assume 15-minute data can easily be aggregated to daily values, but I can't find the code. Here is a glimpse of the data:

[image: screenshot of the data, not reproduced here]

Here is a reproducible example:

df <- structure(list(date = structure(
  c(
    1401104700,
    1401105600,
    1401106500,
    1401107400,
    1401108300,
    1401109200,
    1401110100,
    1401111000,
    1401111900,
    1401112800,
    1401113700,
    1401114600,
    1401115500,
    1401116400,
    1401117300,
    1401118200,
    1401119100,
    1401120000,
    1401120900,
    1401121800,
    1401122700,
    1401123600,
    1401124500,
    1401125400,
    1401126300,
    1401127200,
    1401128100,
    1401129000,
    1401129900,
    1401130800,
    1401131700,
    1401132600,
    1401133500,
    1401134400,
    1401135300,
    1401136200,
    1401137100,
    1401138000,
    1401138900,
    1401139800,
    1401140700,
    1401141600,
    1401142500,
    1401143400,
    1401144300,
    1401145200,
    1401146100,
    1401147000,
    1401147900,
    1401148800,
    1401149700,
    1401150600,
    1401151500,
    1401152400,
    1401153300,
    1401154200,
    1401155100,
    1401156000,
    1401156900,
    1401157800,
    1401158700,
    1401159600,
    1401160500,
    1401161400,
    1401162300,
    1401163200,
    1401164100,
    1401165000,
    1401165900,
    1401166800,
    1401167700,
    1401168600,
    1401169500,
    1401170400,
    1401171300,
    1401172200,
    1401173100,
    1401174000,
    1401174900,
    1401175800,
    1401176700,
    1401177600,
    1401178500,
    1401179400,
    1401180300,
    1401181200,
    1401182100,
    1401183000,
    1401183900,
    1401184800,
    1401185700,
    1401186600,
    1401187500,
    1401188400,
    1401189300,
    1401190200,
    1401191100,
    1401192000,
    1401192900,
    1401193800
  ),
  tzone = "UTC",
  class = c("POSIXct", "POSIXt")
),
temperature = c(
  25,
  25.2,
  25.3,
  25.1,
  25.4,
  26,
  25.9,
  25.6,
  26.8,
  27.8,
  26.8,
  26,
  26,
  26.3,
  27,
  27,
  26.2,
  25.8,
  24.9,
  25.1,
  26.3,
  25.6,
  25.3,
  25.2,
  25.1,
  24.8,
  24.7,
  24,
  23,
  22.7,
  22.5,
  22.5,
  22.2,
  21.9,
  21.5,
  21.1,
  20.8,
  20.5,
  20.3,
  20.3,
  20.2,
  20,
  19.8,
  19.6,
  19.2,
  19.1,
  19.1,
  18.9,
  18.8,
  18.6,
  18.3,
  18.2,
  18.2,
  18.2,
  18.1,
  17.9,
  17.8,
  17.7,
  17.8,
  18,
  18.1,
  18,
  18.1,
  18.6,
  18.7,
  18.5,
  18.3,
  18.1,
  18.1,
  18.6,
  18.8,
  18.6,
  18.6,
  18.3,
  18.2,
  18,
  17.8,
  18,
  18.2,
  18.9,
  19.8,
  19.6,
  19.5,
  19.7,
  20.2,
  21.5,
  22.4,
  23,
  24,
  23.3,
  23.2,
  23.7,
  24.5,
  24.8,
  24.9,
  26.3,
  25.7,
  24.9,
  24.9,
  26
)), row.names = c(NA,-100L), class = c("tbl_df", "tbl", "data.frame"))

df$date = as.Date(df$date)

Thanks for any assistance.


Solution

  • Based on the image shown, the dates are in month/day/year hour:minute format. Parse them, convert to Date class to keep only the day, group by that Date, and then take the min/max/mean of temperature:

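    library(dplyr)
    library(lubridate)
    # df1: the data as shown in the image, with `date` stored as text
    df1 %>%
        # parse "month/day/year hour:minute" text, then keep only the calendar day
        group_by(date = as.Date(mdy_hm(date))) %>%
        summarise(min_temp = min(temperature, na.rm = TRUE),
                  max_temp = max(temperature, na.rm = TRUE),
                  mean_temp = mean(temperature, na.rm = TRUE),
                  .groups = 'drop')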
    

    Or using data.table:

    library(data.table)
    setDT(df1)[, .(min_temp = min(temperature, na.rm = TRUE),
                   max_temp = max(temperature, na.rm = TRUE),
                   mean_temp = mean(temperature, na.rm = TRUE)),
               by = .(date = as.IDate(date, "%m/%d/%Y"))]
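
The code above parses text dates as seen in the image. In the reproducible `df` from the question, `date` is a date-time (POSIXct) rather than text, so `mdy_hm()` should not be needed there. A minimal sketch of the same grouping for that case, assuming UTC timestamps; the `toy` data frame and its values are made up for illustration and are not the question's data:

```r
library(dplyr)

# Hypothetical stand-in: eight 15-minute readings that straddle midnight (UTC)
toy <- tibble(
  date = seq(as.POSIXct("2014-05-26 23:00:00", tz = "UTC"),
             by = "15 min", length.out = 8),
  temperature = c(25, 25.2, 25.3, 25.1, 24, 23, 22.7, 22.5)
)

daily <- toy %>%
  # drop the time of day; tz decides which calendar day a reading belongs to
  group_by(day = as.Date(date, tz = "UTC")) %>%
  summarise(min_temp = min(temperature, na.rm = TRUE),
            max_temp = max(temperature, na.rm = TRUE),
            mean_temp = mean(temperature, na.rm = TRUE),
            .groups = "drop")

print(daily)
```

If the column has already been converted with `df$date = as.Date(df$date)`, as in the question, grouping directly with `group_by(date)` should give the same daily summary. The `tz` argument matters when the station's local day differs from the UTC day.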