I have weather recorded at daily intervals. I need to make new columns based on existing columns mean_rh
, mean_temp
and mean_ws
. For each year, I need to calculate the exposure duration for each unique combination of temperature/humidity/wind speed values values. For instance, I would like to know how many days a mean_ws of 0.06
was recorded in 2006?
Example raw data
I can say that 0.06
was recorded for only one day.
Here is the reproducible example for the above data
library(dplyr)
# create example data frame
df <- data.frame(
year = c(2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006),
date = c("11/9/06", "12/9/06", "13/9/06", "14/9/06", "15/9/06", "16/9/06", "17/9/06", "18/9/06", "19/9/06", "20/9/06", "21/9/06", "22/9/06", "23/9/06", "24/9/06", "25/9/06", "26/9/06", "27/9/06", "28/9/06", "29/9/06", "30/9/06"),
mean_rh = c(69.49, 69.05, 68.47, 65.99, 65.53, 67.07, 72.42, 91.79, 90.13, 76.88, 67.68, 61.11, 70.19, 66.88, 76.63, 89.09, 84.97, 80.08, 79.82, 82.81),
mean_temp = c(18.05, 20.23, 20.93, 20.58, 18.64, 18.74, 19.13, 15.98, 14.62, 14.06, 17.03, 19.76, 18.64, 18.27, 16.96, 16.31, 14.97, 15.69, 16.34, 16.02),
mean_ws = c(0.84, 0.33, 0.46, 0.79, 2.25, 1.95, 0.13, 0.06, 1.10, 0.90, 1.10, 1.20, 0.33, 0.36, 0.27, 0.66, 0.22, 0.62, 0.33, 0.18)
)
# convert the date column to a date format
df$date <- as.Date(df$date, format = "%d/%m/%y")
Taking temperature as example, I want something like this (a sum of number of days for each temperature in each year)
I am using the following code, but my calculations are not correct, when I match them with the raw data. My code needs fixing.
# calculate the exposure duration for each unique `mean_temp` value
df2 <- df %>%
group_by(year, mean_temp) %>%
summarise(exposure_duration = difftime(max(date), min(date), units = "days")) %>%
ungroup()
# spread the data to have separate columns for each unique `mean_temp` value
df3 <- df2 %>%
spread(key = mean_temp, value = exposure_duration, fill = 0)
Above code is just for mean_temp
, but I would like to group_by
mean_ws
and mean_rh
too. Thank you!
Task as I understand: I would like to know how many days a mean_ws of 0.06 was recorded in 2006?
This can be done with xtabs
. As your data has only data for 2006
, I have changed the first years to 2005
.
df <- data.frame(
year = c(
2005, 2005, 2005, 2005, 2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006),
date = c("11/9/06", "12/9/06", "13/9/06", "14/9/06", "15/9/06", "16/9/06", "17/9/06", "18/9/06", "19/9/06", "20/9/06", "21/9/06", "22/9/06", "23/9/06", "24/9/06", "25/9/06", "26/9/06", "27/9/06", "28/9/06", "29/9/06", "30/9/06"),
mean_rh = c(69.49, 69.05, 68.47, 65.99, 65.53, 67.07, 72.42, 91.79, 90.13, 76.88, 67.68, 61.11, 70.19, 66.88, 76.63, 89.09, 84.97, 80.08, 79.82, 82.81),
mean_temp = c(18.05, 20.23, 20.93, 20.58, 18.64, 18.74, 19.13, 15.98, 14.62, 14.06, 17.03, 19.76, 18.64, 18.27, 16.96, 16.31, 14.97, 15.69, 16.34, 16.02),
mean_ws = c(0.84, 0.33, 0.46, 0.79, 2.25, 1.95, 0.13, 0.06, 1.10, 0.90, 1.10, 1.20, 0.33, 0.36, 0.27, 0.66, 0.22, 0.62, 0.33, 0.18)
)
# convert the date column to a date format
df$date <- as.Date(df$date, format = "%d/%m/%y")
xtabs(~year + mean_temp, data = df)
#> mean_temp
#> year 14.06 14.62 14.97 15.69 15.98 16.02 16.31 16.34 16.96 17.03 18.05 18.27
#> 2005 0 0 0 0 1 0 0 0 0 0 1 0
#> 2006 1 1 1 1 0 1 1 1 1 1 0 1
#> mean_temp
#> year 18.64 18.74 19.13 19.76 20.23 20.58 20.93
#> 2005 1 1 1 0 1 1 1
#> 2006 1 0 0 1 0 0 0