I would love some help calculating the time since the temperature was as cold as it was on a particular date.
So in the example data frame below, for the first record (01/07/2000) the previous time it was as cold as this (-1) was 01/01/2000 (around 182 days before).
For the second record (01/06/2000), the previous time it was that cold (2 degrees) was the previous month (01/05/2000), when it was actually colder (1 degree), so around 30 days before.
df <- data.frame(date=as.Date(c("01/07/2000", "01/06/2000", "01/05/2000",
                               "01/04/2000", "01/03/2000", "01/02/2000",
                               "01/01/2000"), "%d/%m/%Y"),
                 temperature=c(-1, 2, 1, 0, 1, 1, -1))
I have tried modifying this approach (Calculate days since last event in R) but found it became unwieldy when calculating for each week.
Any ideas how you might calculate the number of days since the weather was that cold, for each week? Many thanks indeed for your help.
Suppose you have temperature data for different grids like this:
# date grid temp
# 1 2000-01-01 A -1
# 2 2000-02-01 A -1
# 3 2000-03-01 A -1
# ...
# 10 2000-01-01 B 2
# 11 2000-02-01 B 1
# ...
You could use a split-apply-combine approach along the grids using by. In each grid unit, we apply a Vectorized function that calculates the difference in days since the previous occurrence of the temperature recorded on a specific date. If there is no earlier occurrence, it gives NA.
f <- Vectorize(function(data, x) {
  diff(rev(with(data, date[date <= x & temp == temp[date == x]]))[2:1])
}, vectorize.args="x")
res <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f(g, g$date))))
res
# date grid temp last
# A.1 2000-01-01 A -1 NA
# A.2 2000-02-01 A -1 31
# A.3 2000-03-01 A -1 29
# A.4 2000-04-01 A -1 31
# A.5 2000-05-01 A 0 NA
# A.6 2000-06-01 A 2 NA
# A.7 2000-07-01 A 0 61
# A.8 2000-08-01 A 0 31
# A.9 2000-09-01 A -1 153
# B.10 2000-01-01 B 2 NA
# B.11 2000-02-01 B 1 NA
# B.12 2000-03-01 B 2 60
# B.13 2000-04-01 B 1 60
# B.14 2000-05-01 B 2 61
# B.15 2000-06-01 B -1 NA
# B.16 2000-07-01 B -1 30
# B.17 2000-08-01 B 0 NA
# B.18 2000-09-01 B 2 123
# C.19 2000-01-01 C 0 NA
# C.20 2000-02-01 C 0 31
# C.21 2000-03-01 C 1 NA
# C.22 2000-04-01 C 1 31
# C.23 2000-05-01 C -1 NA
# C.24 2000-06-01 C -1 31
# C.25 2000-07-01 C 1 91
# C.26 2000-08-01 C 2 NA
# C.27 2000-09-01 C -1 92
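To make the one-liner inside f easier to follow, here is a minimal sketch that unpacks it step by step for a single date. The small data frame g and its values are made up for illustration and are not the data used above. Note that the rev(...)[2:1] trick assumes the rows of each grid are in ascending date order.

```r
# Hypothetical toy data for one grid, sorted by ascending date
g <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")),
  temp = c(-1, 0, -1, -1)
)
x <- as.Date("2000-03-01")

# Dates up to and including x that have the same temperature as on x
same <- with(g, date[date <= x & temp == temp[date == x]])

# Most recent first; [2:1] then picks the previous occurrence followed by x
pair <- rev(same)[2:1]

# Difference in days between the previous occurrence and x
gap <- diff(pair)
gap
# Time difference of 60 days

# With only one matching date, the second element is NA, so the result is NA
x1 <- as.Date("2000-01-01")
gap_first <- diff(rev(with(g, date[date <= x1 & temp == temp[date == x1]]))[2:1])
```

If the rows were in descending date order, as in the question's df, the difference would come out negative, so sorting by date first is advisable.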
To find the number of days since the temperature was last below a specific threshold temp.th, we could modify the function like so:
temp.th <- 0
f2 <- Vectorize(function(data, x) {
  x - rev(with(data, date[date <= x & temp < temp.th]))[1]
}, vectorize.args="x")
res2 <- do.call(rbind, by(d, d$grid, function(g) cbind(g, last=f2(g, g$date))))
res2
# date grid temp last
# A.1 2000-01-01 A -1 0
# A.2 2000-02-01 A -1 0
# A.3 2000-03-01 A -1 0
# A.4 2000-04-01 A -1 0
# A.5 2000-05-01 A 0 30
# A.6 2000-06-01 A 2 61
# A.7 2000-07-01 A 0 91
# A.8 2000-08-01 A 0 122
# A.9 2000-09-01 A -1 0
# B.10 2000-01-01 B 2 NA
# B.11 2000-02-01 B 1 NA
# B.12 2000-03-01 B 2 NA
# B.13 2000-04-01 B 1 NA
# B.14 2000-05-01 B 2 NA
# B.15 2000-06-01 B -1 0
# B.16 2000-07-01 B -1 0
# B.17 2000-08-01 B 0 31
# B.18 2000-09-01 B 2 62
# C.19 2000-01-01 C 0 NA
# C.20 2000-02-01 C 0 NA
# C.21 2000-03-01 C 1 NA
# C.22 2000-04-01 C 1 NA
# C.23 2000-05-01 C -1 0
# C.24 2000-06-01 C -1 0
# C.25 2000-07-01 C 1 30
# C.26 2000-08-01 C 2 61
# C.27 2000-09-01 C -1 0
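The question's wording ("as cold as") can also be read as "the same temperature or colder", which neither exact matching nor a fixed threshold covers. A possible sketch of that reading, applied to the question's df, is below. The helper name days_since_as_cold is hypothetical, and the "as cold or colder" interpretation is an assumption about what was meant.

```r
# Example data from the question (dates in descending order)
df <- data.frame(
  date = as.Date(c("01/07/2000", "01/06/2000", "01/05/2000", "01/04/2000",
                   "01/03/2000", "01/02/2000", "01/01/2000"), "%d/%m/%Y"),
  temperature = c(-1, 2, 1, 0, 1, 1, -1)
)

# Hypothetical helper: for each row, days since the most recent earlier date
# on which the temperature was the same or lower; NA if there is none
days_since_as_cold <- function(date, temp) {
  vapply(seq_along(date), function(i) {
    earlier <- date < date[i] & temp <= temp[i]
    if (!any(earlier)) return(NA_real_)
    as.numeric(date[i] - max(date[earlier]))
  }, numeric(1))
}

df$last <- days_since_as_cold(df$date, df$temperature)
df
```

Because the helper takes max() over the earlier dates, it does not depend on row order. For the first record (01/07/2000) it gives 182 days, in line with the question. For gridded data it could be applied per grid with by, in the same way as f.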
Data:
d <- expand.grid(date=seq(as.Date("2000-01-01"), as.Date("2000-09-01"), by="month"),
                 grid=LETTERS[1:3])
set.seed(42)
d$temp <- sample(-1:2, nrow(d), replace=TRUE)