I have a very long data frame covering 200 station numbers. A sample of the data is given below as df.
I would like to check the autocorrelation at lag 1 for each station number, perform pre-whitening, and then calculate the Mann-Kendall trend for each station after pre-whitening. I can do this for one individual station using the code below.
Could you kindly help me perform this for all the stations at once?
Dataframe df
dput(df)
structure(list(stn_num = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("08BB005", "08CE001", "08CF003"), class = "factor"),
year = c(1987L, 1988L, 1989L, 1990L, 1991L, 1992L, 1993L,
1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 1980L, 1981L, 1982L,
1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L,
1992L, 1993L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L,
1991L, 1992L, 1993L, 1994L), value = c(411.2146215, 346.9846995,
453.8616438, 435.3561644, 421.4019178, 444.7603825, 454.469589,
441.5884932, 339.76, 294.9562842, 371.8939726, 321.7016438,
337.7627397, 460.6622951, 513.1084932, 385.4580822, 386.6643836,
377.9076503, 440.7849315, 407.7731507, 454.4967123, 458.3259563,
421.4032877, 449.3890411, 456.3934247, 450.015847, 400.0569863,
1331.70765, 1415.484932, 1589.654795, 1606.709589, 1750.002732,
1803.646575, 1729.054795, 1802.509589, 1805.469945, 1711.854795,
1574.153425)), .Names = c("stn_num", "year", "value"), class = "data.frame", row.names = c(NA,
-38L))
Code I have used for an individual station's calculation:
c<-acf(df$value,lag.max=1)
dim(c$acf)
c$acf[[2,1,1]]
df$prewhit1<-c$acf[[2,1,1]]*df$value
prewhitseries<-data.frame(with(df, (df$value[-1] - prewhit1[-length(prewhit1)])))
autocordata<-cbind(df,prewhitseries)
MannKendall(autocordata$prewhitseries)
So how can I perform the pre-whitening and Mann-Kendall test for all the station numbers in the same data frame at once? Thank you.
My comments above aside, I think this will get you what you're looking for:
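library(Kendall)  # provides MannKendall()

# Work with station IDs as character strings so they can index the list by name
stationList <- as.character(unique(df$stn_num))
resultsList <- vector("list", length(stationList))
names(resultsList) <- stationList
for (stn in stationList) {
  tempDF <- df[df$stn_num == stn, ]
  # Lag-1 autocorrelation; plot = FALSE suppresses the ACF plot on each pass
  r1 <- acf(tempDF$value, lag.max = 1, plot = FALSE)$acf[2]
  # Pre-whitened series: x[t] - r1 * x[t-1], one element shorter than the input
  prewhitseries <- tempDF$value[-1] - r1 * tempDF$value[-nrow(tempDF)]
  # Drop the first row of tempDF so the row counts match for cbind
  autocordata <- cbind(tempDF[-1, ], prewhitseries)
  resultsList[[stn]] <- MannKendall(autocordata$prewhitseries)
}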
I arbitrarily removed the first row from the tempDF I create in the loop so that the cbind command will actually work; I'm not sure what you actually want to do there. You could get the same result with something from the apply family, which might be the direction to go if you're trying to parallelize or need more efficiency.
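For reference, here is a minimal sketch of what the apply-family version might look like. It assumes the Kendall package is installed, and the helper name prewhiten_mk and the small two-station data frame are made up purely for illustration (they are not your data); the pre-whitening step is the same x[t] - r1 * x[t-1] used in the loop.

```r
library(Kendall)  # assumed to be installed; provides MannKendall()

# Made-up two-station example, only here so the sketch runs on its own
df <- data.frame(
  stn_num = factor(rep(c("A", "B"), each = 10)),
  year    = rep(2001:2010, times = 2),
  value   = c(10.2, 11.5, 10.9, 12.8, 13.1, 12.4, 14.0, 15.3, 14.7, 16.1,
              20.5, 19.8, 21.2, 20.1, 19.5, 20.9, 20.3, 19.9, 21.0, 20.4)
)

# Hypothetical helper: lag-1 pre-whitening followed by the Mann-Kendall test
prewhiten_mk <- function(x) {
  r1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]  # lag-1 autocorrelation
  pw <- x[-1] - r1 * x[-length(x)]                # pre-whitened series
  list(r1 = r1, prewhitened = pw, mk = MannKendall(pw))
}

# split() yields one vector of values per station; lapply() runs the helper on each
results <- lapply(split(df$value, df$stn_num), prewhiten_mk)

# One column per station: lag-1 autocorrelation, Kendall's tau, two-sided p-value
summaryTable <- sapply(results, function(res) {
  c(r1 = res$r1, tau = as.numeric(res$mk$tau), p = as.numeric(res$mk$sl))
})
print(summaryTable)
```

This assumes the rows for each station are already sorted by year, as they are in your sample. If you do need to parallelize, swapping lapply for parallel::mclapply (on Unix-like systems) should work with the same helper, though I have not timed it against the loop.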