How do I run a thornthwaite function (which computes standard precipitation index) by location and latitude in R?

I have the following data:

dat <- read.table(text="
  id YEAR    MONTH   TMED    PRCP    lat
   1  1986    1      -14.5    2.3    42.4863
   1  1986    2      -13.9    5.7    42.4863
   2  1986    1      -12.9    7.2    42.46
   2  1986    2      -11.6    19.7   42.46", header=TRUE)

where

id is my location unit ranging from 1 to 90
TMED - mean monthly temperature by location
PRCP - precipitation by location
lat - latitude by location
YEAR ranges from 1986 to 2016
MONTH ranges from 1 to 12

I need to run the following function in R to calculate SPEI index (Standard Precipitation-Evapotranspiration Index) for each location:

library(SPEI)

dat$PET <- thornthwaite(dat$TMED, dat$lat[1])

dat$BAL <- dat$PRCP-dat$PET

spei1 <- spei(dat$BAL, scale = 1)

dat$spei1=l

This code works for one location. But I need to make a loop over latitude and location. One problem is that latitude should enter the function thornthwaite as a number (not as a list/variable).

Solution

I played around with the Thornthwaite equation a bit for my ecology masters, and this implementation appears a bit strange. Despite how it might appear here, the equation needs more than just mean temperature and latitudes as input. It actually needs the average daylength of a given month, but this can be calculated from latitude and date, and thornthwaite() gets the date by just assuming the first datapoint represents January, and the rest follows in sequence. The Thornthwaite equation also depends on a yearly heat index, which means you need monthly temperature means for the entire year. thornthwaite() solves this by aggregating over the temperature vector you supply.

In summation, for thornthwaite() to work you need a sequence of monthly mean temperatures, in sequence, starting in January and spanning at least one year. As such, the function won't work on the data you supplied.

I suggest you make sure your series is long enough, and also split it into separate data.frames for each location. You can use split() for this (split(dat, dat$id)).

There are a few examples in ?thornthwaite, including one demonstrating its use on time series, useful if your series doesn't start on January.

I made a mockup demonstrating one possible approach:
(Notice the function will return values even if the data doesn't cover a full year, these values will then be quite unreliable.)

dat <- read.table(text="
  id YEAR    MONTH   TMED    PRCP    lat
   1  1986    1     -14.5    2.3    42.4863
   1  1986    2     -13.9    5.7    42.4863
   1  1986    3     -10.5    2.3    42.4863
   1  1986    4      -7.9    5.7    42.4863
   1  1986    5      -4.5    2.3    42.4863
   1  1986    6       0.9    5.7    42.4863
   1  1986    7      10.5    2.3    42.4863
   1  1986    8      17.9    5.7    42.4863
   2  1986    1     -12.9    7.2    42.46
   2  1986    2     -11.6   19.7    42.46
   2  1986    3      -8.9    7.2    42.46
   2  1986    4      -5.9    7.2    42.46
   2  1986    5       1.6   19.7    42.46
   2  1986    6      12.9    7.2    42.46
   2  1986    7      21.6   19.7    42.46
   2  1986    8      25.6   19.7    42.46", header=TRUE)

dat.s <- split(dat, dat$id)

lapply(dat.s, function(x) thornthwaite(x$TMED, x$lat[1]))