I have a large data frame of bat abundance per year, and I would like to model the population trend over those years in R. I think I also need to include year as a random effect, because my data points aren't independent: the bat population in one year directly affects the population in the next year (if there are 10 bats one year, they will likely still be alive the next year). I have used the group_by() function to reduce the full dataset to the simpler data frame shown below, as an example of the layout. The full dataset also has month and day.
year | total individuals |
---|---|
2000 | 39 |
2001 | 84 |
etc. | etc. |
Here is the model I wish to use with lme4.
BLE_glm6 <- glm(total_indv ~ year + (year|year), data = BLE_total, family = poisson)
Because year is already the predictor variable, R does not like it when I add year again, because the two terms are highly correlated. So I am wondering: how do I account for the number of individuals in one year directly affecting the number of individuals the next year if I can't include year as a random effect within the model?
There are a few possibilities. The most obvious would be to fit a Poisson model with (the log of) the number of bats in the previous year as an offset:
## set up lagged variable
BLE_total <- transform(BLE_total,
    total_indv_prev = c(NA, total_indv[-length(total_indv)]))
## or use dplyr::lag() if you like tidyverse
glm(total_indv ~ year + offset(log(total_indv_prev)), data = BLE_total,
family = poisson)
This will fit the model
mu = total_indv_prev*exp(beta_0 + beta_1*year)
total_indv ~ Poisson(mu)
i.e. exp(beta_0 + beta_1*year) will be the predicted ratio between the current and previous year. (See here for further explanation of the log-offset in a Poisson model.)
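As a rough check of that interpretation, here is a minimal sketch on simulated counts. The data frame `sim`, the starting count of 40 and the growth ratio of 1.05 are all made up for illustration; they are not taken from the question's data. It builds the lagged column, fits the same offset model, and rebuilds the fitted values from the estimated ratio.

```r
## Simulated yearly counts (hypothetical data, NOT the asker's BLE_total)
set.seed(1)
n_years <- 20
sim <- data.frame(year = 2000 + seq_len(n_years) - 1,
                  total_indv = NA_integer_)
sim$total_indv[1] <- 40L        # assumed starting population
true_ratio <- 1.05              # assumed year-to-year growth ratio
for (i in 2:n_years) {
  ## each year's count depends on the previous year's count
  sim$total_indv[i] <- rpois(1, lambda = true_ratio * sim$total_indv[i - 1])
}

## lagged count: the first year has no predecessor, so it is NA
sim$total_indv_prev <- c(NA, head(sim$total_indv, -1))

## same model as above; glm() drops the first row because its offset is NA
fit <- glm(total_indv ~ year + offset(log(total_indv_prev)),
           data = sim, family = poisson)

## estimated ratio between current and previous year, for each fitted year
b <- coef(fit)
ratio_hat <- exp(b[["(Intercept)"]] + b[["year"]] * sim$year[-1])

## fitted mean = previous count * estimated ratio
mu_hat <- ratio_hat * sim$total_indv_prev[-1]
print(data.frame(year = sim$year[-1], ratio_hat = ratio_hat, mu_hat = mu_hat))
```

If the ratio is roughly constant over time, `beta_1` should be close to zero and `ratio_hat` should hover around the average growth ratio.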
If you want year as a random effect (sorry, read the question too fast), then
library(lme4)
glmer(total_indv ~ offset(log(total_indv_prev)) + (1|year), ...)
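For comparison with the fixed-effect version, the mixed model can be written out as below. This assumes the elided arguments are the same data and `family = poisson` as in the `glm()` call; the symbols $N_t$ (for `total_indv` in year $t$), $b_t$ and $\sigma^2_{\text{year}}$ are notation introduced here, not something from the original code.

```latex
\begin{aligned}
\log \mu_t &= \log N_{t-1} + \beta_0 + b_t,
  \qquad b_t \sim \mathcal{N}(0, \sigma^2_{\text{year}}) \\
N_t \mid b_t &\sim \mathrm{Poisson}(\mu_t)
\end{aligned}
```

Under that reading, $\exp(\beta_0)$ is the typical ratio between consecutive years and $b_t$ lets each year's ratio deviate from it. With only one row per year, the year effect acts as an observation-level random effect, so it mostly absorbs extra-Poisson variation rather than describing a trend.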