I am modelling the number of jobs over month 1-112 (taken over ~10 years) for each Hospital. The number of jobs over time vary depending on the hospital so I have defined the following multilevel model as a starting point:
glmer.nb(Jobs ~ 1 + Region + Month + ( Month | factor(Region)),
data = df_month_region,
family = poisson(link = "log"))
My data looks very similar to this:
df <- data.frame(
Region = rep(1:14, each=112),
Month = rep(seq(1,112,1),14),
Job = rpois(112*14, 0.7)
)
I would like to know:
1) Does modelling my data in this format makes sense? Would it make more sense to have a column for year and month separately?
2) How do I overcome this error: "Model failed to converge with max|grad| = 0.00361688 (tol = 0.001, component 1)Model is nearly unidentifiable: very large eigenvalue"? - I have followed the steps here: https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html as commonly recommended on this site but I am starting to think the underlying problem lies perhaps with how I have setup my dataframe or my model?
Okay, so based on the additional information you provided in your comment, I'm going to expand upon Pablo's answer. Since you are really only interested in the change in number of jobs over time, your only fixed effect should be Month. Now, you also say that you have the different hospitals and the regions those hospitals are in. That means that you need to have a nested random effects structure, where you have different hospitals that belong to the different regions. You can read more about nested random effects here:
http://errickson.net/stats-notes/vizrandomeffects.html
So, the final model that you should run will look like this:
job_model <- glmer(Jobs ~ Month + (1|Region/Hospital),
data = df_month_region,
family = poisson(link = "log"))
In order to see if Month
improves the model significantly, you should also fit the following model for comparison.
job_model_null <- glmer(Jobs ~ 1 + (1|Region/Hospital),
data = df_month_region,
family = poisson(link = "log"))
And then you compare them with the likelihood ratio test using the anova()
function like so:
anova(job_model, job_model_null)
EDIT: If you wish to fit a random slope for Month to the random effects, it would look like this:
job_model <- glmer(Jobs ~ Month + (1+Month|Region/Hospital),
data = df_month_region,
family = poisson(link = "log"))