Search code examples
predictlme4lmer

Using lme4 modeling to predict from fixed effects values


I apologize for the novice question, but am new to lme4. I am using lme4 to model the survival of bee colonies among six sites composed of varying types of land use over three years and have produced the following model after already eliminating other competing models using REML:

land1=lmer(asin(sqrt(prop_survival))~log(area_forage_uncult) + (1|site) + (1|year))

And produced the summary:

Linear mixed model fit by REML ['lmerMod']
Formula: asin(sqrt(prop_survival)) ~ log(area_forage_uncult) + (1 | site)+ (1 | year))

REML criterion at convergence: -32.7

Scaled residuals: 
Min      1Q  Median      3Q     Max 
-1.4914 -0.5867 -0.0323  0.4945  1.7873 

Random effects:
 Groups   Name        Variance Std.Dev.
 site     (Intercept) 0.001080 0.03287 
 year     (Intercept) 0.000000 0.00000 
 Residual             0.004983 0.07059 
Number of obs: 18, groups:  site, 6; year, 3

Fixed effects:
                    Estimate Std. Error t value
(Intercept)             -1.33426    0.62653  -2.130
log(area_forage_uncult)  0.13687    0.03618   3.783

Correlation of Fixed Effects:
        (Intr)
lg(r_frg_n) -0.999

What I would now like to do is to use this model to predict survival of apiaries given other amounts of uncultivated forage. What would be the best way to do so? Example code would be very helpful.


Solution

  • This should be fairly straightforward (although it would me more straightforward with a reproducible example ...)

    If you have a fitted model land1, then

    ## I'm picking arbitrary values here since I don't
    ##  know what's sensible for your system
    pframe <- data.frame(area_forage_uncult=200:210)
    predict(land1,newdata=pframe,re.form=~0)
    

    The argument re.form=~0 tells the predict() function that you want to make predictions at the population level, not for any specific year or site (i.e. set random-effects values to zero when predicting). For more information, see ?predict.merMod.

    I have several other suggestions about the model:

    • consider using a binomial GLMM (if your proportion survival is out of a total number of known exposures) rather than arcsine-square-root transforming (see Warton and Hui 2011)
    • 6 sites is a fairly small number of random-effect levels, and 3 is very small; your output shows that the among-year variance has been set to zero. Consider setting year as a fixed effect (possibly with sum-to-zero contrasts, i.e. specify contrasts=list(year=contr.sum)).