Search code examples
rm

Two-level modelling with lme in R


I am interested in estimating a mixed effect model with two random components (I am sorry for the somewhat unprecise notation. I am somewhat new to these kind of models). Finally, I also want also the standard errors of the variances of the random components. That is why I am somewhat boudn to using the package lme. The reason is that I found this description on how to calculate those standard errors and also interesting, the standard error for function of these variances link.

I believe I know how to use the package lmer. I am finally interested in model2. For the model1, both command yield the same estimates. But model2 with lme yields different results than model2 with lmer from the lme4 package. Could you help me to get around how to set up the random components for lme? This would be much appreciated. Thanks. Please find attached my MWE.

Best

Daniel

#### load all packages #####

loadpackage <- function(x){
  for( i in x ){
    #  require returns TRUE invisibly if it was able to load package
    if( ! require( i , character.only = TRUE ) ){
      #  If package was not able to be loaded then re-install
      install.packages( i , dependencies = TRUE )
    }
    #  Load package (after installing)
    library( i , character.only = TRUE )
  }
}

#  Then try/install packages...

loadpackage( c("nlme", "msm", "lmeInfo", "lme4"))
alcohol1 <- read.table("https://stats.idre.ucla.edu/stat/r/examples/alda/data/alcohol1_pp.txt", header=T, sep=",")
attach(alcohol1)

id <- as.factor(id)
age <- as.factor(age)


model1.lmer <-lmer(alcuse ~ 1  + peer + (1|id))
summary(model1.lmer)
model2.lmer <-lmer(alcuse ~ 1  + peer + (1|id) + (1|age))
summary(model2.lmer)

model1.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id, method ="REML")
summary(model1.lme)
model2.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id + 1|age, method ="REML")

Edit (15/09/2021):

Estimating the model as follows end then returning the estimates via nlme::VarCorr gives me different results. While the estimates seem to be in the ball park, it is as they are switched across components.

model2a.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id/age, method ="REML")
summary(model2a.lme)

nlme::VarCorr(model2a.lme)
            Variance     StdDev   
id =        pdLogChol(1)          
(Intercept) 0.38390274   0.6195989
age =       pdLogChol(1)          
(Intercept) 0.47892113   0.6920413
Residual    0.08282585   0.2877948

EDIT (16/09/2021):

Since Bob pushed me to think more about my model, I want to give some additional information. Please know that the data I use in the MWE do not match my true data. I just used it for illustrative purposes since I can not upload myy true data. I have a household panel with income, demographic informations and parent indicators.

I am interested in intergenerational mobility. Sibling correlations of permanent income are one industry standard. At the very least, contemporanous observations are very bad proxies of permanent income. Due to transitory shocks, i.e., classical measurement error, those estimates are most certainly attenuated. For this reason, we exploit the longitudinal dimension of our data.

For sibling correlations, this amounts to hypothesising that the income process is as follows:

$$Y_{ijt} = \beta X_{ijt} + \epsilon_{ijt}.$$

With Y being income from individual i from family j in year t. X comprises age and survey year indicators to account for life-cycle effects and macroeconmic conditions in survey years. Epsilon is a compund term comprising a random individual and family component as well as a transitory component (measurement error and short lived shocks). It looks as follows:

$$\epsilon_{ijt} = \alpha_i + \gamma_j + \eta_{ijt}.$$

The variance of income is then:

$$\sigma^2_\epsilon = \sigma^2_\alpha + \sigma^2\gamma + \sigma^2\eta.$$

The quantitiy we are interested in is

$$\rho = \frac(\sigma^2\gamma}{\sigma^2_\alpha + \sigma^2\gamma},$$

which reflects the share of shared family (and other characteristics) among siblings of the variation in permanent income.

B.t.w.: The struggle is simply because I want to have a standard errors for all estimates and for \rho.


Solution

  • Okay, finally. Just to sketch my confidential data: I have a panel of individuals. The data includes siblings, identified via mnr. income is earnings, wavey survey year, age age factors. female a factor for gender, pid is the factor identifying the individual.

    m1 <- lmer(income ~ age + wavey + female + (1|pid) + (1 | mnr),
            data = panel)
    vv <- vcov(m1, full = TRUE)
    
    covvar <- vv[58:60, 58:60]
    covvar
    3 x 3 Matrix of class "dgeMatrix"
         cov_pid.(Intercept) cov_mnr.(Intercept)   residual
    [1,]           2.6528679          -1.4624588 -0.4077576
    [2,]          -1.4624588           3.1015001 -0.0597926
    [3,]          -0.4077576          -0.0597926  1.1634680
    
    mean <- as.data.frame(VarCorr(m1))$vcov
    mean
    [1] 17.92341 16.86084 56.77185
    deltamethod(~ x2/(x1+x2), mean, covvar, ses =TRUE)
    [1] 0.04242089
    

    The last scalar should be what I interprete as the shared background of the siblings of permanent income.

    Thanks to @Ben Bolker who pointed me into this direction.