How to use texreg after clmm (I want to extract random effect components)

I am re-writing this posting based on my progress after having advices from @PhilipLeifeld (see the comment section below).

I have tried to put clmm outputs to latex, using texreg. Since the package does not support clmm in its default mode, I tried to extend the package with extract function (see the answer part on Print "beautiful" tables for h2o models in R). Meanwhile, I found that code posted on https://gist.github.com/kjgarza/340201f6564ca941fe25 can be used as a starting point for me; I shall refer the code as the baseline code below. The following model (result) is pretty much representative of my actual codes.

library(ordinal)
library(texreg)
d<-data.frame(wine)
result<-clmm(rating~ 1+temp+contact+(1+temp|judge), data=d)

What I would like to display in a latex table are random effects components, which are omitted in the baseline code. The following is part of summary output.

summary(result)

Random effects:
 Groups Name        Variance Std.Dev. Corr  
 judge  (Intercept) 1.15608  1.0752         
        tempwarm    0.02801  0.1674   0.649 
 Number of groups:  judge 9

Specifically, I want to display variance (and number of groups); I do not need correlation parts. While working on the baseline code, I also learned that "texreg" allows for only a limited set of arguments for a latex display and that the option of "include.variance" is relevant to my goal. Thus, I tried to add random effects components to a "gof" argument as including the option of "include.variance" in the baseline code.

Here is what I have done. First, I add "include.variance" to the part of defining extract.clmm function.

extract.clmm <- function(model, include.thresholds = TRUE, include.aic = TRUE, 
                     include.bic = TRUE, include.loglik = TRUE, include.variance = TRUE, oddsratios = TRUE, conf.level= 0.95, include.nobs = TRUE, ...) {
s <- summary(model, ...)

tab <- s$coefficients
thresh <- tab[rownames(tab) %in% names(s$alpha), ]
threshold.names <- rownames(thresh)
threshold.coef <- thresh[, 1]
threshold.se <- thresh[, 2]
threshold.pval <- thresh[, 4]
beta <- tab[rownames(tab) %in% names(s$beta), ]
beta.names <- rownames(beta)
beta.coef <- beta[, 1]
beta.se <- beta[, 2]
beta.pval <- beta[, 4]

Then, I added the following three lines.

### for random effect components###   
rand<-s$ST[[1]]
rand.names<-rownames(rand)
rand.var<-rand[,1]

The following part is what I additionally included to the baseline code ("include.variance").

if (include.variance == TRUE) {       
    gof.names <- c(gof.names, rand.names)
    gof <- c(gof, rand)
    gof.decimal <- c(gof.decimal, TRUE)   
}

After running the extract.clmm function, I ran the following.

test<-extract.clmm(result, include.variance=TRUE, oddsratios=FALSE)

Then, I got an error message: Error in validityMethod(object) : gof.names and gof must have the same length! While I found that the lengths of "rand" and "rand.names" in the case of "result" is 4 and 2, I do not know how to handle this. Any comments would be very appreciated. Thanks in advance.

Solution

Let's first rewrite your test case such that it contains both a model with random effects (clmm) and a model without random effects (clm), both from the ordinal package. This will allow us to check if the extract.clmm function we are about to write yields results that are formatted in a compatible way with the existing extract.clm function in the texreg package:

library("ordinal")
library("texreg")
d <- data.frame(wine)
result.clmm <- clmm(rating ~ 1 + temp + contact + (1 + temp|judge), data = d)
result.clm <- clm(rating ~ 1 + temp + contact, data = d)

The existing clm method for the generic extract function in texreg looks like this, and we will be able to use it as a template for writing a clmm method as both object types are structured in similar ways:

# extension for clm objects (ordinal package)
extract.clm <- function(model, include.thresholds = TRUE, include.aic = TRUE, 
    include.bic = TRUE, include.loglik = TRUE, include.nobs = TRUE, ...) {
  s <- summary(model, ...)

  tab <- s$coefficients
  thresh <- tab[rownames(tab) %in% names(s$aliased$alpha), , drop = FALSE]
  threshold.names <- rownames(thresh)
  threshold.coef <- thresh[, 1]
  threshold.se <- thresh[, 2]
  threshold.pval <- thresh[, 4]
  beta <- tab[rownames(tab) %in% names(s$aliased$beta), , drop = FALSE]
  beta.names <- rownames(beta)
  beta.coef <- beta[, 1]
  beta.se <- beta[, 2]
  beta.pval <- beta[, 4]
  if (include.thresholds == TRUE) {
    names <- c(beta.names, threshold.names)
    coef <- c(beta.coef, threshold.coef)
    se <- c(beta.se, threshold.se)
    pval <- c(beta.pval, threshold.pval)
  } else {
    names <- beta.names
    coef <- beta.coef
    se <- beta.se
    pval <- beta.pval
  }

  n <- nobs(model)
  lik <- logLik(model)[1]
  aic <- AIC(model)
  bic <- BIC(model)
  gof <- numeric()
  gof.names <- character()
  gof.decimal <- logical()
  if (include.aic == TRUE) {
    gof <- c(gof, aic)
    gof.names <- c(gof.names, "AIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.bic == TRUE) {
    gof <- c(gof, bic)
    gof.names <- c(gof.names, "BIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.loglik == TRUE) {
    gof <- c(gof, lik)
    gof.names <- c(gof.names, "Log Likelihood")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.nobs == TRUE) {
    gof <- c(gof, n)
    gof.names <- c(gof.names, "Num.\ obs.")
    gof.decimal <- c(gof.decimal, FALSE)
  }

  tr <- createTexreg(
      coef.names = names, 
      coef = coef, 
      se = se, 
      pvalues = pval, 
      gof.names = gof.names, 
      gof = gof, 
      gof.decimal = gof.decimal
  )
  return(tr)
}

setMethod("extract", signature = className("clm", "ordinal"), 
    definition = extract.clm)

The first difference for clmm objects is that the coefficients etc. are not stored under summary(model)$aliased$alpha and summary(model)$aliased$beta, but directly under summary(model)$alpha and summary(model)$beta.

The second thing we need to do is add goodness-of-fit elements for the number of groups and the random variances.

The number of groups is apparently stored under summary(model)$dims$nlev.gf, with multiple entries for the different conditioning variables. So that's easy.

The random variances are not stored anywhere, so we need to look this up in the source code of the ordinal package. We can see there that the print.summary.clmm function uses an internal helper function called formatVC to print the variances. This function is contained in the same R script and basically just does the formatting and calls another internal helper function called varcov (also contained in the same file) to compute the variances. That function, in turn, computes the transposed crossproduct of model$ST to get the variances. We can simply do the same thing directly in the GOF block of our extract.clmm function, e.g., using diag(s$ST[[1]] %*% t(s$ST[[1]])) for the first random effect. We just have to make sure we do that for all random effects, which means we need to put this in a loop and replace [[1]] by an iterator like [[i]].

The final clmm method for the extract function could look like this:

# extension for clmm objects (ordinal package)
extract.clmm <- function(model, include.thresholds = TRUE,
    include.loglik = TRUE, include.aic = TRUE,  include.bic = TRUE,
    include.nobs = TRUE, include.groups = TRUE, include.variance = TRUE, ...) {
  s <- summary(model, ...)

  tab <- s$coefficients
  thresh <- tab[rownames(tab) %in% names(s$alpha), ]
  threshold.names <- rownames(thresh)
  threshold.coef <- thresh[, 1]
  threshold.se <- thresh[, 2]
  threshold.pval <- thresh[, 4]
  beta <- tab[rownames(tab) %in% names(s$beta), ]
  beta.names <- rownames(beta)
  beta.coef <- beta[, 1]
  beta.se <- beta[, 2]
  beta.pval <- beta[, 4]

  if (include.thresholds == TRUE) {
    cfnames <- c(beta.names, threshold.names)
    coef <- c(beta.coef, threshold.coef)
    se <- c(beta.se, threshold.se)
    pval <- c(beta.pval, threshold.pval)
  } else {
    cfnames <- beta.names
    coef <- beta.coef
    se <- beta.se
    pval <- beta.pval
  }

  gof <- numeric()
  gof.names <- character()
  gof.decimal <- logical()
  if (include.loglik == TRUE) {
    lik <- logLik(model)[1]
    gof <- c(gof, lik)
    gof.names <- c(gof.names, "Log Likelihood")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.aic == TRUE) {
    aic <- AIC(model)
    gof <- c(gof, aic)
    gof.names <- c(gof.names, "AIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.bic == TRUE) {
    bic <- BIC(model)
    gof <- c(gof, bic)
    gof.names <- c(gof.names, "BIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.nobs == TRUE) {
    n <- nobs(model)
    gof <- c(gof, n)
    gof.names <- c(gof.names, "Num.\ obs.")
    gof.decimal <- c(gof.decimal, FALSE)
  }
  if (include.groups == TRUE) {
    grp <- s$dims$nlev.gf
    grp.names <- paste0("Groups (", names(grp), ")")
    gof <- c(gof, grp)
    gof.names <- c(gof.names, grp.names)
    gof.decimal <- c(gof.decimal, rep(FALSE, length(grp)))
  }
  if (include.variance == TRUE) {
    var.names <- character()
    var.values <- numeric()
    for (i in 1:length(s$ST)) {
      variances <- diag(s$ST[[i]] %*% t(s$ST[[i]]))
      var.names <- c(var.names, paste0("Variance: ", names(s$ST)[[i]], ": ", 
          names(variances)))
      var.values <- c(var.values, variances)
    }
    gof <- c(gof, var.values)
    gof.names <- c(gof.names, var.names)
    gof.decimal <- c(gof.decimal, rep(TRUE, length(var.values)))
  }

  tr <- createTexreg(
      coef.names = cfnames, 
      coef = coef, 
      se = se, 
      pvalues = pval, 
      gof.names = gof.names, 
      gof = gof, 
      gof.decimal = gof.decimal
  )
  return(tr)
}

setMethod("extract", signature = className("clmm", "ordinal"), 
    definition = extract.clmm)

You can just execute the code at runtime and texreg should be able to create tables from clmm objects, including random variances. I will add this code to the next texreg release.

You can apply this to your example as follows:

screenreg(list(result.clmm, result.clm), single.row = TRUE)

The result is compatible across clmm and clm objects, as you can see here in the output:

==================================================================
                              Model 1            Model 2          
------------------------------------------------------------------
tempwarm                        3.07 (0.61) ***    2.50 (0.53) ***
contactyes                      1.83 (0.52) ***    1.53 (0.48) ** 
1|2                            -1.60 (0.69) *     -1.34 (0.52) ** 
2|3                             1.50 (0.60) *      1.25 (0.44) ** 
3|4                             4.22 (0.82) ***    3.47 (0.60) ***
4|5                             6.11 (1.02) ***    5.01 (0.73) ***
------------------------------------------------------------------
Log Likelihood                -81.55             -86.49           
AIC                           181.09             184.98           
BIC                           201.58             198.64           
Num. obs.                      72                 72              
Groups (judge)                  9                                 
Variance: judge: (Intercept)    1.16                              
Variance: judge: tempwarm       0.03                              
==================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

You can use arguments include.variances == FALSE and include.groups == FALSE to switch off the reporting of the variances and group sizes if you want.