Search code examples
rloopsautomationpurrrr-lavaan

Loop over combinations of column names with lavaan syntax


How can I loop over rows of a data frame containing variable names permutations when running a mediation analysis with lavaan?

Say I have 4 variables var1, var2, var3, var4:

df<- data.frame(var1 = rnorm(100), 
                var2 = rnorm(100), 
                var3 = rnorm(100),
                var4 = rnorm(100))

Using gtools::permutations() I save all possible permutations of the 4 variables in sets of 3:

permut <- 
  gtools::permutations(n = 4, r = 3, v = names(df), repeats.allowed = FALSE)

colnames(permut) <- c("Y", "X", "M")

> head(permut)
     Y      X      M     
[1,] "var1" "var2" "var3"
[2,] "var1" "var2" "var4"
[3,] "var1" "var3" "var2"
[4,] "var1" "var3" "var4"
[5,] "var1" "var4" "var2"
[6,] "var1" "var4" "var3"

Then I set the medation model using lavaan syntax, where I'm interested in the mediating effect of M over the relationship between X and Y:

mod <- "
    M ~ a * X
    Y ~ c * X + b * M
    ind := a*b
    tot := c + (a*b)
    "

I want to run the model and store it's results for future inspection:

library(lavaan)
library(dplyr)

#fit the model
fit <- sem(mod, df, se = "robust")

#save results
result <-
parameterestimates(fit) %>% filter(op != "~~")

My question here is this:

How to instruct R to use as Y,X,M variable names from each row of permut, fit the model using data from df and model syntax from mod and eventually store results for every model fit?

The code above is the simplest possible scenario that I'd like to use for running more complex models the same way.

I'm aware of answers about looping linear models of different variables: loop over all possible combinations, Looping over combinations of regression model terms, Linear Regression loop for each independent variable individually against dependent, and possibly the closest one: How to use reference variables by character string in a formula?, but still I'm stuck and coudn't solve this over the weekend.


Solution

  • Here's one way of doing it:

    fits <- apply(permut, 1, function (p) {
        permuted.df <- df[p]
        colnames(permuted.df) <- names(p)
        sem(mod, permuted.df, se="robust")
    })
    

    fits contains the SEM results for every 3-permutation in permut. To see the estimates of, e.g., the first fit, you can proceed as usual:

    > parameterestimates(fits[[1]]) %>% filter(op != "~~")
      lhs op     rhs label         est         se          z     pvalue    ci.lower
    1   M  ~       X     a -0.18393765 0.10977670 -1.6755618 0.09382406 -0.39909603
    2   Y  ~       X     c  0.07314372 0.09891034  0.7394952 0.45960637 -0.12071699
    3   Y  ~       M     b  0.01944518 0.08852450  0.2196587 0.82613697 -0.15405965
    4 ind :=     a*b   ind -0.00357670 0.01600038 -0.2235385 0.82311644 -0.03493686
    5 tot := c+(a*b)   tot  0.06956702 0.09816192  0.7086966 0.47851276 -0.12282680
        ci.upper
    1 0.03122074
    2 0.26700443
    3 0.19295001
    4 0.02778346
    5 0.26196084