
Creating an instrumental variable formula from a formula object


Given a formula object like:

obj1 <- formula(y~x1+x2+x3)

I want to create a formula for an instrumental variable approach (e.g. AER::ivreg), replacing x1 with z1 in the instrument part.

I want to obtain obj2 <- formula(y~x1+x2+x3|z1+x2+x3) without writing it out by hand: my real data have many more x variables, and I want to swap different ones out in different estimations.


Solution

  • Formulas are language objects, so you can achieve this by computing on the language. It helps to inspect the abstract syntax tree to see how the calls are nested.

    obj1 <- y~x1+x2+x3
    lobstr::ast(!!obj1)
    # █─`~` 
    # ├─y 
    # └─█─`+` 
    #   ├─█─`+` 
    #   │ ├─x1 
    #   │ └─x2 
    #   └─x3
    
    obj2 <- y~x1+x2+x3|z1+x2+x3
    lobstr::ast(!!obj2)
    # █─`~` 
    # ├─y 
    # └─█─`|` 
    #   ├─█─`+` 
    #   │ ├─█─`+` 
    #   │ │ ├─x1 
    #   │ │ └─x2 
    #   │ └─x3 
    #   └─█─`+` 
    #     ├─█─`+` 
    #     │ ├─z1 
    #     │ └─x2 
    #     └─x3 
    
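    # put a copy of the right-hand side on both sides of `|`
    obj1[[3]] <- call("|", obj1[[3]], obj1[[3]])
    # in the instrument part (right of `|`), swap x1 for z1
    obj1[[3]][[3]][[2]][[2]] <- quote(z1) # or as.name("z1")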
    obj1
    #y ~ x1 + x2 + x3 | z1 + x2 + x3
    all.equal(obj1, obj2)
    #[1] TRUE
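
    The index path used above can be checked step by step in base R. This is only an illustrative walk-through; the names obj, rhs and iv are introduced here for the example and are not part of the original code.

    ```r
    obj <- y ~ x1 + x2 + x3

    # A formula is a call to `~` with the two sides as its arguments
    obj[[1]]   # `~`
    obj[[2]]   # y
    obj[[3]]   # x1 + x2 + x3

    # The right-hand side is itself a nested call: (x1 + x2) + x3
    rhs <- obj[[3]]
    rhs[[1]]        # `+`
    rhs[[2]]        # x1 + x2
    rhs[[3]]        # x3
    rhs[[2]][[2]]   # x1

    # After wrapping the right-hand side in `|`, the instrument part is
    # element 3 of that call, so x1 sits one level deeper
    iv <- obj
    iv[[3]] <- call("|", rhs, rhs)
    iv[[3]][[3]][[2]][[2]]   # x1
    ```

    Because the path depends on where x1 sits in the sum, it changes whenever the order or number of regressors changes.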
    

    Edit:

    A more general solution that crawls the call tree:

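    crawler <- function(e, f, r) {
      # e: expression to walk; f: name to find; r: name to put in its place
      if (is.name(e) && identical(e, as.name(f))) return(as.name(r))
      # any other leaf (symbol or constant) is returned unchanged
      if (length(e) == 1L) return(e)
      # otherwise recurse into every element of the call
      for (i in seq_along(e)) e[[i]] <- crawler(e[[i]], f = f, r = r)
      e
    }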
    
    obj1 <- y~x1+x2+x3
    obj1[[3]] <- call("|", obj1[[3]], obj1[[3]])
    obj1[[3]][[3]] <- crawler(obj1[[3]][[3]], "x1", "z1")
    obj1
    #y ~ x1 + x2 + x3 | z1 + x2 + x3
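
    Since the question mentions exchanging several x variables across estimations, here is a minimal sketch of one way to wrap the idea in a helper. It swaps in base R's substitute() for the crawler, which is a different technique from the one above. The function name make_iv_formula, its argument swaps, and the instrument z3 are hypothetical and only serve the example.

    ```r
    # Hypothetical helper: build y ~ rhs | instruments from a plain formula.
    # `swaps` is a named character vector: names are the regressors to
    # replace, values are the instruments that take their place.
    make_iv_formula <- function(f, swaps) {
      stopifnot(inherits(f, "formula"), length(f) == 3L, !is.null(names(swaps)))
      rhs <- f[[3]]
      # named list of symbols, e.g. list(x1 = z1, x3 = z3)
      env <- lapply(swaps, as.name)
      # do.call() hands the already-built call to substitute(), which then
      # replaces every matching symbol in it
      instruments <- do.call(substitute, list(rhs, env))
      f[[3]] <- call("|", rhs, instruments)
      f
    }

    res <- make_iv_formula(y ~ x1 + x2 + x3, c(x1 = "z1", x3 = "z3"))
    res
    # y ~ x1 + x2 + x3 | z1 + x2 + z3
    ```

    Assigning into f[[3]] keeps the formula class and its environment, so the result should be usable wherever the original formula was, under the assumption that the fitting function accepts the `|` syntax.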