
How to update a formula object containing a pipe ("|")


Some packages such as fixest use the pipe to separate explanatory variables from fixed effects in formulas. How can I use update() to modify each of those parts? In the example below, I would like to remove z from the right-hand side but keep the fixed effects fe.

f <- log(y) ~ log(x) + z | fe

# does not work
f |> update(~ . - z)
#> log(y) ~ (log(x) + z | fe)
f |> update(~ . - z | .)
#> log(y) ~ ((log(x) + z | fe) - z | (log(x) + z | fe))

Created on 2023-02-26 with reprex v2.0.2

Apparently, the fixest authors have only created an update() method for model objects that supports this behavior but none for formula objects (see here). Is there any workaround other than fitting a model, updating it and then extracting the formula? In the worst case, I could manipulate the formula as a character string but that feels rather heavy-handed. rlang and modelr don't seem to have any appropriate functions either.


Solution

  • The Formula package provides tools for handling multi-part formulas. @user20650 pointed me to page 9, which shows the approach: convert the formula to a Formula object before updating it:

    library(Formula)
    update(Formula(f), . ~ . - z | .)
    

    (or f |> Formula() |> update(...) if you like pipes).
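
    A slightly fuller sketch of the same approach, assuming the Formula package is installed. The object names ff, g and g_plain are only illustrative. If something downstream expects an ordinary formula, formula() should convert the updated object back:

    ```r
    library(Formula)

    f <- log(y) ~ log(x) + z | fe

    # Convert to a multi-part Formula, then update each part separately:
    # the left-hand side and the second right-hand part are kept as they are (`.`),
    # and z is dropped from the first right-hand part.
    ff <- Formula(f)
    g <- update(ff, . ~ . - z | .)

    # Convert the result back to an ordinary formula object.
    g_plain <- formula(g)
    g_plain
    ```

    Each `.` refers to the corresponding part of the original formula, so parts you do not want to touch can simply be written as `.`.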


    Previous answer (brute force): ugly but functional.

    g <- f[[3]][[2]]  ## f[[3]] is RHS of formula: 
                      ## f[[3]][[2]] is the stuff before the pipe
    g_u <- update(as.formula(call("~", g)), ~ . - z)
    f[[3]][[2]] <- g_u[[2]]
    f
    ## log(y) ~ log(x) | fe
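
    The indexing above relies on how R parses the formula: | binds more loosely than + but more tightly than ~, so the entire right-hand side is a single call to |. That is presumably also why update() treats it as one opaque term. A quick base-R check (nothing here is specific to fixest; rhs is just an illustrative name):

    ```r
    f <- log(y) ~ log(x) + z | fe

    # A two-sided formula is a call to `~` with three elements:
    # the operator, the left-hand side and the right-hand side.
    length(f)
    ## [1] 3

    rhs <- f[[3]]
    rhs[[1]]   # the top-level operator on the right-hand side
    ## `|`
    rhs[[2]]   # everything before the pipe
    ## log(x) + z
    rhs[[3]]   # everything after the pipe
    ## fe
    ```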
    

    You could presumably package this into your own my_update() function that would at least work for a narrow range of use cases:

    my_update <- function(f, new, part = c("left", "right")) {
       part <- match.arg(part)
       ind <- if (part == "left") 2 else 3
       g <- f[[3]][[ind]]
       g_u <- update(as.formula(call("~", g)), new)
       f[[3]][[ind]] <- g_u[[2]]
       f
    }
    

    Note that the new argument should be a one-sided formula:

    my_update(f, ~ . - z)
    ## log(y) ~ log(x) | fe
    my_update(f, ~ . + junk, "right")
    ## log(y) ~ log(x) + z | fe + junk
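
    Since my_update() assumes a two-sided formula with exactly one pipe on the right-hand side, it may be worth failing early on anything else. A minimal sketch of such a guard follows; my_update_checked() and is_pipe() are hypothetical names, and the check only covers the simple a | b case:

    ```r
    my_update_checked <- function(f, new, part = c("left", "right")) {
      part <- match.arg(part)
      # TRUE if the expression is a call to `|`
      is_pipe <- function(e) is.call(e) && identical(e[[1]], as.name("|"))

      # Require a two-sided formula
      stopifnot(inherits(f, "formula"), length(f) == 3)
      rhs <- f[[3]]

      # Reject formulas without a pipe, or with more than one
      # (a | b | c parses as (a | b) | c).
      if (!is_pipe(rhs) || is_pipe(rhs[[2]])) {
        stop("expected a right-hand side of the form `a | b`", call. = FALSE)
      }

      # Same logic as my_update(): update one side as a one-sided formula
      ind <- if (part == "left") 2 else 3
      g_u <- update(as.formula(call("~", rhs[[ind]])), new)
      f[[3]][[ind]] <- g_u[[2]]
      f
    }

    f <- log(y) ~ log(x) + z | fe
    res <- my_update_checked(f, ~ . - z)
    res
    ## log(y) ~ log(x) | fe

    # A formula without a pipe now raises an error instead of being
    # silently mangled.
    err <- try(my_update_checked(y ~ x, ~ . + w), silent = TRUE)
    ```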