Search code examples
pymc3

Hierarchical modeling categorical variable interactions in PyMC3


I'm attempting to use PyMC3 to implement a hierarchical model with categorical variables and their interactions. In R, the formula would take the form of something like:

y ~ x1 + x2 + x1:x2

However, on the tutorial https://pymc-devs.github.io/pymc3/GLM-hierarchical/#partial-pooling-hierarchical-regression-aka-the-best-of-both-worlds they explicitly say that glm doesn't play nice with hierarchical modeling yet.

So how would I go about adding the x1:x2 term? Would it be a categorical variable with two categorical parents (x1 and x2)?


Solution

  • You can just manually add the interaction term to your linear model. You would have to add 3 regression coefficients (betas) and one intercept. You can then estimate your y with a likelihood as follows:

    y = pm.Normal('regression', 
                   mu=intercept + beta_x1 * data_x1 + beta_x2 * data_x2 + beta_interaction * data_x1 * data_x2, 
                   sd=sigma, 
                   observed=data_y)
    

    The parameters themselves can all have hyperpriors to build a hierarchical model.