Tags: r, glm, h2o, interaction

How do I specify an interaction in h2o.glm?


Background:
In R, using a linear model, I could write a formula

est <- lm(Y~1+A+B+C+D:A+E:D+E:F+B:A+B:D+C:A+C:D+C:B, data=mydata)

If "Y" happens to be binomial, then I can also write:

est <- glm(Y~1+A+B+C+D:A+E:D+E:F+B:A+B:D+C:A+C:D+C:B, data=mydata, family = binomial)

But when I switch to h2o.glm, I have to use the "x = ..., y = ..." form instead of a formula.

my_glm.hex <- h2o.glm(y=y_idx,x=x_idx,
                  training_frame = "my_train",
                  validation_frame = "my_valid",
                  model_id = "my_glm.hex",
                  family = "binomial",
                  lambda_search = TRUE,
                  balance_classes = TRUE)

Question:
How do I add a formula that allows me to fit a generalized linear model (glm) with interactions using h2o.glm?

Addendum:
I'm not sure what tags outside of 'r', 'h2o', and 'fitting' should be used here. If you think of something relevant, could you suggest it in comments?


Solution

  • For h2o.glm, all you need to add is the interactions parameter, which takes a list of the predictor columns in x whose interactions you want to include. In your case, it might look like:

    # supposing x contains variables A, B, C, etc.
    interacting_variables <- c('B', 'D', 'E', 'F')
    
    my_glm.hex <- h2o.glm(y=y_idx,x=x_idx,
                          training_frame = "my_train",
                          validation_frame = "my_valid",
                          model_id = "my_glm.hex",
                          family = "binomial",
                          lambda_search = TRUE,
                          balance_classes = TRUE,
                          interactions = interacting_variables)
    

    For a list like the one above, all pairwise combinations of the four variables will be computed, so this parameter alone does not let you pick individual pairs.

    You can find more on the h2o site.
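
Because interactions expands to every pair, it can help to compare what it generates against the terms in the question's formula. The sketch below does that in base R and then shows one possible way to request only specific pairs. It assumes your installed h2o version supports the interaction_pairs argument of h2o.glm (check ?h2o.glm before relying on it); fit_with_pairs is a hypothetical helper name, and train and valid are assumed to be H2OFrame objects you have already loaded.

```r
# Pairs implied by interactions = c("B", "D", "E", "F"):
# every 2-way combination of the four variables (6 in total).
interacting_variables <- c("B", "D", "E", "F")
all_pairs <- combn(interacting_variables, 2, FUN = paste, collapse = ":")
print(all_pairs)

# Interaction terms actually present in the question's formula,
# written as one character pair per term.
formula_pairs <- list(
  c("D", "A"), c("E", "D"), c("E", "F"), c("B", "A"),
  c("B", "D"), c("C", "A"), c("C", "D"), c("C", "B")
)

# Order-insensitive labels, so that "E:D" and "D:E" compare as equal.
normalise <- function(p) paste(sort(p), collapse = ":")
formula_labels <- vapply(formula_pairs, normalise, character(1))

# Pairs that `interactions` would add although the formula does not ask for them.
extra <- setdiff(all_pairs, formula_labels)
print(extra)

# Hypothetical helper (not part of h2o): passes explicit pairs to h2o.glm.
# ASSUMPTION: the installed h2o version accepts `interaction_pairs`.
# It is only defined here, not called, so no running H2O cluster is needed
# to source this file.
fit_with_pairs <- function(train, valid, y, x, pairs) {
  h2o::h2o.glm(
    y = y, x = x,
    training_frame = train,
    validation_frame = valid,
    family = "binomial",
    lambda_search = TRUE,
    interaction_pairs = pairs
  )
}
```

With a running cluster, a call such as fit_with_pairs(train, valid, y_idx, x_idx, formula_pairs) would be the intended usage. Treat it as a starting point rather than a confirmed equivalent of the glm formula, since h2o.glm also applies regularization by default.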