Background:
In R, when fitting a linear model, I can write a formula:
est <- lm(Y~1+A+B+C+D:A+E:D+E:F+B:A+B:D+C:A+C:D+C:B, data=mydata)
If "Y" happens to be binomial, then I can also write:
est <- glm(Y~1+A+B+C+D:A+E:D+E:F+B:A+B:D+C:A+C:D+C:B, data=mydata, family = binomial)
But... When I go to h2o.glm, I have to use the "x=.., y=.." form.
my_glm.hex <- h2o.glm(y = y_idx, x = x_idx,
                      training_frame = "my_train",
                      validation_frame = "my_valid",
                      model_id = "my_glm.hex",
                      family = "binomial",
                      lambda_search = TRUE,
                      balance_classes = TRUE)
Question:
How do I add a formula that allows me to fit a generalized linear model (glm) with interactions using h2o.glm?
Addendum:
I'm not sure what tags outside of 'r', 'h2o', and 'fitting' should be used here. If you think of something relevant, could you suggest it in comments?
For h2o.glm, all you need to add is the interactions
parameter, which takes a list of the predictor columns in x
whose interactions you want to include. In your case, it might look like:
# supposing x contains variables A, B, C, etc.
interacting_variables <- c('B', 'D', 'E', 'F')
my_glm.hex <- h2o.glm(y = y_idx, x = x_idx,
                      training_frame = "my_train",
                      validation_frame = "my_valid",
                      model_id = "my_glm.hex",
                      family = "binomial",
                      lambda_search = TRUE,
                      balance_classes = TRUE,
                      interactions = interacting_variables)
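For a list like the one above, all pairwise combinations of the four variables will be computed, which gives six interaction terms. Note that this is every pair among B, D, E and F, not the hand-picked set of terms in your formula.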
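To see which pairs "all pairwise combinations" covers, here is a minimal base-R sketch that enumerates them with combn(). It does not call H2O at all, and the "B:D" style labels are only R formula notation used for illustration, not necessarily the column names H2O generates.

```r
# Same vector as passed to `interactions` in the answer.
interacting_variables <- c('B', 'D', 'E', 'F')

# combn() returns a 2 x choose(4, 2) matrix, one column per unordered pair.
pair_matrix <- combn(interacting_variables, 2)

# Collapse each column into a formula-style label such as "B:D".
pair_labels <- apply(pair_matrix, 2, paste, collapse = ":")

print(pair_labels)
# [1] "B:D" "B:E" "B:F" "D:E" "D:F" "E:F"
```

Comparing this output with the interaction terms in your formula is a quick way to check whether a given variable list produces the terms you actually want.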
You can find more in the h2o.glm documentation on the h2o site, under the interactions parameter.
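If you want only the specific interaction terms from the formula in the question, rather than every pairwise combination, one possible approach is to pull the pairs out of the formula with base R and pass them explicitly. The sketch below does the extraction with terms(). The final h2o.glm call is left commented out and is an assumption: it relies on your H2O release supporting the interaction_pairs argument (a list of two-element column-name vectors), which older releases do not have, so check your version's documentation first.

```r
# The formula from the question; no data is needed to inspect its terms.
f <- Y ~ 1 + A + B + C + D:A + E:D + E:F + B:A + B:D + C:A + C:D + C:B

# All term labels, main effects and interactions alike.
term_labels <- attr(terms(f), "term.labels")

# Keep only the interaction terms (those containing ":").
interaction_labels <- term_labels[grepl(":", term_labels, fixed = TRUE)]

# Split each label into its two column names, giving a list of pairs.
interaction_pairs <- strsplit(interaction_labels, ":", fixed = TRUE)

print(length(interaction_pairs))  # 8 pairs for this formula

# Hypothetical call, not run here; assumes h2o.glm() in your release
# accepts `interaction_pairs`:
# my_glm.hex <- h2o.glm(y = y_idx, x = x_idx,
#                       training_frame = "my_train",
#                       family = "binomial",
#                       interaction_pairs = interaction_pairs)
```

Note that terms() may write a pair in a different order than you typed it (the order within a pair does not matter for an interaction), so compare pairs as unordered sets if you check the result.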