I need to train a regression model with 100 features, and I want to look for interaction effects between each of those 100 features and one other feature. I would like to do this programmatically, since the analysis will be recurring and I don't want to write a new formula by hand each time it runs. How can I get a model like this
Y ~ a*b + a*c + ... + a*z
but for 100 terms? How do I write the R formula for this? Note that I will be using statsmodels in Python, but I think the formula syntax is the same.
In R, the dot stands for all columns of df other than the response, so this works:
lm(Y ~ a * ., df)
For example:
lm(Sepal.Width ~ Sepal.Length * ., iris)
Call:
lm(formula = Sepal.Width ~ Sepal.Length * ., data = iris)

Coefficients:
                   (Intercept)  -0.91350
                  Sepal.Length   0.82954
                  Petal.Length   0.29569
                   Petal.Width   0.85334
             Speciesversicolor   0.05894
              Speciesvirginica  -0.89244
     Sepal.Length:Petal.Length  -0.05394
      Sepal.Length:Petal.Width  -0.04654
Sepal.Length:Speciesversicolor  -0.32823
 Sepal.Length:Speciesvirginica  -0.21910
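Since the question mentions statsmodels: as far as I know, the patsy formula language that statsmodels uses does not expand the R dot shorthand, so a portable option is to build the formula string from the column names. The sketch below is only an illustration; interaction_formula is a hypothetical helper name, the column names are placeholders, and it assumes every column name is a valid Python identifier.

```python
def interaction_formula(response, main, columns):
    """Build 'response ~ main * x1 + main * x2 + ...' as a formula string.

    Hypothetical helper: `columns` is any iterable of column names, for
    example list(df.columns). The response and the main feature are skipped
    so that neither is interacted with itself.
    """
    others = [c for c in columns if c not in (response, main)]
    if not others:
        raise ValueError("no columns left to interact with")
    # Each 'main * c' term expands to main + c + main:c in the formula language.
    return f"{response} ~ " + " + ".join(f"{main} * {c}" for c in others)


# Placeholder column names standing in for the 100 real features.
formula = interaction_formula("Y", "a", ["Y", "a", "b", "c", "d"])
print(formula)
```

The resulting string can then be passed to statsmodels.formula.api.ols(formula, data=df).fit(). Because the formula is rebuilt from df.columns on every run, nothing needs to be edited by hand when the feature set changes. Column names containing dots or spaces would need quoting, for instance with patsy's Q("...").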