Using lm() in R, I can do the following:
fit <- lm(organ_volumes~sex+genotype, data=factors)
where organ_volumes is a matrix in which each column is a different variable. Each column is fit to a linear model in turn, as described in the lm docs:
If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix.
Is there any way to do something similar in Python using statsmodels rather than having to loop over each column, which is much slower than the R method?
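You can try the following in scikit-learn. LinearRegression accepts a matrix of responses and fits each column by ordinary least squares in a single call. Note that coef_ has one row per response and one column per predictor, which is the transpose of R's coefficient table, and that the intercepts are stored separately in intercept_. The R example further down regresses different columns (Sepal.Length and Petal.Length on Sepal.Width and Petal.Width), so its numbers are not expected to match the Python output: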
import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(data=iris['data'],
                  columns=iris['feature_names'])
from sklearn import linear_model
clf = linear_model.LinearRegression()
X = df[['sepal length (cm)','sepal width (cm)']]
Y = df[['petal length (cm)','petal width (cm)']]
clf.fit(X,Y)
clf.coef_
array([[ 1.77559255, -1.33862329],
       [ 0.723292  , -0.47872132]])
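If you want the scikit-learn result laid out like R's coefficient table (intercept in the first row, one column per response), something along these lines should work. This is only a sketch on synthetic, noise-free data; the column names x1, x2, y1, y2 and the variable names are made up for illustration and are not the iris columns.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (hypothetical names, chosen only for this sketch).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(50, 2)), columns=["x1", "x2"])
Y = pd.DataFrame(
    {
        "y1": 1.0 + 2.0 * X["x1"] - 1.0 * X["x2"],
        "y2": -3.0 + 0.5 * X["x1"] + 4.0 * X["x2"],
    }
)

# A single fit call handles both response columns.
model = LinearRegression().fit(X, Y)

# coef_ has shape (n_targets, n_features). Transpose it so that each response
# becomes a column, then stack the intercepts on top, as R prints them.
table = pd.DataFrame(
    np.vstack([model.intercept_, model.coef_.T]),
    index=["(Intercept)", *X.columns],
    columns=Y.columns,
)
print(table.round(4))
```

Because the synthetic responses have no noise, each column of the table should recover the coefficients used to build it, which makes the layout easy to check by eye.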
In R:
data = as.matrix(iris[,-5])
lm(data[,c(1,3)] ~ data[,c(2,4)])
Call:
lm(formula = data[, c(1, 3)] ~ data[, c(2, 4)])
Coefficients:
                            Sepal.Length  Petal.Length
(Intercept)                  3.4573        2.2582
data[, c(2, 4)]Sepal.Width   0.3991       -0.3550
data[, c(2, 4)]Petal.Width   0.9721        2.1556
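For what it's worth, plain NumPy can also fit every response column in one call, which is close in spirit to what lm() does with a matrix response. The sketch below uses random data (the arrays X, Y, A and the sizes are assumptions for illustration) and checks the one-shot fit against an explicit per-column loop.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))                                  # two predictors
Y = X @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))    # three responses

# Prepend a column of ones so the first coefficient row is the intercept.
A = np.column_stack([np.ones(n), X])

# One call fits every column of Y; beta has shape (n_predictors + 1, n_responses).
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Reference: the per-column loop that the question wants to avoid.
beta_loop = np.column_stack(
    [np.linalg.lstsq(A, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)

# Both approaches solve the same least-squares problems, so they should agree.
print(np.allclose(beta, beta_loop))
```

The rows of beta are ordered intercept first, then one row per predictor, so it can be compared directly with the R table above. Note that this only gives the coefficients; standard errors and p-values would need extra work or a statistics package.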