python r linear-regression lm

Rewrite an R linear model in Python


Please help me rewrite this R linear model in Python. The R code:

x <- rnorm(10)
y <- 1+x+rnorm(10)
model <- lm(y~x)
res = summary(model)$r.squared
print(res)

My Python code throws an error: 'setting an array element with a sequence'. It seems to be missing something, but I can't work out what.

x = np.random.normal(0, 1, 10)
y = [1 + np.random.normal() + v for v in x]
new_list = [x, y]
array = np.array(new_list)
df = pd.DataFrame({'x': [x], 'y': [y]})
model = LinearRegression()
X, y = df[['x', 'y']], df
model.fit(X, y)

Solution

  • The statsmodels library provides an easy linear regression implementation with a summary table similar to R's. See the statsmodels documentation for details.

    Python:

    import numpy as np
    import statsmodels.api as sm
    
    x = np.random.normal(0, 1, 10)
    y = [1 + np.random.normal() + v for v in x]
    
    # add an intercept column to x (sm.OLS does not add one by itself)
    x = sm.add_constant(x)
    
    # ordinary least squares regression with statsmodels
    model = sm.OLS(y, x)
    results = model.fit()
    
    print(results.summary())