python scipy linear-regression

Multi-variable linear regression with scipy linregress


I'm trying to train a very simple linear regression model.

My code is:

from scipy import stats

xs = [[   0,    1,  153],
      [   1,    2,    0],
      [   2,    3,  125],
      [   3,    1,   93],
      [   2,   24, 5851],
      [   3,    1,  524],
      [   4,    1,    0],
      [   2,    3,    0],
      [   2,    1,    0],
      [   5,    1,    0]]

ys = [1, 1, 1, 1, 1, 0, 1, 1, 0, 1]

slope, intercept, r_value, p_value, std_err = stats.linregress(xs, ys)

I'm getting the following error:

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/stats/stats.py", line 3100, in linregress
    ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/function_base.py", line 1747, in cov
    X = concatenate((X, y), axis)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

What's wrong with my input? I've tried changing the structure of ys in several ways but nothing works.


Solution

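  • You're looking for multivariable regression. AFAIK stats.linregress does not have that functionality: it fits a line to a single explanatory variable, so it expects x and y to be one-dimensional arrays of the same length. Your xs has three columns per row, which is why the concatenation inside np.cov fails.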

    You might want to try sklearn.linear_model.LinearRegression. Check this answer.
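
A minimal sketch of that suggestion, assuming scikit-learn and NumPy are installed and reusing the data from the question. The variable name `model` is just illustrative; this shows one way to fit all three columns at once, not the only one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data from the question: 10 samples, 3 features per sample.
xs = np.array([
    [0, 1, 153],
    [1, 2, 0],
    [2, 3, 125],
    [3, 1, 93],
    [2, 24, 5851],
    [3, 1, 524],
    [4, 1, 0],
    [2, 3, 0],
    [2, 1, 0],
    [5, 1, 0],
])
ys = np.array([1, 1, 1, 1, 1, 0, 1, 1, 0, 1])

# Ordinary least squares; an intercept is fitted by default.
model = LinearRegression()
model.fit(xs, ys)

print(model.coef_)          # one slope per feature column
print(model.intercept_)     # the fitted intercept
print(model.score(xs, ys))  # R^2 on the training data
```

Unlike linregress, this does not return a p-value or standard error, so if you need those you would have to compute them separately. If you only care about one column, stats.linregress(xs[:, 0], ys) should still work, since both arguments are then one-dimensional.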