python machine-learning scikit-learn linear-regression

sklearn PolynomialFeatures: Is the bias required if LinearRegression generates a y-intercept?


I'm new to machine learning so I have been playing around with some of the models trying to get a better understanding.

When I create a matrix of features:

X_Poly3 = PolynomialFeatures(3).fit_transform(X)

where X is a matrix with 2 columns, X_Poly3 is generated with 10 columns:

X1, X2, X1^2, X1.X2, X2^2, X1^3, X1^2.X2, X2^2.X1, X2^3 plus a "bias" column of 1's.

When I fit LinearRegression() to this matrix, I then end up with 10 coefficients PLUS a y-intercept variable.

I thought the bias column of 1's would act as the multiplier that produces the y-intercept. But if LinearRegression fits a y-intercept by default, is the bias column required?

I created a polynomial linear regression model but I ended up with what looks to be 2 variables related to the y-intercept.

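import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Toy data: 3 samples, 2 features
X = np.arange(6).reshape(3, 2)
y_train = np.arange(3).reshape(3, 1)

# Degree-3 expansion; include_bias defaults to True, so a column of 1's is added
poly = PolynomialFeatures(3)
X_Poly3 = poly.fit_transform(X)

# fit_intercept defaults to True, so a separate intercept is fitted as well
regressor = LinearRegression()
regressor.fit(X_Poly3, y_train)

print(regressor.intercept_)
print(regressor.coef_)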

Solution

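  • No. The bias column is redundant when LinearRegression fits its own intercept, so you should turn off one of the two: pass include_bias=False to PolynomialFeatures, or pass fit_intercept=False to LinearRegression. With both left on, the column of 1's adds no information beyond the intercept, which is why you see two values that both look like a y-intercept.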
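The two options can be compared side by side. The following is a minimal sketch, not a definitive recipe: the data is synthetic and made up purely for illustration (50 samples, so the fit is not underdetermined as it is with the 3-row example in the question), and the variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data, invented for this illustration: 50 samples, 2 features
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = 1.5 + 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=50)

# Option A: drop the bias column and let LinearRegression fit the intercept
X_a = PolynomialFeatures(degree=3, include_bias=False).fit_transform(X)
model_a = LinearRegression().fit(X_a, y)

# Option B: keep the bias column and disable the separate intercept
X_b = PolynomialFeatures(degree=3, include_bias=True).fit_transform(X)
model_b = LinearRegression(fit_intercept=False).fit(X_b, y)

print(X_a.shape, X_b.shape)  # 9 columns vs 10 columns

# In option B the offset is stored as the coefficient of the bias column
print(model_a.intercept_, model_b.coef_[0])

# Both parameterisations describe the same fitted model
print(np.allclose(model_a.predict(X_a), model_b.predict(X_b)))
```

Either option gives the same fitted curve; the only difference is where the offset is stored (`intercept_` in option A, the first entry of `coef_` in option B). Option A is often the more convenient choice when the model is later combined with regularised estimators, since those usually leave `intercept_` unpenalised but would penalise a bias-column coefficient.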