I am calling the fit_transform() and transform() methods on a Pipeline object, but Python is raising an AttributeError whenever I try to do so. Here is what I'm trying to run, with imports. (Note: train/test splitting has been done already.)
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
pipe = Pipeline([('mean_impute', SimpleImputer()),
                 ('norm', StandardScaler()),
                 ('sklearn_lm', LinearRegression())])
pipe.fit_transform(x_train, y_train) #<-- error here
x_transform = pipe.transform(x_test) #<-- and here if previous line is absent
The text of the error is as follows:
AttributeError: This 'Pipeline' has no attribute 'fit_transform'
What went wrong? I'm sure it's something simple.
I checked x_train and y_train to make sure they were the same, and that they both had headers.
Documentation for sklearn.pipeline.Pipeline.fit_transform states that it's "[o]nly valid if the final estimator either implements fit_transform or fit and transform." The wording may be a bit ambiguous, but it means there are two possibilities: (i) the final estimator implements fit_transform, or (ii) the final estimator implements fit and transform.
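Your final estimator is sklearn.linear_model.LinearRegression, which implements fit, but not transform (nor fit_transform). This is why the error is raised: the pipeline only exposes fit_transform and transform when its last step can transform data.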