I am trying to print out all of the imputation values after fitting with SimpleImputer. When using SimpleImputer by itself, I can retrieve these from the instance's statistics_ attribute.
This works fine:
s = SimpleImputer(strategy='mean')
s.fit(df[['feature_1', 'feature_2']])
print(s.statistics_)
However, I'm unable to do so when using SimpleImputer in a pipeline.
This does not work:
numeric_features = ['feature_1', 'feature_2']
numeric_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())])

categorical_features = ['feature_3']
categorical_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='most_frequent')),
    ('one_hot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', RandomForestClassifier(n_estimators=100))])

clf.fit(df[numeric_features + categorical_features], df['target'])

print(clf.named_steps['preprocessor'].transformers[0][1].named_steps['simple_imputer'].statistics_)
I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-523-7390eac0d9d6> in <module>
19 clf.fit(df[numeric_features + categorical_features], df['target'])
20
---> 21 print(clf.named_steps['preprocessor'].transformers[0][1].named_steps['simple_imputer'].statistics_)
AttributeError: 'SimpleImputer' object has no attribute 'statistics_'
I believe I am grabbing the correct instance of the fitted SimpleImputer object. Why can't I retrieve its statistics_ attribute to print out the imputation values?
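I find it easier to use 'dot' notation when working with sklearn pipelines, not least because you get autocomplete to help you navigate the structure and attributes of the pipeline. It also has the added bonus (in my opinion) of being more readable.
The error occurs because ColumnTransformer clones its transformers when it is fitted: the transformers attribute still holds the original, unfitted objects, while the fitted copies are stored in transformers_ and named_transformers_ (note the trailing underscore).
You can use the following line to access the statistics_ attribute of the fitted SimpleImputer: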
imputation_vals = (
    clf
    .named_steps
    .preprocessor
    .named_transformers_
    .num
    .named_steps
    .simple_imputer
    .statistics_
)
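To see the difference between the two attributes side by side, here is a minimal, self-contained sketch. The toy DataFrame and its values are made up for illustration (the question does not show the real data), and n_estimators is reduced only to keep the example quick. It checks that the imputer reached through transformers is unfitted, while the one reached through named_transformers_ carries the learned column means.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the question's df.
df = pd.DataFrame({
    'feature_1': [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    'feature_2': [10.0, np.nan, 30.0, 40.0, 50.0, 60.0],
    'feature_3': ['a', 'b', 'a', np.nan, 'a', 'b'],
    'target': [0, 1, 0, 1, 0, 1],
})

numeric_features = ['feature_1', 'feature_2']
categorical_features = ['feature_3']

numeric_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='mean')),
    ('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('simple_imputer', SimpleImputer(strategy='most_frequent')),
    ('one_hot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(n_estimators=10, random_state=0))])
clf.fit(df[numeric_features + categorical_features], df['target'])

fitted_preprocessor = clf.named_steps.preprocessor

# `transformers` (no underscore) holds the objects passed to the constructor.
# ColumnTransformer fits clones of them, so this imputer was never fitted.
unfitted_imputer = fitted_preprocessor.transformers[0][1].named_steps['simple_imputer']
print(hasattr(unfitted_imputer, 'statistics_'))

# `named_transformers_` (trailing underscore) holds the fitted clones.
imputation_vals = (
    fitted_preprocessor
    .named_transformers_
    .num
    .named_steps
    .simple_imputer
    .statistics_
)
print(imputation_vals)  # per-column means of the non-missing values

# The same pattern reaches the categorical imputer's fill values.
categorical_vals = (
    fitted_preprocessor
    .named_transformers_
    .cat
    .named_steps
    .simple_imputer
    .statistics_
)
print(categorical_vals)
```

The index-based equivalent of the original attempt would be clf.named_steps['preprocessor'].transformers_[0][1].named_steps['simple_imputer'].statistics_, which differs from the failing line only by the underscore after transformers.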