python scikit-learn pipeline

passthrough all columns in sklearn pipeline


I am trying to join the result of a PCA to the original features. To do this, I tried a FeatureUnion of the PCA with a column transformer that simply passes through all columns:

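from sklearn.compose import make_column_transformer
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, make_pipeline

feature_selector = FeatureUnion(
    [
        # Drop no columns, so every column falls into the passthrough remainder
        ("original", make_column_transformer(("drop", []), remainder="passthrough")),
        ("pca", PCA()),
    ])
my_pipeline = make_pipeline(preprocessor, feature_selector, model)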

But this seems a bit counterintuitive.
Is there a cleaner way of doing this? Maybe a feature selector that selects all columns instead of a column transformer?


Solution

  • I think the cleanest approach is probably to use a FunctionTransformer. Note in particular that the default value of its func parameter gives you an "identity transformer":

    [...] If func is None, then func will be the identity function.
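
A minimal sketch of how this could look in the question's FeatureUnion. The toy array X and the choice of n_components=2 are made up for illustration; the preprocessor and model from the question are left out so the snippet runs on its own.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer

# Toy data (an assumption for this example): 6 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

feature_selector = FeatureUnion(
    [
        # func=None is the default, so this step returns its input unchanged.
        ("original", FunctionTransformer()),
        # n_components=2 is an arbitrary choice for the example.
        ("pca", PCA(n_components=2)),
    ]
)

# The original columns come first, followed by the PCA components.
X_out = feature_selector.fit_transform(X)
print(X_out.shape)
```

The resulting feature_selector should drop into make_pipeline(preprocessor, feature_selector, model) the same way as the column-transformer version in the question.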