In the code below, a PySpark pipeline contains two transformers. How can I print the names of these two transformers, given the pipeline?
from pyspark.ml.feature import StringIndexer, OneHotEncoder
from pyspark.ml import Pipeline
gender_indexer = StringIndexer(inputCol='Sex', outputCol='SexIndex')
gender_encoder = OneHotEncoder(inputCol='SexIndex', outputCol='SexVec')
pipeline = Pipeline(stages=[gender_indexer, gender_encoder])
Calling pipeline.getStages() returns the list of stages in the pipeline:
>>> pipeline.getStages()
[StringIndexer_84633f93b8f6, OneHotEncoder_6a01b7a7cdc1]
Note that each list element is a stage object, not a string; what gets printed is the stage's uid (the class name plus a random suffix).
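If you want actual strings, one option is to read the class name or the uid off each stage. The sketch below is only an illustration: the helper names stage_names and stage_uids are made up here, and it assumes an unfitted Pipeline like the one in the question.

```python
def stage_names(pipeline):
    """Return the class name of each stage, e.g. 'StringIndexer'."""
    return [type(stage).__name__ for stage in pipeline.getStages()]


def stage_uids(pipeline):
    """Return the uid string of each stage (class name plus random suffix)."""
    return [stage.uid for stage in pipeline.getStages()]
```

With the pipeline from the question, stage_names(pipeline) should give the two class names, StringIndexer and OneHotEncoder, and stage_uids(pipeline) should give the same identifiers that appear in the printed list, but as plain strings.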