'OneHotEncoder' object has no attribute 'transform'

I am using Spark v3.0.0. My dataframe is:
|row_id|    city|index|
|     0|New York|  0.0|
|     1|  Moscow|  3.0|
|     2| Beijing|  1.0|
|     3|New York|  0.0|
|     4|   Paris|  2.0|
|     5|   Paris|  2.0|
|     6|New York|  0.0|
|     7| Beijing|  1.0|

Then I want to one-hot encode the dataframe's "index" column, but I get this error:

encoder = OneHotEncoder(inputCol="index", outputCol="encoding")
indexer = encoder.transform(indexer)

AttributeError                            Traceback (most recent call last)
<ipython-input-32-70bbd67e6679> in <module>
      1 encoder = OneHotEncoder(inputCol="index", outputCol="encoding")
      2 encoder.setDropLast(False)
----> 3 indexer = encoder.transform(indexer)

AttributeError: 'OneHotEncoder' object has no attribute 'transform'


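  • You need to fit it first. In Spark 3.0, OneHotEncoder is an estimator: fit() returns a fitted model, and only that model has a transform method, so the attribute indeed does not exist on the encoder itself: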

    encoder = OneHotEncoder(inputCol="index", outputCol="encoding")
    ohe = encoder.fit(indexer)  # indexer is the existing dataframe, see the question
    indexer = ohe.transform(indexer)

    See the example in the docs for more details on the usage.