Why does StringIndexer have no outputCols?

I am using Apache Zeppelin. My conda version is 4.8.4, and my Spark version is:

%spark2.pyspark
spark.version
u'2.3.1.3.0.1.0-187'

When I run my code, it throws the following error:

Exception AttributeError: "'StringIndexer' object has no attribute '_java_obj'" in <object repr() failed> ignored
Fail to execute line 4: indexerFeatures = StringIndexer(inputCols=catColumns, outputCols=catIndexedColumns, handleInvalid="keep")
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-66369397479549554.py", line 375, in <module>
    exec(code, _zcUserQueryNameSpace)
  File "<stdin>", line 4, in <module>
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/__init__.py", line 105, in wrapper
    return func(self, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'outputCols'

I ran the same code in Databricks and everything worked fine. I also checked the imported StringIndexer with the help() function, and its signature didn't include the outputCols argument.

Solution

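In Spark 2.3.1, StringIndexer works on a single column, so the parameters are inputCol and outputCol, not inputCols and outputCols. The multi-column parameters were only added in Spark 3.0.0, which is presumably why the same code runs on Databricks.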

For Spark 2.3.1, see the API reference: https://spark.apache.org/docs/2.3.1/api/python/pyspark.ml.html#pyspark.ml.feature.StringIndexer

class pyspark.ml.feature.StringIndexer(inputCol=None, outputCol=None, handleInvalid='error', stringOrderType='frequencyDesc')
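If you need to index several columns on Spark 2.3.1, one common workaround is to create one StringIndexer per column and chain them in a Pipeline. The following is a minimal sketch, not a definitive implementation. The helper names indexed_names and build_indexer_pipeline and the "_indexed" suffix are made up for illustration, and it assumes an active Spark session, as Zeppelin provides.

```python
def indexed_names(columns, suffix="_indexed"):
    """Derive one output column name per input column (suffix is an arbitrary choice)."""
    return [c + suffix for c in columns]


def build_indexer_pipeline(columns, handle_invalid="keep"):
    """Build a Pipeline with one single-column StringIndexer per input column.

    Assumes an active SparkSession/SparkContext, since pyspark.ml stages
    create their JVM counterparts when they are constructed.
    """
    # Imported inside the function so that indexed_names() can be used
    # without a Spark installation.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import StringIndexer

    stages = [
        # Spark 2.3.1 only accepts the singular inputCol / outputCol.
        StringIndexer(inputCol=in_col, outputCol=out_col, handleInvalid=handle_invalid)
        for in_col, out_col in zip(columns, indexed_names(columns))
    ]
    return Pipeline(stages=stages)
```

With the question's catColumns list and a DataFrame (called df here as a placeholder), usage would look like build_indexer_pipeline(catColumns).fit(df).transform(df), which adds one indexed column per entry in catColumns.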