
requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values


While training a MultilayerPerceptronClassifier in PySpark (version 2.4.5), I am getting the following exception:

requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.

But the code works fine with RandomForestClassifier, DecisionTreeClassifier, GBTClassifier, and LinearSVC on the same dataset.


Solution

  • I was getting this error because of a mismatch between the number of features and the number of neurons in the input layer.

    • The input layer size should be equal to the number of features.
    • The output layer size should be equal to the number of classes (class labels).

    For example, in my case the number of features is 7 and there are 2 class labels, so I used layers = [7, 5, 4, 2], i.e. two intermediate layers of sizes 5 and 4 (see the sketch below).
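A minimal sketch of this setup, assuming a hypothetical dataset with 7 feature columns (here named f1..f7) and a binary label column named label; the file path, column names, and training parameters are illustrative, not from the original post:

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import MultilayerPerceptronClassifier

spark = SparkSession.builder.getOrCreate()

# Hypothetical input data: 7 feature columns (f1..f7) and a binary "label" column.
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Assemble the 7 feature columns into a single vector column.
feature_cols = ["f1", "f2", "f3", "f4", "f5", "f6", "f7"]
assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
train_df = assembler.transform(df)

# Input layer = 7 (number of features), output layer = 2 (number of classes),
# with two intermediate layers of sizes 5 and 4.
layers = [7, 5, 4, 2]

mlp = MultilayerPerceptronClassifier(
    layers=layers,
    featuresCol="features",
    labelCol="label",
    maxIter=100,
    seed=42,
)
model = mlp.fit(train_df)
```

The key point is that layers[0] must match the length of the assembled feature vector and layers[-1] must match the number of distinct label values; otherwise the internal one-hot encoding of the label fails with the error quoted above.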