java scala apache-spark apache-spark-ml

How to get classification probabilities from MultilayerPerceptronClassifier?


This seems most related to: How to get the probability per instance in classifications models in spark.mllib

I'm doing a classification task with Spark ML, building a MultilayerPerceptronClassifier. Once I have a trained model, I can get the predicted class for an input vector, but I can't get the probability for each output class. The question linked above indicates that NaiveBayesModel supports this as of Spark 1.5.0 (through a predictProbabilities method). I would like the same functionality for the MLPC. Is there a way I can hack at it to get my probabilities? Will it be included in 1.6.2?


Solution

  • If you take a look at this line in the MLPC source code, you can see that the MLPC is working from an underlying TopologyModel which provides the .predict method I'm looking for. The MLPC decodes the resulting Vector into a single label.

    I'm able to use the trained MLPC model to create a new TopologyModel using its weights:

    // Train the classifier as usual (configuration elided).
    MultilayerPerceptronClassifier trainer = new MultilayerPerceptronClassifier()...;
    MultilayerPerceptronClassificationModel model = trainer.fit(trainingData);
    // Rebuild the underlying network from the trained layer sizes and weights.
    TopologyModel topoModel = FeedForwardTopology.multiLayerPerceptron(model.layers(), true).getInstance(model.weights());
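
The decoding step described above (the MLPC turning the network's output Vector into a single label) can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's own code: the class and method names (OutputDecoding, softmax, argmax) are hypothetical, and it assumes the output layer is a softmax and that the label is simply the index of the largest probability.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: shows how per-class probabilities
// relate to the single label a classifier reports.
final class OutputDecoding {

    // Numerically stable softmax: maps raw output-layer scores to
    // probabilities that are non-negative and sum to 1.
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max); // subtract max to avoid overflow
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    // Index of the largest entry (first one wins on ties); this is the
    // "decode to a single label" step, which discards the probabilities.
    static int argmax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] probabilities = softmax(new double[] {1.0, 2.0, 3.0});
        System.out.println("probabilities = " + Arrays.toString(probabilities));
        System.out.println("predicted label = " + argmax(probabilities));
    }
}
```

The point of the sketch is that the label is derived from the probability vector, so working with the network's raw output, as the TopologyModel approach does, keeps the information that the label-only prediction throws away.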