Does VNCoreMLFeatureValueObservation output softmax probabilities? If so, how to extract top values?


I'm trying to get my first image classification model working. However, VNClassificationObservation isn't working, while VNCoreMLFeatureValueObservation is.

Here is some info about my model:

MLModelDescription: MLModelDescription inputDescriptionsByName: {

"input_1__0" = "input_1__0 : Image (Color, 299 x 299)";

} outputDescriptionsByName: {

    "output_node0__0" = "output_node0__0 : MultiArray (MLMultiArrayDataTypeDouble, 43)";

} predictedFeatureName: (null) 

According to the docs:

VNClassificationObservation

This type of observation results from performing a VNCoreMLRequest image
analysis with a Core ML model whose role is classification (rather than 
prediction or image-to-image processing).
Vision infers that an MLModel object is a classifier model if that model 
predicts a single feature. 
That is, the model's modelDescription object has a non-nil value for its 
predictedFeatureName property.

At first I assumed that when the docs say "prediction", they are referring to a regression-type model that predicts a value. But now I'm thinking they are referring to softmax prediction probabilities? If so, VNClassificationObservation doesn't output softmax prediction probabilities.

Now,

VNCoreMLFeatureValueObservation:

Overview
This type of observation results from performing a VNCoreMLRequest image analysis with a Core ML model whose role is prediction rather than classification or image-to-image processing.

Vision infers that an MLModel object is a predictor model if that model predicts multiple features. You can tell that a model predicts multiple features when its modelDescription object has a nil value for its predictedFeatureName property, or when it inserts its output in an outputDescriptionsByName dictionary.

I'm confused by the wording. Does this mean a multiple-input, multiple-output model? "Prediction rather than classification" is also a little confusing, but I'm assuming softmax probabilities because of the output I'm getting.

When I run the code below I get:

let request = VNCoreMLRequest(model: model) { [weak self] request, error in
    guard let results = request.results as? [VNCoreMLFeatureValueObservation],
        let topResult = results.first else {
            fatalError("unexpected result type from VNCoreMLRequest")
    }
    DispatchQueue.main.async { [weak self] in
        print("topResult!", topResult)
        //print(model.debugDescription.outputDescriptionsByName)
    }
}
let handler = VNImageRequestHandler(ciImage: image)

DispatchQueue.global(qos: .userInteractive).async {
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}

I get back a bunch of values:

topResult! Optional(<VNCoreMLFeatureValueObservation: 
0x1c003f0c0> C99BC0A0-7722-4DDC-8FB8-C0FEB1CEEFA5 1 "MultiArray : Double 43 
vector

[ 0.02323521859943867,0.03784361109137535,0.0327669121325016,0.02373981475830078,0.01920632272958755,0.01511944644153118,0.0268220379948616,0.00990589614957571,0.006585873663425446,0.02727104164659977,0.02337176166474819,0.0177282840013504,0.01582957617938519,0.01962342299520969,0.0335112139582634,0.01197215262800455,0.04638960584998131,0.0546870082616806,0.008390620350837708,0.02519697323441505,0.01038128975778818,0.02463733218610287,0.05725555866956711,0.02852404117584229,0.01987413503229618,0.02478211745619774,0.01224409975111485,0.03397252038121223,0.02300941571593285,0.02020683139562607,0.03740271925926208,0.01999092660844326,0.03210178017616272,0.02830206602811813,0.01122485008090734,0.01071082800626755,0.02285266295075417,0.01730070635676384,0.009790488518774509,0.01149104069918394,0.03331543132662773,0.01211327593773603,0.0193191897124052]" (1.000000))

If these are indeed softmax probabilities, how would I go about getting the index for max value? I can't seem to use .count or similar array methods.

I tried to cast it as an array, but neither of these worked:

let values = topResult.featureValue as Array!  // error: Can't convert...coercion
let values = topResult as Array!

If these are NOT softmax values/probabilities, then how would I go about getting the probability values? I'm trying to get the indices of the top 3 softmax probabilities.

Thank you.

Update:

I'm attempting this within the function:

var localPrediction: String?
let topResult = results.first?.featureValue.multiArrayValue

DispatchQueue.main.async {
    var max_value: Float32 = 0
    for i in 0..<topResult!.count {
        if max_value < topResult![i].floatValue {
            max_value = topResult![i].floatValue
            localPrediction = String(i)
        }
    }
}

Solution

  • When your model is a classifier, i.e. a NeuralNetworkClassifier in the mlmodel file, then the output is VNClassificationObservation objects.

    When your model is not a classifier, i.e. a NeuralNetwork or NeuralNetworkRegressor then the output is one or more VNCoreMLFeatureValueObservation objects that contain the output from your final layer.

    So if you expect softmax output in the VNCoreMLFeatureValueObservation then you need to make sure your model has a softmax as its final layer.

    To get the index and value of the maximum element, use:

    import Accelerate

    // Returns the index and value of the largest element using vDSP.
    func argmax(_ array: UnsafePointer<Double>, count: Int) -> (Int, Double) {
      var maxValue: Double = 0
      var maxIndex: vDSP_Length = 0
      vDSP_maxviD(array, 1, &maxValue, &maxIndex, vDSP_Length(count))
      return (Int(maxIndex), maxValue)
    }

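    To use this, first cast the MLMultiArray's dataPointer to an UnsafePointer<Double> and then call the argmax() function. Here features is the MLMultiArray taken from the observation's featureValue.multiArrayValue. The cast is only valid because this model's output type is Double:

    let featurePointer = UnsafePointer<Double>(OpaquePointer(features.dataPointer))
    let (maxIndex, maxValue) = argmax(featurePointer, count: features.count)  // count is 43 for this model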
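
The question also asks for the indices of the top 3 values, which a single argmax does not give. A minimal sketch, assuming the output is a one-dimensional array of scores: pair each value with its index, sort in descending order, and keep the first k entries. The names topK and doubles(from:) and the sample scores are hypothetical helpers for illustration, not part of Core ML or Vision. The conversion helper goes through MLMultiArray's NSNumber subscript, which is slower than reading dataPointer but avoids unsafe pointers.

```swift
#if canImport(CoreML)
import CoreML
#endif

/// Returns the `k` largest entries of `values` as (index, value) pairs,
/// ordered from largest to smallest. Hypothetical helper, not a Vision API.
func topK(_ values: [Double], k: Int) -> [(index: Int, value: Double)] {
    // Pair every value with its index, then sort by value, descending.
    let ranked = values.enumerated().sorted { $0.element > $1.element }
    // prefix() traps on negative input, so clamp k at zero.
    return ranked.prefix(max(0, k)).map { (index: $0.offset, value: $0.element) }
}

#if canImport(CoreML)
/// Copies a one-dimensional MLMultiArray into a Swift array of Double.
/// Assumes the array is a flat vector such as the 43-element output above.
func doubles(from multiArray: MLMultiArray) -> [Double] {
    return (0..<multiArray.count).map { multiArray[$0].doubleValue }
}
#endif

// Made-up scores standing in for the model output.
let scores = [0.05, 0.40, 0.10, 0.30, 0.15]
let top3 = topK(scores, k: 3)
for entry in top3 {
    print("class \(entry.index): \(entry.value)")
}
```

Inside the request's completion handler this could be called as topK(doubles(from: multiArray), k: 3), where multiArray is results.first?.featureValue.multiArrayValue after unwrapping. Sorting 43 values is cheap; for much larger outputs a partial selection would avoid the full sort.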