I've created a .mlmodel file from a custom PyTorch CNN by converting the PyTorch model first to ONNX and then to Core ML using onnx_coreml. Using dummy data (a 3 x 224 x 224 array where every single value is 1.0), I've verified that the PyTorch model, the ONNX model (run with the Caffe2 backend), and the Core ML model (run with coremltools) all yield identical results.
However, when I import the same model into Xcode and run it on a phone, even using dummy data, the model outputs do not match up.
The device I'm using doesn't seem to make a difference (I've tried iPhones ranging from the XS Max all the way down to an SE). All are running iOS 12.2, and I'm using Xcode 10.2.1.
Here's the code (in Swift) I'm using to create the dummy data and get a prediction from my model:
// newImg is a 224 x 224 image; here it's only used for its dimensions,
// since every value of the dummy input is set to 1.0.
let pixelsWide = Int(newImg.size.width)
let pixelsHigh = Int(newImg.size.height)

// Input shape matches the converted model: [1, 1, 3, 224, 224]
let pixelMLArray = try MLMultiArray(shape: [1, 1, 3, 224, 224], dataType: .float32)
for y in 0 ..< pixelsHigh {
    for x in 0 ..< pixelsWide {
        // Write 1.0 into all three channels at this position
        pixelMLArray[[0, 0, 0, x, y] as [NSNumber]] = 1.0
        pixelMLArray[[0, 0, 1, x, y] as [NSNumber]] = 1.0
        pixelMLArray[[0, 0, 2, x, y] as [NSNumber]] = 1.0
    }
}

do {
    let convModel = CNNModel()
    let thisConvOutput = try convModel.prediction(_0: pixelMLArray)._1161
} catch {
    print("Error")
}
I've verified that the input and output tags are correct, etc. etc. This runs smoothly, but the first three values of thisConvOutput are: [0.000139, 0.000219, 0.003607]
For comparison, the first three values from the PyTorch model are: [0.0002148, 0.00032246, 0.0035419]
And the exact same .mlmodel using coremltools: [0.00021577, 0.00031877, 0.0035404]
Long story short, not being experienced with Swift, I'm wondering whether I'm doing something stupid in initializing / populating my "pixelMLArray" to run it through the model in Xcode on my device, since the .mlmodel results from coremltools are extremely close to the results I get using PyTorch. Can anyone help?
Your Core ML output on device: [0.000139, 0.000219, 0.003607]
Your output from coremltools: [0.00021577, 0.00031877, 0.0035404]
Note that these are very small numbers. When Core ML runs your model on the GPU (and possibly on the Neural Engine, I'm not sure) it uses 16-bit floating point, which has much lower precision than 32-bit floating point.
Note how 0.000139 and 0.00021577 are not the same number, but both are around 1e-4, which is down near the precision limit of 16-bit floats. In contrast, 0.003607 and 0.0035404 are almost the same number, because they're about 10x larger and therefore don't lose as much precision.
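To put a rough number on the precision gap: half precision has a 10-bit fraction (about 3 significant decimal digits), while single precision has a 23-bit fraction (about 7). A tiny sketch of the corresponding machine epsilons, purely to illustrate the scale:

// Machine epsilon = spacing between 1.0 and the next representable value.
let halfEpsilon = 1.0 / 1024.0              // float16: 2^-10 ≈ 0.00098  (~3 decimal digits)
let singleEpsilon = Double(Float.ulpOfOne)  // float32: 2^-23 ≈ 0.00000012 (~7 decimal digits)
print(halfEpsilon, singleEpsilon)

Rounding steps on the order of 1e-3 are comparable in size to outputs around 1e-4, which is why those values diverge so much more than the ~3.5e-3 one.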
Try running your Core ML model on the device using the CPU (you can pass an option for this when instantiating your model). You'll probably see that you now get results that are much closer (and probably identical) to the coremltools version, because Core ML on the CPU uses 32-bit floats.
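Here's a minimal sketch of what that could look like using the generated CNNModel class from your question. The resource name "CNNModel" and the exact generated initializers are assumptions here; they depend on how your model file is named and on your Xcode version.

import CoreML

do {
    // Ask Core ML to run everything on the CPU, which uses 32-bit floats.
    let config = MLModelConfiguration()
    config.computeUnits = .cpuOnly

    // Load the compiled model from the app bundle with that configuration.
    guard let modelURL = Bundle.main.url(forResource: "CNNModel",
                                         withExtension: "mlmodelc") else {
        fatalError("CNNModel.mlmodelc not found in the app bundle")
    }
    let convModel = try CNNModel(contentsOf: modelURL, configuration: config)

    // Same dummy input as before; the output should now match coremltools much more closely.
    let thisConvOutput = try convModel.prediction(_0: pixelMLArray)._1161
    print(thisConvOutput[0], thisConvOutput[1], thisConvOutput[2])
} catch {
    print("Prediction failed: \(error)")
}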
Conclusion: from what you've shown so far, it looks like your model is working as expected, once you take into account the precision lost by doing the computations in 16-bit floating point.