Tags: ios, swift, coreml, coremltools, mlmodel

Use first MLModel MLMultiArray output as second MLModel MLMultiArray Input


I have two Core ML MLModels (converted from .pb).
The first model outputs a Float32 3 × 512 × 512 MLMultiArray, which basically describes an image.
The second model input is a Float32 1 × 360 × 640 × 3 MLMultiArray, which is also an image but with a different size.

I know that, in theory, I could convert the second model's input to an image type, convert the first model's output to an image (post-prediction), resize it, and feed it to the second model, but that doesn't feel very efficient, and the models already introduce a significant delay, so I'm trying to improve performance.

Is it possible to "resize"/"reshape"/"transpose" the first model's output to match the second model's input? I'm using the helpers from https://github.com/hollance/CoreMLHelpers (by the amazing Matthijs Hollemans), but I don't really understand how to do this without damaging the data while keeping it as efficient as possible.

Thanks!


Solution

  • You don't have to turn them into images. Some options for using MLMultiArrays instead of images:

    • You could take the 512x512 output from the first model and chop off a portion to make it 360x512, and then pad the other dimension to make it 360x640. That's probably not what you want, but in case it is, you'll have to write the code for this yourself (a crop-and-pad sketch follows this list).

    • You can also resize the 512x512 output to 360x640 by hand. To do this you will need to implement a suitable resizing routine yourself (probably bilinear interpolation) or convert the data so you can use OpenCV or the vImage framework (a vImage-based sketch follows this list).

    • Let the model do the above for you. Add a ResizeBilinearLayer to the first model, followed by a PermuteLayer or TransposeLayer to change the order of the dimensions. Now the image will be resized to 360x640 pixels, and the output of the first model will be 1x360x640x3. This is easiest if you add these operations to the original model and then let coremltools convert them to the appropriate Core ML layers.
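
If you do want the crop-and-pad route, here is a minimal sketch of what that could look like. It assumes the first model's output is a dense Float32 MLMultiArray in channel-first order (3 x 512 x 512) and that the second model wants 1 x 360 x 640 x 3; it keeps the top 360 rows, zero-pads the width to 640, and interleaves the channels along the way. The function name and the hard-coded sizes are just for illustration.

```swift
import CoreML

// Hypothetical helper: crop 3 x 512 x 512 (CHW) down to 360 rows, zero-pad the
// width to 640, and interleave into a 1 x 360 x 640 x 3 (NHWC) MLMultiArray.
// Assumes the input is dense Float32 with standard C-contiguous strides.
func cropAndPad(_ input: MLMultiArray) throws -> MLMultiArray {
    let srcHeight = 512, srcWidth = 512
    let dstHeight = 360, dstWidth = 640
    let channels = 3

    let shape = [1, dstHeight, dstWidth, channels].map { NSNumber(value: $0) }
    let output = try MLMultiArray(shape: shape, dataType: .float32)

    let src = input.dataPointer.bindMemory(to: Float.self,
                                           capacity: channels * srcHeight * srcWidth)
    let dst = output.dataPointer.bindMemory(to: Float.self,
                                            capacity: dstHeight * dstWidth * channels)

    // Zero everything first; the columns we never copy into become the padding.
    // (A freshly created MLMultiArray is not guaranteed to contain zeros.)
    for i in 0..<(dstHeight * dstWidth * channels) { dst[i] = 0 }

    for c in 0..<channels {
        for y in 0..<dstHeight {            // keep only the first 360 rows
            for x in 0..<srcWidth {         // copy all 512 source columns
                let srcIndex = (c * srcHeight + y) * srcWidth + x   // CHW
                let dstIndex = (y * dstWidth + x) * channels + c    // NHWC
                dst[dstIndex] = src[srcIndex]
            }
        }
    }
    return output
}
```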
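
And here is a sketch of the resize-by-hand option using the vImage framework, under the same layout assumptions. Note that vImageScale_PlanarF applies vImage's own resampling filter rather than plain bilinear interpolation; each channel is scaled as a separate float plane and then interleaved into the NHWC output. Again, the function name is made up for this example.

```swift
import CoreML
import Accelerate

// Hypothetical helper: scale each of the three 512 x 512 float planes to
// 360 x 640 with vImage, then interleave them into a 1 x 360 x 640 x 3 (NHWC)
// MLMultiArray. Assumes the input is dense Float32 in CHW order.
func resizeToSecondModelInput(_ input: MLMultiArray) throws -> MLMultiArray {
    let srcHeight = 512, srcWidth = 512
    let dstHeight = 360, dstWidth = 640
    let channels = 3

    let shape = [1, dstHeight, dstWidth, channels].map { NSNumber(value: $0) }
    let output = try MLMultiArray(shape: shape, dataType: .float32)

    let src = input.dataPointer.bindMemory(to: Float.self,
                                           capacity: channels * srcHeight * srcWidth)
    let dst = output.dataPointer.bindMemory(to: Float.self,
                                            capacity: dstHeight * dstWidth * channels)

    // Scratch buffer that holds one resized channel (a single float plane).
    var plane = [Float](repeating: 0, count: dstHeight * dstWidth)

    for c in 0..<channels {
        // The source plane for channel c is a contiguous 512 x 512 block.
        var srcBuffer = vImage_Buffer(data: UnsafeMutableRawPointer(src + c * srcHeight * srcWidth),
                                      height: vImagePixelCount(srcHeight),
                                      width: vImagePixelCount(srcWidth),
                                      rowBytes: srcWidth * MemoryLayout<Float>.stride)

        plane.withUnsafeMutableBufferPointer { planePtr in
            var dstBuffer = vImage_Buffer(data: planePtr.baseAddress!,
                                          height: vImagePixelCount(dstHeight),
                                          width: vImagePixelCount(dstWidth),
                                          rowBytes: dstWidth * MemoryLayout<Float>.stride)
            // Scale a single float plane; vImage chooses its own resampling filter.
            let error = vImageScale_PlanarF(&srcBuffer, &dstBuffer, nil, vImage_Flags(kvImageNoFlags))
            precondition(error == kvImageNoError, "vImageScale_PlanarF failed: \(error)")
        }

        // Interleave the resized plane into NHWC: (y * width + x) * channels + c.
        for y in 0..<dstHeight {
            for x in 0..<dstWidth {
                dst[(y * dstWidth + x) * channels + c] = plane[y * dstWidth + x]
            }
        }
    }
    return output
}
```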