Tags: coreml, coremltools, apple-vision, mlmodel

Can VNImageRequestHandler accept an MLMultiArray as input? (Without converting to UIImage)


I have two MLModels in my app. The first one generates an MLMultiArray output that is meant to be used as the second model's input.

Since I'm trying to make this as performant as possible, I was thinking about feeding VNImageRequestHandler with the first model's output (the MLMultiArray) directly and using Vision's resizing and regionOfInterest cropping, instead of converting that output to an image myself, cropping it manually, and using the regular image initializer.

Something like this:

    let request = VNCoreMLRequest(model: mlModel) { request, error in
        // handle logic?
    }

    request.regionOfInterest = // my region

    let handler = VNImageRequestHandler(multiArray: myFirstModelOutputMultiArray)

Or do I have to go through back-and-forth conversions? I'm trying to reduce processing delays.


Solution

  • Vision uses images (hence the name ;-) ). If you don't want to use images, you need to use the Core ML API directly, as in the first sketch below.

    If the output from the first model really is an image, it's easiest to change that model's output type to an image so that you get a CVPixelBuffer instead of an MLMultiArray. Then you can pass that CVPixelBuffer directly into the next model using Vision, as in the second sketch below.
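
    A minimal sketch of the direct Core ML route, assuming secondModel is the MLModel for your second model and that its input and output features are named "input" and "output" (placeholders; substitute the names your model actually declares):

        import CoreML

        do {
            // Wrap the first model's MLMultiArray output in a feature provider.
            // "input" is a placeholder for the second model's real input name.
            let features = try MLDictionaryFeatureProvider(
                dictionary: ["input": myFirstModelOutputMultiArray])

            // Run the second model directly with the Core ML API, skipping Vision.
            let prediction = try secondModel.prediction(from: features)

            // "output" is likewise a placeholder for the real output feature name.
            let result = prediction.featureValue(for: "output")?.multiArrayValue
        } catch {
            // handle prediction errors
        }

    Note that with this route you lose Vision's automatic scaling and regionOfInterest cropping, so any resizing or cropping of the MLMultiArray has to be done by hand (or baked into the model itself).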
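
    And if you do change the first model's output type to an image, the Vision side could look like this sketch, where pixelBuffer stands for the CVPixelBuffer produced by the first model and the regionOfInterest value is an arbitrary placeholder:

        import Vision

        do {
            // Wrap the second model for Vision; secondModel is an MLModel.
            let visionModel = try VNCoreMLModel(for: secondModel)

            let request = VNCoreMLRequest(model: visionModel) { request, error in
                // handle results here
            }
            // Let Vision do the resizing and cropping before inference.
            request.imageCropAndScaleOption = .scaleFill
            request.regionOfInterest = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5)

            // VNImageRequestHandler takes a CVPixelBuffer directly, so no
            // UIImage round trip is needed once the first model emits an image.
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
            try handler.perform([request])
        } catch {
            // handle Vision errors
        }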