
Feed multiple images to CoreML image classification model (swift)


I know how to use the Core ML library to train and use a model. However, I was wondering if it's possible to feed the model more than one image so that it can identify the subject with better accuracy.

The reason for this is that I'm trying to build an app that classifies histological slides. Many of them look quite similar, so I thought I could feed the model images at different magnifications to improve the identification. Is this possible?

Thank you, Mehdi


Solution

  • Yes, this is a common technique. You can give Core ML the images at different scales or use different crops from the same larger image.

    A typical approach is to take 4 corner crops and 1 center crop, and also horizontally flip these, so you have 10 images total. Then feed these to Core ML as a batch. (Maybe in your case it makes sense to also vertically flip the crops.)
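As a sketch, feeding a batch to a classifier could look like the following. This assumes an Xcode-generated model class named `SlideClassifier` (a placeholder for whatever your `.mlmodel` generates) whose image input is called `image`; adjust the names to match your model:

```swift
import CoreML

// Run one batch prediction over all crops at once.
// `SlideClassifier` / `SlideClassifierInput` are the classes Xcode
// generates from your .mlmodel file — rename to match yours.
func classify(crops: [CVPixelBuffer]) throws -> [SlideClassifierOutput] {
    let model = try SlideClassifier(configuration: MLModelConfiguration())
    let inputs = crops.map { SlideClassifierInput(image: $0) }
    // predictions(inputs:) is the batch API on generated model classes;
    // Core ML can evaluate the whole batch more efficiently than a loop.
    return try model.predictions(inputs: inputs)
}
```

Each element of the returned array is the prediction for the corresponding crop.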

    To get the final prediction, take the average of the predicted probabilities for all images.
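The averaging step can be done in plain Swift. Classifier models expose a `[String: Double]` probability dictionary per prediction (e.g. `classLabelProbs` on generated classes); a hypothetical helper that averages these across crops might look like:

```swift
// Average per-class probabilities across all crop predictions.
// Each dictionary maps a class label to its predicted probability.
func averageProbabilities(_ predictions: [[String: Double]]) -> [String: Double] {
    var sums: [String: Double] = [:]
    for dict in predictions {
        for (label, p) in dict {
            sums[label, default: 0] += p
        }
    }
    let n = Double(predictions.count)
    return sums.mapValues { $0 / n }
}
```

The label with the highest averaged probability is then your final classification.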

    Note that in order to use images at different sizes, the model must be configured to support "size flexibility". And it must also be trained on images of different sizes to get good results.