
How to use an mlmodel to make predictions on a UIImage?


I have been working on a project that involves detecting whether a person in an image is happy or sad. I am using a machine learning model for this purpose. I have already converted the Python model to a .mlmodel file and implemented it in the app. The model requires 48x48 grayscale images as input. I need help converting my UIImage into this format.

Link to the project:

https://github.com/LOLIPOP-INTELLIGENCE/happy_faces_v1

Any help will be appreciated! Thanks


Solution

  • An efficient way to resize and convert an image to grayscale is to use vImage. See Converting Color Images to Grayscale:

    For example:

    /*
     The Core Graphics image representation of the source asset.
     */
    let cgImage: CGImage = {
        guard let cgImage = #imageLiteral(resourceName: "image.jpg").cgImage else {
            fatalError("Unable to get CGImage")
        }
    
        return cgImage
    }()
    
    /*
     The format of the source asset.
     */
    lazy var format: vImage_CGImageFormat = {
        guard
            let sourceColorSpace = cgImage.colorSpace else {
                fatalError("Unable to get color space")
        }
    
        return vImage_CGImageFormat(
            bitsPerComponent: UInt32(cgImage.bitsPerComponent),
            bitsPerPixel: UInt32(cgImage.bitsPerPixel),
            colorSpace: Unmanaged.passRetained(sourceColorSpace),
            bitmapInfo: cgImage.bitmapInfo,
            version: 0,
            decode: nil,
            renderingIntent: cgImage.renderingIntent)
    }()
    
    /*
     The vImage buffer containing a scaled down copy of the source asset.
     */
    lazy var sourceBuffer: vImage_Buffer = {
        var sourceImageBuffer = vImage_Buffer()
    
        vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                     &format,
                                     nil,
                                     cgImage,
                                     vImage_Flags(kvImageNoFlags))
    
        var scaledBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&scaledBuffer,
                          48,
                          48,
                          format.bitsPerPixel,
                          vImage_Flags(kvImageNoFlags))
    
        vImageScale_ARGB8888(&sourceImageBuffer,
                             &scaledBuffer,
                             nil,
                             vImage_Flags(kvImageNoFlags))
    
        // The full-size source buffer is no longer needed once the scaled
        // copy exists; free its malloc'd pixel data to avoid a leak.
        free(sourceImageBuffer.data)
    
        return scaledBuffer
    }()
    
    /*
     The 1-channel, 8-bit vImage buffer used as the operation destination.
     */
    lazy var destinationBuffer: vImage_Buffer = {
        var destinationBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&destinationBuffer,
                          sourceBuffer.height,
                          sourceBuffer.width,
                          8,
                          vImage_Flags(kvImageNoFlags))
    
        return destinationBuffer
    }()
    

    Note that I changed Apple’s sample where it calls vImageBuffer_Init, forcing the 48×48 size that the model expects.

    And then:

    // Declare the three coefficients that model the eye's sensitivity
    // to color.
    let redCoefficient: Float = 0.2126
    let greenCoefficient: Float = 0.7152
    let blueCoefficient: Float = 0.0722
    
    // Create a 1D matrix containing the three luma coefficients that
    // specify the color-to-grayscale conversion.
    let divisor: Int32 = 0x1000
    let fDivisor = Float(divisor)
    
    var coefficientsMatrix = [
        Int16(redCoefficient * fDivisor),
        Int16(greenCoefficient * fDivisor),
        Int16(blueCoefficient * fDivisor)
    ]
    
    // Use the matrix of coefficients to compute the scalar luminance by
    // returning the dot product of each RGB pixel and the coefficients
    // matrix.
    let preBias: [Int16] = [0, 0, 0, 0]
    let postBias: Int32 = 0
    
    vImageMatrixMultiply_ARGB8888ToPlanar8(&sourceBuffer,
                                           &destinationBuffer,
                                           &coefficientsMatrix,
                                           divisor,
                                           preBias,
                                           postBias,
                                           vImage_Flags(kvImageNoFlags))
    
    // Create a 1-channel, 8-bit grayscale format that's used to
    // generate a displayable image.
    var monoFormat = vImage_CGImageFormat(
        bitsPerComponent: 8,
        bitsPerPixel: 8,
        colorSpace: Unmanaged.passRetained(CGColorSpaceCreateDeviceGray()),
        bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue),
        version: 0,
        decode: nil,
        renderingIntent: .defaultIntent)
    
    // Create a Core Graphics image from the grayscale destination buffer.
    let result = vImageCreateCGImageFromBuffer(
        &destinationBuffer,
        &monoFormat,
        nil,
        nil,
        vImage_Flags(kvImageNoFlags),
        nil)
    
    // Display the grayscale result.
    if let result = result {
        imageView.image = UIImage(cgImage: result.takeRetainedValue())
    }
    

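    One housekeeping note: the vImage buffers wrap manually allocated memory that ARC will not release for you. When you are finished with them (for example, in deinit, as Apple’s sample does), free the underlying data:

    // vImageBuffer_Init and vImageBuffer_InitWithCGImage allocate the
    // pixel data with malloc, so release it explicitly when done.
    free(sourceBuffer.data)
    free(destinationBuffer.data)
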
    Now, that assumes the original images were already square. If not, you could center-crop the image before you create your vImage_Buffer source:

    lazy var sourceBuffer: vImage_Buffer = {
        var sourceImageBuffer = vImage_Buffer()
    
        let width = min(cgImage.width, cgImage.height)
        let rect = CGRect(x: (cgImage.width - width) / 2,
                          y: (cgImage.height - width) / 2,
                          width: width,
                          height: width)
        let croppedImage = cgImage.cropping(to: rect)!
    
        vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                     &format,
                                     nil,
                                     croppedImage,
                                     vImage_Flags(kvImageNoFlags))
    
        var scaledBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&scaledBuffer,
                          48,
                          48,
                          format.bitsPerPixel,
                          vImage_Flags(kvImageNoFlags))
    
        vImageScale_ARGB8888(&sourceImageBuffer,
                             &scaledBuffer,
                             nil,
                             vImage_Flags(kvImageNoFlags))
    
        // Free the cropped full-size buffer's malloc'd pixel data
        // now that the scaled copy exists.
        free(sourceImageBuffer.data)
    
        return scaledBuffer
    }()
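
    Finally, to answer the question in the title: Core ML’s generated interface takes image inputs as a CVPixelBuffer. Below is a rough sketch of wrapping the 48×48 grayscale destinationBuffer in a single-channel pixel buffer and running the prediction. The class name HappyFaces and the input name image are hypothetical; use whatever names Xcode generated from your .mlmodel:

    import Accelerate
    import CoreML
    import CoreVideo
    
    /*
     Wrap the 48×48 grayscale pixels in a CVPixelBuffer, which is what
     the generated Core ML interface expects for image inputs.
     */
    func makePixelBuffer(from buffer: vImage_Buffer) -> CVPixelBuffer? {
        var pixelBuffer: CVPixelBuffer?
        guard CVPixelBufferCreate(kCFAllocatorDefault,
                                  Int(buffer.width),
                                  Int(buffer.height),
                                  kCVPixelFormatType_OneComponent8,
                                  nil,
                                  &pixelBuffer) == kCVReturnSuccess,
            let result = pixelBuffer else {
                return nil
        }
    
        CVPixelBufferLockBaseAddress(result, [])
        defer { CVPixelBufferUnlockBaseAddress(result, []) }
    
        guard let destination = CVPixelBufferGetBaseAddress(result),
            let source = buffer.data else {
                return nil
        }
    
        // Copy row by row; the two buffers may use different row strides.
        let destinationRowBytes = CVPixelBufferGetBytesPerRow(result)
        for row in 0 ..< Int(buffer.height) {
            memcpy(destination + row * destinationRowBytes,
                   source + row * buffer.rowBytes,
                   Int(buffer.width))
        }
    
        return result
    }
    

    And then:

    if let input = makePixelBuffer(from: destinationBuffer) {
        do {
            // HappyFaces and its `image` input are placeholders; check the
            // interface Xcode auto-generates from your .mlmodel.
            let model = try HappyFaces(configuration: MLModelConfiguration())
            let output = try model.prediction(image: input)
            print(output) // e.g. the predicted label and class probabilities
        } catch {
            print("Prediction failed:", error)
        }
    }
    

    Alternatively, the Vision framework’s VNCoreMLRequest can handle the scaling and color conversion for you, at the cost of less control over how they are done.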