
How to use an mlmodel to make predictions on a UIImage?


I have been working on a project that involves detecting whether a person in an image is happy or sad. I am using a machine learning model for this purpose. I have already converted the Python model to a .mlmodel file and implemented it in the app. The model requires 48x48 grayscale images as input. I need help converting my UIImage into this format.

Link to the project:

https://github.com/LOLIPOP-INTELLIGENCE/happy_faces_v1

Any help will be appreciated! Thanks


Solution

  • An efficient way to resize and convert an image to grayscale is to use vImage. See Converting Color Images to Grayscale:

    For example:

    /*
     The Core Graphics image representation of the source asset.
     */
    let cgImage: CGImage = {
        guard let cgImage = #imageLiteral(resourceName: "image.jpg").cgImage else {
            fatalError("Unable to get CGImage")
        }
    
        return cgImage
    }()
    
    /*
     The format of the source asset.
     */
    lazy var format: vImage_CGImageFormat = {
        guard
            let sourceColorSpace = cgImage.colorSpace else {
                fatalError("Unable to get color space")
        }
    
        return vImage_CGImageFormat(
            bitsPerComponent: UInt32(cgImage.bitsPerComponent),
            bitsPerPixel: UInt32(cgImage.bitsPerPixel),
            colorSpace: Unmanaged.passRetained(sourceColorSpace),
            bitmapInfo: cgImage.bitmapInfo,
            version: 0,
            decode: nil,
            renderingIntent: cgImage.renderingIntent)
    }()
    
    /*
     The vImage buffer containing a scaled down copy of the source asset.
     */
    lazy var sourceBuffer: vImage_Buffer = {
        var sourceImageBuffer = vImage_Buffer()
    
        vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                     &format,
                                     nil,
                                     cgImage,
                                     vImage_Flags(kvImageNoFlags))
    
        var scaledBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&scaledBuffer,
                          48,
                          48,
                          format.bitsPerPixel,
                          vImage_Flags(kvImageNoFlags))
    
        vImageScale_ARGB8888(&sourceImageBuffer,
                             &scaledBuffer,
                             nil,
                             vImage_Flags(kvImageNoFlags))
    
        // The full-size source buffer is no longer needed once the scaled
        // copy exists; free its malloc'd pixel data to avoid a leak.
        free(sourceImageBuffer.data)
    
        return scaledBuffer
    }()
    
    /*
     The 1-channel, 8-bit vImage buffer used as the operation destination.
     */
    lazy var destinationBuffer: vImage_Buffer = {
        var destinationBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&destinationBuffer,
                          sourceBuffer.height,
                          sourceBuffer.width,
                          8,
                          vImage_Flags(kvImageNoFlags))
    
        return destinationBuffer
    }()
    

    Note that I changed Apple’s sample where it calls vImageBuffer_Init, forcing the 48×48 size that the model expects.

    And then:

    // Declare the three coefficients that model the eye's sensitivity
    // to color.
    let redCoefficient: Float = 0.2126
    let greenCoefficient: Float = 0.7152
    let blueCoefficient: Float = 0.0722
    
    // Create a 1D matrix containing the three luma coefficients that
    // specify the color-to-grayscale conversion.
    let divisor: Int32 = 0x1000
    let fDivisor = Float(divisor)
    
    var coefficientsMatrix = [
        Int16(redCoefficient * fDivisor),
        Int16(greenCoefficient * fDivisor),
        Int16(blueCoefficient * fDivisor)
    ]
    
    // Use the matrix of coefficients to compute the scalar luminance by
    // returning the dot product of each RGB pixel and the coefficients
    // matrix.
    let preBias: [Int16] = [0, 0, 0, 0]
    let postBias: Int32 = 0
    
    vImageMatrixMultiply_ARGB8888ToPlanar8(&sourceBuffer,
                                           &destinationBuffer,
                                           &coefficientsMatrix,
                                           divisor,
                                           preBias,
                                           postBias,
                                           vImage_Flags(kvImageNoFlags))
    
    // Create a 1-channel, 8-bit grayscale format that's used to
    // generate a displayable image.
    var monoFormat = vImage_CGImageFormat(
        bitsPerComponent: 8,
        bitsPerPixel: 8,
        colorSpace: Unmanaged.passRetained(CGColorSpaceCreateDeviceGray()),
        bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue),
        version: 0,
        decode: nil,
        renderingIntent: .defaultIntent)
    
    // Create a Core Graphics image from the grayscale destination buffer.
    let result = vImageCreateCGImageFromBuffer(
        &destinationBuffer,
        &monoFormat,
        nil,
        nil,
        vImage_Flags(kvImageNoFlags),
        nil)
    
    // Display the grayscale result.
    if let result = result {
        imageView.image = UIImage(cgImage: result.takeRetainedValue())
    }
    

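    One housekeeping note: the vImage buffers wrap manually allocated memory that ARC will not release for you. When you are finished with them (for example, in deinit, as Apple’s sample does), free the underlying data:

    // vImageBuffer_Init and vImageBuffer_InitWithCGImage allocate the
    // pixel data with malloc, so release it explicitly when done.
    free(sourceBuffer.data)
    free(destinationBuffer.data)
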
    Now, that assumes the original images were already square. If not, you could center-crop the image before you create your vImage_Buffer source:

    lazy var sourceBuffer: vImage_Buffer = {
        var sourceImageBuffer = vImage_Buffer()
    
        let width = min(cgImage.width, cgImage.height)
        let rect = CGRect(x: (cgImage.width - width) / 2,
                          y: (cgImage.height - width) / 2,
                          width: width,
                          height: width)
        let croppedImage = cgImage.cropping(to: rect)!
    
        vImageBuffer_InitWithCGImage(&sourceImageBuffer,
                                     &format,
                                     nil,
                                     croppedImage,
                                     vImage_Flags(kvImageNoFlags))
    
        var scaledBuffer = vImage_Buffer()
    
        vImageBuffer_Init(&scaledBuffer,
                          48,
                          48,
                          format.bitsPerPixel,
                          vImage_Flags(kvImageNoFlags))
    
        vImageScale_ARGB8888(&sourceImageBuffer,
                             &scaledBuffer,
                             nil,
                             vImage_Flags(kvImageNoFlags))
    
        // Free the cropped full-size buffer's malloc'd pixel data
        // now that the scaled copy exists.
        free(sourceImageBuffer.data)
    
        return scaledBuffer
    }()
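
    Finally, to answer the question in the title: Core ML’s generated interface takes image inputs as a CVPixelBuffer. Below is a rough sketch of wrapping the 48×48 grayscale destinationBuffer in a single-channel pixel buffer and running the prediction. The class name HappyFaces and the input name image are hypothetical; use whatever names Xcode generated from your .mlmodel:

    import Accelerate
    import CoreML
    import CoreVideo
    
    /*
     Wrap the 48×48 grayscale pixels in a CVPixelBuffer, which is what
     the generated Core ML interface expects for image inputs.
     */
    func makePixelBuffer(from buffer: vImage_Buffer) -> CVPixelBuffer? {
        var pixelBuffer: CVPixelBuffer?
        guard CVPixelBufferCreate(kCFAllocatorDefault,
                                  Int(buffer.width),
                                  Int(buffer.height),
                                  kCVPixelFormatType_OneComponent8,
                                  nil,
                                  &pixelBuffer) == kCVReturnSuccess,
            let result = pixelBuffer else {
                return nil
        }
    
        CVPixelBufferLockBaseAddress(result, [])
        defer { CVPixelBufferUnlockBaseAddress(result, []) }
    
        guard let destination = CVPixelBufferGetBaseAddress(result),
            let source = buffer.data else {
                return nil
        }
    
        // Copy row by row; the two buffers may use different row strides.
        let destinationRowBytes = CVPixelBufferGetBytesPerRow(result)
        for row in 0 ..< Int(buffer.height) {
            memcpy(destination + row * destinationRowBytes,
                   source + row * buffer.rowBytes,
                   Int(buffer.width))
        }
    
        return result
    }
    

    And then:

    if let input = makePixelBuffer(from: destinationBuffer) {
        do {
            // HappyFaces and its `image` input are placeholders; check the
            // interface Xcode auto-generates from your .mlmodel.
            let model = try HappyFaces(configuration: MLModelConfiguration())
            let output = try model.prediction(image: input)
            print(output) // e.g. the predicted label and class probabilities
        } catch {
            print("Prediction failed:", error)
        }
    }
    

    Alternatively, the Vision framework’s VNCoreMLRequest can handle the scaling and color conversion for you, at the cost of less control over how they are done.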