I am working on a simple denoising POC in SwiftUI where I want to load an image, run it through a Core ML denoising model, and display the result.
I have something working, based on dozens of code samples I found online. From what I've read, a Core ML model (at least the one I'm using) accepts a CVPixelBuffer as input and also outputs a CVPixelBuffer. So my idea was to do the following: (1) convert the input UIImage into a CVPixelBuffer, (2) run the model on that buffer, and (3) convert the output CVPixelBuffer back into a UIImage.
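To make those three steps concrete, the intended pipeline would look roughly like this. This is only a sketch: DenoiseModel stands in for the Xcode-generated class of my model, its "output" feature name is a placeholder, and pixelBuffer(from:) / image(from:) are the conversion helpers this question is about.

import CoreML
import UIKit

// Intended pipeline (sketch only). DenoiseModel and result.output are
// placeholder names; the two helpers are the conversions discussed below.
func denoise(_ input: UIImage) -> UIImage? {
    guard let buffer = pixelBuffer(from: input),                        // step (1)
          let model = try? DenoiseModel(configuration: MLModelConfiguration()),
          let result = try? model.prediction(input: buffer)             // step (2)
    else { return nil }
    return image(from: result.output)                                   // step (3)
}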
(Note that I've read that, using the Vision framework, one can feed a CGImage directly into the model. I'll try that approach once I'm comfortable with what I'm doing here, as I think this is a good exercise.)
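From what I've read, that Vision route would look roughly like this (an untested sketch on my part; Vision takes care of the CGImage-to-CVPixelBuffer conversion on the input side):

import Vision
import CoreML

// Untested sketch: Vision wraps the Core ML model and converts the
// CGImage input itself. Image-to-image models come back as
// VNPixelBufferObservation.
func denoiseWithVision(_ cgImage: CGImage, mlModel: MLModel) throws -> CVPixelBuffer? {
    let model = try VNCoreMLModel(for: mlModel)
    let request = VNCoreMLRequest(model: model)
    request.imageCropAndScaleOption = .scaleFill
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
    return (request.results?.first as? VNPixelBufferObservation)?.pixelBuffer
}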
As a start, I wanted to skip step (2) to focus on the conversion problem. What I tried to achieve in the code below is to convert a UIImage into a CVPixelBuffer, and then to convert that CVPixelBuffer straight back into a UIImage, without running the model in between.
I'm not a Swift or Objective-C developer, so I'm pretty sure that I've made at least a few mistakes. I found this code quite complex, and I was wondering if there was a better or simpler way to do the same thing?
import UIKit
import CoreImage
import CoreVideo

func convert(input: UIImage) -> UIImage? {
    // Input CGImage
    guard let cgInput = input.cgImage else {
        return nil
    }
    // Image size
    let width = cgInput.width
    let height = cgInput.height
    let region = CGRect(x: 0, y: 0, width: width, height: height)
    // Attributes needed to create the CVPixelBuffer
    let attributes = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                      kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
    // Create the input CVPixelBuffer
    var pbInput: CVPixelBuffer? = nil
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     width,
                                     height,
                                     kCVPixelFormatType_32ARGB,
                                     attributes as CFDictionary,
                                     &pbInput)
    // Sanity check
    guard status == kCVReturnSuccess, let pixelBuffer = pbInput else {
        return nil
    }
    // Fill the input CVPixelBuffer with the content of the input CGImage.
    // The defer balances the lock even if the CGContext guard below fails.
    CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) }
    // Use the buffer's own bytes-per-row (it may be padded) and
    // 8 bits per component to match kCVPixelFormatType_32ARGB.
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: width,
                                  height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: cgInput.colorSpace!,
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
        return nil
    }
    context.draw(cgInput, in: region)
    // Create the output CGImage
    let ciOutput = CIImage(cvPixelBuffer: pixelBuffer)
    let temporaryContext = CIContext(options: nil)
    guard let cgOutput = temporaryContext.createCGImage(ciOutput, from: region) else {
        return nil
    }
    // Create and return the output UIImage
    return UIImage(cgImage: cgOutput)
}
When I used this code in my SwiftUI project, the input and output images looked the same, but they were not identical. I think the input image had a color profile (ColorSync profile) associated with it that was lost during the conversion. I assumed I was supposed to use cgInput.colorSpace during the CGContext creation, but it seemed that using CGColorSpace(name: CGColorSpace.sRGB)! worked better, as in the variant sketched below. Can somebody please explain that to me?
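Concretely, the variant that looked better just replaces the space: argument in the CGContext call above with CGColorSpace(name: CGColorSpace.sRGB)!. A related experiment (my own sketch, not something from the docs) is to pin the Core Image render back to sRGB as well, so the whole round trip uses one known profile:

import CoreImage
import CoreVideo

// Sketch: render the buffer back through Core Image with an explicit
// sRGB working/output color space, instead of relying on the input
// image's embedded profile.
func imageInSRGB(from pixelBuffer: CVPixelBuffer) -> CGImage? {
    let srgb = CGColorSpace(name: CGColorSpace.sRGB)!
    let ciContext = CIContext(options: [.workingColorSpace: srgb,
                                        .outputColorSpace: srgb])
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    return ciContext.createCGImage(ciImage, from: ciImage.extent,
                                   format: .ARGB8, colorSpace: srgb)
}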
Thanks for your help.
You can also use CGImage objects with Core ML, but you have to create the MLFeatureValue object by hand and then put it into an MLFeatureProvider to give it to the model. But that only takes care of the model input, not the output.
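If you want to go that route, the input side looks roughly like this (a sketch; I'm assuming the model's input feature is named "image", so check your model's description for the real name):

import CoreML

// Sketch: build the MLFeatureValue from a CGImage by hand (iOS 13+).
func predict(cgImage: CGImage, model: MLModel) throws -> MLFeatureProvider {
    // "image" is an assumed input feature name; look up the real one in
    // model.modelDescription.inputDescriptionsByName.
    let constraint = model.modelDescription
        .inputDescriptionsByName["image"]!.imageConstraint!
    let value = try MLFeatureValue(cgImage: cgImage,
                                   constraint: constraint,
                                   options: nil)
    let inputs = try MLDictionaryFeatureProvider(dictionary: ["image": value])
    return try model.prediction(from: inputs)
}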
Another option is to use the code from my CoreMLHelpers repo.