I'm currently trying to do some calculations on a CIImage. We are running a custom Core ML model on video frames and, in the meantime, using the GPU to convert the frames to the required formats with CIFilters.
For one step, I need to do some calculations on two of the outputs generated by the model and find the mean and standard deviation of the pixel data per channel.
For testing and a tech preview, I was able to create a UIImage, read its pixel data, and convert and calculate on the CPU. But while trying to adapt it to the GPU I hit a wall.
The process is simple:
Currently, I am using a custom CIFilter + Metal kernel to convert the CIImage from RGB to LAB (and back to RGB). Without any calculations in between, the RGB > LAB > RGB conversion works as expected and returns the same image without any deformation. This tells me that float precision is not lost.
But when I try to read the pixel data in between, I'm not able to get the float values I'm looking for. The CVPixelBuffer created from the LAB-formatted CIImage gives me values that are always zero. I tried a few different OSType formats like kCVPixelFormatType_64RGBAHalf, kCVPixelFormatType_128RGBAFloat, kCVPixelFormatType_32ARGB, etc., but none of them return the float values. Yet if I read data from another image, I always get the UInt8 values as expected...
So my question is, as the title suggests: "How can I read the CVPixelBuffer as a 4-channel float format from a CIImage?"
Simplified Swift and Metal code for the process is as follows.
let ciRgbToLab = CIConvertRGBToLAB() // CIFilter using a Metal kernel
let ciLabToRgb = CIConvertLABToRGB() // CIFilter using a Metal kernel

ciRgbToLab.inputImage = source // "source" is a CIImage
guard let sourceLab = ciRgbToLab.outputImage else { throw ... }

ciRgbToLab.inputImage = target // "target" is a CIImage
guard let targetLab = ciRgbToLab.outputImage else { throw ... }

// Get the CVPixelBuffer and lock the data.
guard let sourceBuffer = sourceLab.cvPixelBuffer else { throw ... }
CVPixelBufferLockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
defer {
    CVPixelBufferUnlockBaseAddress(sourceBuffer, CVPixelBufferLockFlags(rawValue: 0))
}

// Access the data.
guard let sourceAddress = CVPixelBufferGetBaseAddress(sourceBuffer) else { throw ... }
let sourceDataSize = CVPixelBufferGetDataSize(sourceBuffer)
// Float pixel formats store 32-bit values, so bind to Float (not CGFloat, which is 64-bit).
let sourceData = sourceAddress.bindMemory(to: Float.self, capacity: sourceDataSize / MemoryLayout<Float>.stride)

// ... do calculations
// ... generates a new CIImage named "targetTransfered"

ciLabToRgb.inputImage = targetTransfered //*
guard let rgbFinal = ciLabToRgb.outputImage else { throw ... }

//* If "targetTransfered" is replaced with "targetLab", we get the exact same image as "target".
#include <metal_stdlib>
using namespace metal;
#include <CoreImage/CoreImage.h>

extern "C" {
    namespace coreimage {
        float4 xyzToLabConversion(float4 pixel) {
            ...
            return float4(l, a, b, pixel.a);
        }

        float4 rgbToXyzConversion(float4 pixel) {
            ...
            return float4(x, y, z, pixel.a);
        }

        // Kernel entry point: RGB -> XYZ -> LAB
        float4 rgbToLab(sample_t s) {
            float4 xyz = rgbToXyzConversion(s);
            float4 lab = xyzToLabConversion(xyz);
            return lab;
        }

        float4 xyzToRgbConversion(float4 pixel) {
            ...
            return float4(R, G, B, pixel.a);
        }

        float4 labToXyzConversion(float4 pixel) {
            ...
            return float4(X, Y, Z, pixel.a);
        }

        // Kernel entry point: LAB -> XYZ -> RGB
        float4 labToRgb(sample_t s) {
            float4 xyz = labToXyzConversion(s);
            float4 rgb = xyzToRgbConversion(xyz);
            return rgb;
        }
    }
}
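For reference, the CIConvertRGBToLAB / CIConvertLABToRGB filters wrap these kernels roughly like this (a simplified sketch; it assumes the kernels are compiled into the app's default metallib):
import CoreImage

class CIConvertRGBToLAB: CIFilter {
    var inputImage: CIImage?

    static let kernel: CIColorKernel = {
        // Load the compiled Metal library containing the kernels above.
        let url = Bundle.main.url(forResource: "default", withExtension: "metallib")!
        let data = try! Data(contentsOf: url)
        return try! CIColorKernel(functionName: "rgbToLab", fromMetalLibraryData: data)
    }()

    override var outputImage: CIImage? {
        guard let input = inputImage else { return nil }
        return Self.kernel.apply(extent: input.extent, arguments: [input])
    }
}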
This is the extension I'm using to convert a CIImage to a CVPixelBuffer. As the image is created on-device from the same source, it is always in BGRA format. I have no idea how to change this to get float values...
extension CIImage {
    var cvPixelBuffer: CVPixelBuffer? {
        let attrs = [
            kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
            kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue,
            kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue
        ] as CFDictionary

        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(self.extent.width),
                                         Int(self.extent.height),
                                         kCVPixelFormatType_32BGRA, // note: hard-coded 8-bit BGRA
                                         attrs,
                                         &pixelBuffer)
        guard status == kCVReturnSuccess else { return nil }
        guard let buffer = pixelBuffer else { return nil }

        CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
        let context = CIContext()
        context.render(self, to: buffer)
        CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
        return buffer
    }
}
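For what it's worth, here is a sketch of how the same extension could be parameterized to render into a float buffer instead (illustrative only; the cvPixelBuffer(format:) name is my own, and it assumes the requested format is supported on the device):
import CoreImage
import CoreVideo

extension CIImage {
    /// Illustrative variant: renders into a buffer of the given format,
    /// e.g. kCVPixelFormatType_128RGBAFloat, instead of hard-coded 32BGRA.
    func cvPixelBuffer(format: OSType) -> CVPixelBuffer? {
        let attrs = [kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(extent.width),
                                         Int(extent.height),
                                         format,
                                         attrs,
                                         &pixelBuffer)
        guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }
        // Use a full-float working format so precision is preserved through the render.
        let context = CIContext(options: [.workingFormat: CIFormat.RGBAf])
        context.render(self, to: buffer)
        return buffer
    }
}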
PS: I trimmed the Metal kernel code to fit in here. If you need an RGB > LAB > RGB conversion, send me a message; I'm happy to share the filter.
It's very strange that you get all zeros, especially when you set the format to kCVPixelFormatType_128RGBAFloat.
...
However, I highly recommend you check out CIImageProcessorKernel; it's made for this very use case: adding custom (potentially CPU-based) processing steps to a Core Image pipeline. In the process function you get access to the input and output buffers either as a MTLTexture, a CVPixelBuffer, or even via direct access to the baseAddress.
Here is an example kernel I wrote for computing the mean and variance of the input image using Metal Performance Shaders and returning them in a 2x1 pixel CIImage:
import CoreImage
import MetalPerformanceShaders

/// Processing kernel that computes the mean and the variance of a given image and stores
/// those values in a 2x1 pixel return image.
class MeanVarianceKernel: CIImageProcessorKernel {

    override class func roi(forInput input: Int32, arguments: [String: Any]?, outputRect: CGRect) -> CGRect {
        // we need to read the full extent of the input
        return arguments?["inputExtent"] as? CGRect ?? outputRect
    }

    override class var outputFormat: CIFormat {
        return .RGBAf
    }

    override class var synchronizeInputs: Bool {
        // no need to wait for CPU synchronization since the processing is also happening on the GPU
        return false
    }

    /// Convenience method for calling the `apply` method from outside.
    class func apply(to input: CIImage) -> CIImage {
        // pass the extent of the input as an argument since we need to know the full extent in the ROI callback above
        return try! self.apply(withExtent: CGRect(x: 0, y: 0, width: 2, height: 1), inputs: [input], arguments: ["inputExtent": input.extent])
    }

    override class func process(with inputs: [CIImageProcessorInput]?, arguments: [String: Any]?, output: CIImageProcessorOutput) throws {
        guard
            let commandBuffer = output.metalCommandBuffer,
            let input = inputs?.first,
            let sourceTexture = input.metalTexture,
            let destinationTexture = output.metalTexture
        else {
            return
        }
        // MPS writes the mean to pixel (0, 0) and the variance to pixel (1, 0)
        let meanVarianceShader = MPSImageStatisticsMeanAndVariance(device: commandBuffer.device)
        meanVarianceShader.encode(commandBuffer: commandBuffer, sourceTexture: sourceTexture, destinationTexture: destinationTexture)
    }
}
It can easily be added to a filter pipeline like this:
let meanVariance: CIImage = MeanVarianceKernel.apply(to: inputImage)
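To get the two values back to the CPU, you could then render the 2x1 result into a small float bitmap, for example (a sketch; per the MPS documentation, the first pixel holds the mean and the second the variance):
let context = CIContext()
var values = [Float](repeating: 0, count: 8) // 2 pixels x 4 channels
values.withUnsafeMutableBytes { ptr in
    context.render(meanVariance,
                   toBitmap: ptr.baseAddress!,
                   rowBytes: 2 * 4 * MemoryLayout<Float>.stride,
                   bounds: CGRect(x: 0, y: 0, width: 2, height: 1),
                   format: .RGBAf,
                   colorSpace: nil) // nil skips color matching
}
let mean = Array(values[0..<4])     // per-channel mean
let variance = Array(values[4..<8]) // per-channel variance
The standard deviation you're after is then just the square root of the variance per channel.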