Tags: ios, objective-c, swift, image, core-media

How to extract pixel data for processing from CMSampleBuffer using Swift in iOS 9?


I am writing an app in Swift which employs the Scandit barcode scanning SDK. The SDK permits you to access camera frames directly and provides the frame as a CMSampleBuffer. They provide documentation in Objective-C, which I am having trouble getting to work in Swift. I do not know if the problem is in porting the code, or if there is something amiss with the sample buffer itself, perhaps due to a change in Core Media since their documentation was generated.

Their API exposes the frame as follows (Objective-C):

@interface YourViewController () <SBSProcessFrameDelegate>
...
- (void)barcodePicker:(SBSBarcodePicker*)barcodePicker
      didProcessFrame:(CMSampleBufferRef)frame
              session:(SBSScanSession*)session {
    // Process the frame yourself.
}

Building from several answers here on SO, I attempt to process the frame with:

let imageBuffer = CMSampleBufferGetImageBuffer(frame)!
CVPixelBufferLockBaseAddress(imageBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)

let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)

let colorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.NoneSkipFirst.rawValue | CGBitmapInfo.ByteOrder32Little.rawValue)
let context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, bitmapInfo.rawValue)

let quartzImage = CGBitmapContextCreateImage(context)
CVPixelBufferUnlockBaseAddress(imageBuffer,0)

let image = UIImage(CGImage: quartzImage!)

But this fails with:

Jan 29 09:01:30  Scandit[1308] <Error>: CGBitmapContextCreate: invalid data bytes/row: should be at least 7680 for 8 integer bits/component, 3 components, kCGImageAlphaNoneSkipFirst.
Jan 29 09:01:30  Scandit[1308] <Error>: CGBitmapContextCreateImage: invalid context 0x0. If you want to see the backtrace, please set CG_CONTEXT_SHOW_BACKTRACE environmental variable.
fatal error: unexpectedly found nil while unwrapping an Optional value

The fatal error occurs when force-unwrapping quartzImage to create the UIImage, because the bitmap context was never created.

The width, height, and bytesPerRow are (at the base address):

Width: 1920
Height: 1080
Bytes per row: 2904

As passed from the delegate, here is what the buffer contains according to CMSampleBufferGetFormatDescription(frame):

Optional(<CMVideoFormatDescription 0x1447dafa0 [0x1a1864b68]> {
    mediaType:'vide' 
    mediaSubType:'420f' 
    mediaSpecific: {
        codecType: '420f'       dimensions: 1920 x 1080 
    } 
    extensions: {<CFBasicHash 0x1447dba10 [0x1a1864b68]>{type = immutable dict, count = 6,
entries =>
    0 : <CFString 0x19d28b678 [0x1a1864b68]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x19d28b6b8 [0x1a1864b68]>{contents = "ITU_R_601_4"}
    1 : <CFString 0x19d28b7d8 [0x1a1864b68]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x19d28b698 [0x1a1864b68]>{contents = "ITU_R_709_2"}
    2 : <CFString 0x19d2b65c0 [0x1a1864b68]>{contents = "CVBytesPerRow"} = <CFNumber 0xb00000000000b582 [0x1a1864b68]>{value = +2904, type = kCFNumberSInt32Type}
    3 : <CFString 0x19d2b6640 [0x1a1864b68]>{contents = "Version"} = <CFNumber 0xb000000000000022 [0x1a1864b68]>{value = +2, type = kCFNumberSInt32Type}
    5 : <CFString 0x19d28b758 [0x1a1864b68]>{contents = "CVImageBufferColorPrimaries"} = <CFString 0x19d28b698 [0x1a1864b68]>{contents = "ITU_R_709_2"}
    6 : <CFString 0x19d28b818 [0x1a1864b68]>{contents = "CVImageBufferChromaLocationTopField"} = <CFString 0x19d28b878 [0x1a1864b68]>{contents = "Center"}
}
}
})

I realize there may be multiple "planes" here, but querying the bytes per row of each plane:

let pixelBufferBytesPerRow0 = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0)
let pixelBufferBytesPerRow1 = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 1)

gives values that do not match the 2904 reported above:

Pixel buffer bytes per row (Plane 0): 1920
Pixel buffer bytes per row (Plane 1): 1920

I don't understand that discrepancy.

I also attempted to process each pixel individually, as it is clear the buffer contains some manner of YCbCr, but it fails every way I have tried. The Scandit API suggests (Objective-C):

// Get the buffer info for the YCbCrBiPlanar format.
void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
CVPlanarPixelBufferInfo_YCbCrBiPlanar *bufferInfo = (CVPlanarPixelBufferInfo_YCbCrBiPlanar *)baseAddress;

But I cannot find a Swift implementation that permits access to the buffer info using CVPlanarPixelBufferInfo. Everything I have tried fails, so I am unable to determine the offsets for "Y", "Cr", etc.

How can I access the pixel data in the buffer? Is this a problem with the CMSampleBuffer the SDK is passing, a problem with iOS 9, or both?


Solution

  • This is not a complete answer, just some hints:

    Scandit uses the YCbCrBiPlanar format. It has a Y byte for each pixel and a Cb and a Cr byte for each group of 2x2 pixels. The Y values are on the first plane, the Cb and Cr values on the second plane.

    If the image is w x h pixels large, then the first plane contains h rows of w bytes (and maybe some padding for each line).

    The second plane contains h / 2 lines of w / 2 pairs of bytes. Each pair consists of a Cb and a Cr value. Again, each line might have some padding at the end.

    So the value of Y for the pixel at position (x, y) can be found at the address:

    Y: baseAddressPlane1 + y * bytesPerRowPlane1 + x

    And the values of Cb and Cr for the pixel at position (x, y) can be found at the addresses:

    Cb: baseAddressPlane2 + (y / 2) * bytesPerRowPlane2 + (x / 2) * 2

    Cr: baseAddressPlane2 + (y / 2) * bytesPerRowPlane2 + (x / 2) * 2 + 1

    The divisions by 2 are integer divisions that discard the fractional part.
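
The address formulas above could be sketched in Swift roughly as follows. This is only an illustration under a few assumptions: it uses current Swift syntax rather than the Swift 2 syntax in the question, the names `BiPlanarOffsets`, `biPlanarOffsets` and `sampleYCbCr` are made up for this example, and the "first" and "second" planes of the hints are taken to be Core Video plane indices 0 and 1. The buffer is assumed to be an 8-bit bi-planar YCbCr buffer such as the '420f' one shown in the question.

```swift
#if canImport(CoreVideo)
import CoreVideo
#endif

/// Byte offsets for one pixel, relative to the base address of each plane.
/// (Hypothetical helper type, not part of Core Video or the Scandit SDK.)
struct BiPlanarOffsets {
    let y: Int   // offset into the first plane (luma)
    let cb: Int  // offset into the second plane (interleaved Cb/Cr)
    let cr: Int  // offset into the second plane, one byte after Cb
}

/// Applies the formulas from the hints; the divisions are integer divisions.
func biPlanarOffsets(x: Int, y: Int,
                     bytesPerRowY: Int, bytesPerRowCbCr: Int) -> BiPlanarOffsets {
    let yOffset = y * bytesPerRowY + x
    let cbOffset = (y / 2) * bytesPerRowCbCr + (x / 2) * 2
    return BiPlanarOffsets(y: yOffset, cb: cbOffset, cr: cbOffset + 1)
}

#if canImport(CoreVideo)
/// Reads the Y, Cb and Cr bytes for the pixel at (x, y).
/// Returns nil if the buffer does not have two planes or (x, y) is out of range.
func sampleYCbCr(in pixelBuffer: CVPixelBuffer,
                 x: Int, y: Int) -> (y: UInt8, cb: UInt8, cr: UInt8)? {
    // The base addresses are only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard CVPixelBufferGetPlaneCount(pixelBuffer) == 2,
          x >= 0, y >= 0,
          x < CVPixelBufferGetWidth(pixelBuffer),
          y < CVPixelBufferGetHeight(pixelBuffer),
          let lumaBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0),
          let chromaBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1)
    else { return nil }

    // Use the per-plane row strides, which already include any row padding.
    let offsets = biPlanarOffsets(
        x: x, y: y,
        bytesPerRowY: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0),
        bytesPerRowCbCr: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1))

    let luma = lumaBase.assumingMemoryBound(to: UInt8.self)
    let chroma = chromaBase.assumingMemoryBound(to: UInt8.self)
    return (luma[offsets.y], chroma[offsets.cb], chroma[offsets.cr])
}
#endif
```

With the strides reported in the question (1920 bytes per row for both planes), `biPlanarOffsets(x: 3, y: 5, bytesPerRowY: 1920, bytesPerRowCbCr: 1920)` gives a Y offset of 9603, a Cb offset of 3842 and a Cr offset of 3843. Inside the delegate callback, the pixel buffer would come from `CMSampleBufferGetImageBuffer(frame)`, as in the question's code.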