ios swift camera core-graphics avcapturemoviefileoutput

How to superimpose views over each captured frame inside a CVImageBuffer in real time, not in post-processing


I have managed to set up a basic AVCaptureSession which records a video and saves it on the device using AVCaptureFileOutputRecordingDelegate. I have been searching through the docs to understand how we can add statistics overlays on top of the video that is being recorded.

For example:

[Image: video preview with several overlays on top]

As you can see in the image above, I have multiple overlays on top of the video preview layer. Now, when I save my video output, I would like to compose those views onto the video as well.

What have I tried so far?

  • Honestly, I have just been jumping around the internet trying to find a reputable blog explaining how one would do this, but I failed to find one.
  • I have read in a few places that one could render text layer overlays, as described in the following post, by creating a CALayer and adding it as a sublayer.
  • But what if I want to render a MapView on top of the video being recorded? Also, I am not looking for screen capture. Some of the content on the screen will not be part of the final recording, so I want to be able to cherry-pick the views that will be composed.

What am I looking for?

  1. Direction.
  2. No straight up solution
  3. Documentation link and class names I should be reading more about to create this.

Progress So Far:

I have managed to understand that I need to get hold of the CVImageBuffer from the CMSampleBuffer and draw text over it. It is still unclear to me whether it is possible to somehow overlay a MapView over the video that is being recorded.


Solution

  • The best way to achieve your goal is to use the Metal framework. A Metal-based camera pipeline minimises the impact on the device's limited computational resources. If you want the lowest-overhead access to the camera sensor, AVCaptureSession is a really good place to start.

    You need to grab each frame's data from the CMSampleBuffer (you're right about that) and then convert the frame to an MTLTexture. AVCaptureSession continuously delivers frames from the device's camera via a delegate callback.
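
    The delegate callback mentioned above can be sketched as follows. This is a minimal sketch, not the linked project's code: it assumes an AVCaptureVideoDataOutput (the output type that hands you each CMSampleBuffer, unlike the file-output delegate from the question), and the names `FrameReceiver`, `CaptureSetupError`, `onFrame` and the queue label are hypothetical.

    ```swift
    import AVFoundation

    // Hypothetical error type for this sketch.
    enum CaptureSetupError: Error {
        case noCamera, cannotAddInput, cannotAddOutput
    }

    // Hypothetical class that receives every camera frame.
    final class FrameReceiver: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
        let session = AVCaptureSession()
        // Frames are delivered on this serial queue, not on the main thread.
        private let frameQueue = DispatchQueue(label: "camera.frames")
        // Hypothetical hook: called with the pixel data of each frame.
        var onFrame: ((CVImageBuffer) -> Void)?

        func configure() throws {
            session.beginConfiguration()
            defer { session.commitConfiguration() }

            guard let camera = AVCaptureDevice.default(for: .video) else {
                throw CaptureSetupError.noCamera
            }
            let input = try AVCaptureDeviceInput(device: camera)
            guard session.canAddInput(input) else { throw CaptureSetupError.cannotAddInput }
            session.addInput(input)

            let output = AVCaptureVideoDataOutput()
            // Assumption: BGRA frames, so each frame maps to one single-plane texture.
            output.videoSettings = [
                kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
            ]
            output.setSampleBufferDelegate(self, queue: frameQueue)
            guard session.canAddOutput(output) else { throw CaptureSetupError.cannotAddOutput }
            session.addOutput(output)
        }

        // Called by AVFoundation once per captured frame.
        func captureOutput(_ output: AVCaptureOutput,
                           didOutput sampleBuffer: CMSampleBuffer,
                           from connection: AVCaptureConnection) {
            guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
            onFrame?(imageBuffer)
        }
    }
    ```

    After `configure()` succeeds, calling `session.startRunning()` starts the frame delivery; the `onFrame` closure is where the conversion to an MTLTexture would go.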

    All overlays must be converted to MTLTextures too. Then you can composite all the MTLTexture layers with the "over" operation.
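
    One possible way to turn a chosen overlay view into an MTLTexture is to render its layer into a bitmap and upload that bitmap with MTKTextureLoader. This is only a sketch under assumptions: `makeOverlayTexture` and `OverlayTextureError` are hypothetical names, and whether `layer.render(in:)` captures the tiles of a map view is not guaranteed (MapKit's MKMapSnapshotter is an alternative worth reading about).

    ```swift
    import UIKit
    import Metal
    import MetalKit

    // Hypothetical error type for this sketch.
    enum OverlayTextureError: Error {
        case snapshotFailed
    }

    // Renders one chosen view into a bitmap and uploads it as an MTLTexture.
    // Call this on the main thread, because it touches UIKit.
    func makeOverlayTexture(from view: UIView, device: MTLDevice) throws -> MTLTexture {
        let renderer = UIGraphicsImageRenderer(bounds: view.bounds)
        let image = renderer.image { context in
            // Only this view (and its subviews) is drawn, not the whole screen.
            view.layer.render(in: context.cgContext)
        }
        guard let cgImage = image.cgImage else {
            throw OverlayTextureError.snapshotFailed
        }
        let loader = MTKTextureLoader(device: device)
        return try loader.newTexture(cgImage: cgImage, options: nil)
    }
    ```

    Because only the views you pass in are rendered, this lets you cherry-pick what ends up in the recording. Re-rendering an overlay only when its content changes, rather than on every frame, keeps the cost down.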

    You'll find all the necessary information in the four-part Metal Camera series.

    Here's a link to a blog post: About Compositing in Metal.
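
    To make the "over" operation concrete, here is a small CPU-side sketch of what it computes for a single pixel. It assumes premultiplied-alpha colors, and the names `Pixel`, `over` and `composite` are hypothetical; in a real Metal pipeline this arithmetic would run on the GPU rather than in Swift.

    ```swift
    // One RGBA pixel with premultiplied alpha (assumption of this sketch).
    struct Pixel: Equatable {
        var r, g, b, a: Float
    }

    // Porter-Duff "over": the source is drawn on top of the destination.
    func over(_ src: Pixel, _ dst: Pixel) -> Pixel {
        let k = 1 - src.a   // how much of the destination shows through
        return Pixel(r: src.r + dst.r * k,
                     g: src.g + dst.g * k,
                     b: src.b + dst.b * k,
                     a: src.a + dst.a * k)
    }

    // Composites the overlays onto the camera frame, bottom to top.
    func composite(frame: Pixel, overlays: [Pixel]) -> Pixel {
        overlays.reduce(frame) { result, overlay in over(overlay, result) }
    }
    ```

    On the GPU, the same result for premultiplied colors is commonly obtained by enabling blending on the render pipeline's color attachment (`isBlendingEnabled`) with the source blend factor `.one` and the destination blend factor `.oneMinusSourceAlpha`.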

    Here's a code excerpt (working with AVCaptureSession in Metal):

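    import AVFoundation
    import CoreVideo
    import Metal
    
    // Minimal local error type so that the excerpt compiles on its own.
    enum MetalCameraSessionError: Error {
        case failedToCreateTextureCache
        case failedToCreateTextureFromImage
    }
    
    // Creates the texture cache used for converting frame images to textures.
    // Create it once and reuse it for every frame.
    func makeTextureCache() throws -> CVMetalTextureCache {
        var textureCache: CVMetalTextureCache?
        guard
            // `MTLDevice` for initializing the texture cache
            let metalDevice = MTLCreateSystemDefaultDevice(),
            CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, metalDevice, nil, &textureCache) == kCVReturnSuccess,
            let cache = textureCache
        else {
            throw MetalCameraSessionError.failedToCreateTextureCache
        }
        return cache
    }
    
    // Converts one captured frame to an `MTLTexture`.
    // `pixelFormat` and `planeIndex` must match the pixel format of the capture output;
    // the defaults assume single-plane BGRA frames (kCVPixelFormatType_32BGRA).
    func makeTexture(from sampleBuffer: CMSampleBuffer,
                     cache: CVMetalTextureCache,
                     pixelFormat: MTLPixelFormat = .bgra8Unorm,
                     planeIndex: Int = 0) throws -> MTLTexture {
        guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            throw MetalCameraSessionError.failedToCreateTextureFromImage
        }
    
        let width = CVPixelBufferGetWidth(imageBuffer)
        let height = CVPixelBufferGetHeight(imageBuffer)
    
        var imageTexture: CVMetalTexture?
        let result = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, cache, imageBuffer, nil, pixelFormat, width, height, planeIndex, &imageTexture)
    
        // On success, the `MTLTexture` is in the `texture` constant.
        guard
            result == kCVReturnSuccess,
            let unwrappedImageTexture = imageTexture,
            let texture = CVMetalTextureGetTexture(unwrappedImageTexture)
        else {
            throw MetalCameraSessionError.failedToCreateTextureFromImage
        }
        return texture
    }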
    

    You can find the final project on GitHub: MetalRenderCamera