ios swift gpuimage opentok tokbox

How do I handle GPUImage image buffers so that they're usable with things like Tokbox?


I'm using OpenTok and replaced their Publisher with my own subclassed version which incorporates GPUImage. My goal is to add filters.

The application builds and runs, but crashes here:

    func willOutputSampleBuffer(sampleBuffer: CMSampleBuffer!) {
        let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        CVPixelBufferLockBaseAddress(imageBuffer!, 0)
        videoFrame?.clearPlanes()
        for var i = 0 ; i < CVPixelBufferGetPlaneCount(imageBuffer!); i++ {
            print(i)
            videoFrame?.planes.addPointer(CVPixelBufferGetBaseAddressOfPlane(imageBuffer!, i))
        }
        videoFrame?.orientation = OTVideoOrientation.Left
        videoCaptureConsumer.consumeFrame(videoFrame) //comment this out to stop app from crashing. Otherwise, it crashes here.
        CVPixelBufferUnlockBaseAddress(imageBuffer!, 0)
    }

If I comment that line out, I'm able to run the app without crashing. In fact, I see the filter being applied correctly, but it's flickering. Nothing gets published to OpenTok.

My entire codebase can be downloaded, including the specific file for this class. It's actually pretty easy to run: just do pod install before running it.

Upon inspection, it could be that videoCaptureConsumer is not initialized (see the protocol reference).

I have no idea what my code means. I translated it directly from this Objective-C file in Tokbox's sample project.


Solution

  • I analyzed both your Swift project and the Objective-C project, and found that neither one works.

    With this post, I want to give a first update and show a working demo of how to use GPUImage filters with OpenTok.

    Reasons why your GPUImage filter implementation is not working with OpenTok

    #1 Multiple target specifications

    let sepia = GPUImageSepiaFilter()
    videoCamera?.addTarget(sepia)
    sepia.addTarget(self.view)        
    videoCamera?.addTarget(self.view) // <-- This is wrong and produces the flickering
    videoCamera?.startCameraCapture()
    

    Two sources (the camera and the sepia filter) try to render into the same view, which causes the flickering.
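
    To make the effect visible without a device, here is a toy model of the target graph. It is only a sketch: `Node`, `push(from:)` and `writersPerFrame` are hypothetical names made up for this illustration and are not GPUImage types. Only the idea of `addTarget` is taken from the snippet above.

    ```swift
    // Toy model of a filter graph. `Node` is a hypothetical stand-in,
    // not a GPUImage class.
    final class Node {
        let name: String
        private(set) var targets: [Node] = []
        // Records which upstream node delivered each frame to this node.
        private(set) var framesReceived: [String] = []

        init(_ name: String) { self.name = name }

        func addTarget(_ node: Node) { targets.append(node) }

        // Deliver one frame to this node, then forward it to every target.
        func push(from sender: String) {
            framesReceived.append(sender)
            for target in targets { target.push(from: name) }
        }
    }

    // Builds camera -> sepia -> view, optionally with the extra
    // camera -> view edge, and returns who wrote into the view for one frame.
    func writersPerFrame(connectCameraToView: Bool) -> [String] {
        let camera = Node("camera"), sepia = Node("sepia"), view = Node("view")
        camera.addTarget(sepia)
        sepia.addTarget(view)
        if connectCameraToView { camera.addTarget(view) } // the line flagged as wrong
        camera.push(from: "sensor")
        return view.framesReceived
    }

    let broken = writersPerFrame(connectCameraToView: true)
    let fixed = writersPerFrame(connectCameraToView: false)
    print(broken) // ["sepia", "camera"]
    print(fixed)  // ["sepia"]
    ```

    With the extra edge, the view is drawn twice per camera frame, once filtered and once unfiltered. Without it, only the filtered frame arrives.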

    Part one is solved. Next up: why is nothing published to OpenTok? To find the reason, I decided to start with the "working" Objective-C version.

    #2 The original Objective-C codebase

    The original Objective-C version doesn't have the expected functionality. Publishing the GPUImageVideoCamera to an OpenTok subscriber works, but there is no filtering involved, and that's your core requirement. Adding filters is not as trivial as one would expect, because the image formats differ and so do the mechanisms for asynchronous delivery of frames.

    So reason #2 why your code isn't working as expected: the reference codebase you ported from is not correct. It doesn't allow putting GPU filters into the publisher-subscriber pipeline.

    A working Objective-C implementation

    I modified the Objective-C version. The current results look like this:

    [Screenshot of the result]

    It's running smoothly.

    Final steps

    This is the full code for the custom Tok publisher. It's basically the original code (TokBoxGPUImagePublisher) from https://github.com/JayTokBox/TokBoxGPUImage/blob/master/TokBoxGPUImage/ViewController.m with the following notable modifications:

    OTVideoFrame gets instantiated with a new format

        ...
        format = [[OTVideoFormat alloc] init];
        format.pixelFormat = OTPixelFormatARGB;
        format.bytesPerRow = [@[@(imageWidth * 4)] mutableCopy];
        format.imageWidth = imageWidth;
        format.imageHeight = imageHeight;
        videoFrame = [[OTVideoFrame alloc] initWithFormat: format];
        ...
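
    As a quick sanity check of the numbers behind `bytesPerRow`, here is a small sketch of the arithmetic for a packed 4-byte-per-pixel frame. It assumes rows are tightly packed (no padding), which is what `imageWidth * 4` above implies. The function name `packedFrameLayout` is made up for this example and is not part of OpenTok or GPUImage.

    ```swift
    // Layout of a single-plane frame with 4 bytes per pixel (e.g. ARGB/BGRA).
    // Assumes tightly packed rows, i.e. no padding at the end of a row.
    func packedFrameLayout(width: Int, height: Int, bytesPerPixel: Int = 4)
        -> (bytesPerRow: Int, totalBytes: Int) {
        let bytesPerRow = width * bytesPerPixel
        return (bytesPerRow, bytesPerRow * height)
    }

    // Frame size used in the answer: 640 x 480.
    let layout = packedFrameLayout(width: 640, height: 480)
    print(layout.bytesPerRow) // 2560
    print(layout.totalBytes)  // 1228800
    ```

    So the single plane pointer handed to the frame has to reference a buffer of at least 1,228,800 bytes for every 640x480 frame.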
    

    Replace the willOutputSampleBuffer callback mechanism

    This callback only fires when sample buffers coming directly from the GPUImageVideoCamera are ready, NOT when your custom filters have produced output. GPUImageFilters don't provide such a callback / delegate mechanism. That's why we put a GPUImageRawDataOutput in between and ask it for finished images. This pipeline is implemented in the initCapture method and looks like this:

        videoCamera = [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset640x480 cameraPosition:AVCaptureDevicePositionBack];
    
        videoCamera.outputImageOrientation = UIInterfaceOrientationPortrait;
        sepiaImageFilter = [[GPUImageSepiaFilter alloc] init];
        [videoCamera addTarget:sepiaImageFilter];
        // Create rawOut
        CGSize size = CGSizeMake(imageWidth, imageHeight);
        rawOut = [[GPUImageRawDataOutput alloc] initWithImageSize:size resultsInBGRAFormat:YES];
    
        // Filter into rawOut
        [sepiaImageFilter addTarget:rawOut];
        // Handle filtered images
        // We need a weak reference here to avoid a strong reference cycle.
        __weak GPUImageRawDataOutput* weakRawOut = self->rawOut;
        __weak OTVideoFrame* weakVideoFrame = self->videoFrame;
        __weak id<OTVideoCaptureConsumer> weakVideoCaptureConsumer = self.videoCaptureConsumer;
        //
        [rawOut setNewFrameAvailableBlock:^{
            [weakRawOut lockFramebufferForReading];
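            // GLubyte is a uint8_t
            GLubyte* outputBytes = [weakRawOut rawBytesForImage];

            // About the video formats used by OTVideoFrame
            // --------------------------------------------
            // Both YUV video formats (I420, NV12) share these properties:
            //
            //  - 8 bit Y plane at full resolution
            //  - 8 bit 2x2 subsampled U and V samples (each 1/4 the pixels of the Y plane)
            //      --> 12 bits per pixel
            //  - I420 stores U and V in two separate planes (three planes in total),
            //    NV12 interleaves them in one plane (two planes in total)
            //
            // The BGRA output of rawOut is packed into a single plane instead,
            // so only one pointer is added below.
            //
            // Further reading: www.fourcc.org/yuv.php
            //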
            [weakVideoFrame clearPlanes];
            [weakVideoFrame.planes addPointer: outputBytes];
            [weakVideoCaptureConsumer consumeFrame: weakVideoFrame];
            [weakRawOut unlockFramebufferAfterReading];
        }];
        [videoCamera addTarget:self.view];
        [videoCamera startCameraCapture];
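
    As a side note on the YUV formats mentioned in the code comment: the following sketch computes the plane sizes of I420 and NV12 for the 640x480 frame used here. It assumes even dimensions and tightly packed rows. `PlaneLayout`, `i420Planes` and `nv12Planes` are names invented for this illustration, not OpenTok API. Note that I420 keeps U and V in separate planes (three planes in total), while NV12 interleaves them (two planes in total).

    ```swift
    // One plane of a planar or semi-planar video frame.
    struct PlaneLayout {
        let name: String
        let bytesPerRow: Int
        let rows: Int
        var byteCount: Int { bytesPerRow * rows }
    }

    // I420: full-resolution Y plane plus separate 2x2 subsampled U and V planes.
    func i420Planes(width: Int, height: Int) -> [PlaneLayout] {
        [PlaneLayout(name: "Y", bytesPerRow: width, rows: height),
         PlaneLayout(name: "U", bytesPerRow: width / 2, rows: height / 2),
         PlaneLayout(name: "V", bytesPerRow: width / 2, rows: height / 2)]
    }

    // NV12: full-resolution Y plane plus one plane of interleaved U/V pairs.
    func nv12Planes(width: Int, height: Int) -> [PlaneLayout] {
        [PlaneLayout(name: "Y", bytesPerRow: width, rows: height),
         PlaneLayout(name: "UV", bytesPerRow: width, rows: height / 2)]
    }

    func totalBytes(_ planes: [PlaneLayout]) -> Int {
        planes.reduce(0) { $0 + $1.byteCount }
    }

    let i420 = i420Planes(width: 640, height: 480)
    let nv12 = nv12Planes(width: 640, height: 480)
    print(i420.count, totalBytes(i420)) // 3 460800
    print(nv12.count, totalBytes(nv12)) // 2 460800
    print(totalBytes(nv12) * 8 / (640 * 480)) // 12 (bits per pixel)
    ```

    Compare that with the 4 bytes per pixel of the packed output of GPUImageRawDataOutput, which is why the frame format has to be changed before the raw bytes are handed to OpenTok.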
    

    Whole code (the really important part is initCapture)

    //
    //  TokBoxGPUImagePublisher.m
    //  TokBoxGPUImage
    //
    //  Created by Jaideep Shah on 9/5/14.
    //  Copyright (c) 2014 Jaideep Shah. All rights reserved.
    //

    #import "TokBoxGPUImagePublisher.h"
    #import "GPUImage.h"

    static size_t imageHeight = 480;
    static size_t imageWidth = 640;

    @interface TokBoxGPUImagePublisher() <GPUImageVideoCameraDelegate, OTVideoCapture> {
        GPUImageVideoCamera *videoCamera;
        GPUImageSepiaFilter *sepiaImageFilter;
        OTVideoFrame* videoFrame;
        GPUImageRawDataOutput* rawOut;
        OTVideoFormat* format;
    }

    @end

    @implementation TokBoxGPUImagePublisher

    @synthesize videoCaptureConsumer;  // In OTVideoCapture protocol

    - (id)initWithDelegate:(id<OTPublisherDelegate>)delegate name:(NSString*)name
    {
        self = [super initWithDelegate:delegate name:name];
        if (self)
        {
            self.view = [[GPUImageView alloc] initWithFrame:CGRectMake(0, 0, 1, 1)];
            [self setVideoCapture:self];

            // Single-plane frame with 4 bytes per pixel
            format = [[OTVideoFormat alloc] init];
            format.pixelFormat = OTPixelFormatARGB;
            format.bytesPerRow = [@[@(imageWidth * 4)] mutableCopy];
            format.imageWidth = imageWidth;
            format.imageHeight = imageHeight;
            videoFrame = [[OTVideoFrame alloc] initWithFormat: format];
        }
        return self;
    }

    #pragma mark GPUImageVideoCameraDelegate

    - (void)willOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
    {
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        CVPixelBufferLockBaseAddress(imageBuffer, 0);

        [videoFrame clearPlanes];

        for (int i = 0; i < CVPixelBufferGetPlaneCount(imageBuffer); i++) {
            [videoFrame.planes addPointer:CVPixelBufferGetBaseAddressOfPlane(imageBuffer, i)];
        }
        videoFrame.orientation = OTVideoOrientationLeft;

        [self.videoCaptureConsumer consumeFrame:videoFrame];

        CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    }

    #pragma mark OTVideoCapture

    - (void) initCapture {
        videoCamera = [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset640x480
                                                          cameraPosition:AVCaptureDevicePositionBack];

        videoCamera.outputImageOrientation = UIInterfaceOrientationPortrait;
        sepiaImageFilter = [[GPUImageSepiaFilter alloc] init];
        [videoCamera addTarget:sepiaImageFilter];
        // Create rawOut
        CGSize size = CGSizeMake(imageWidth, imageHeight);
        rawOut = [[GPUImageRawDataOutput alloc] initWithImageSize:size resultsInBGRAFormat:YES];

        // Filter into rawOut
        [sepiaImageFilter addTarget:rawOut];
        // Handle filtered images
        // We need a weak reference here to avoid a strong reference cycle.
        __weak GPUImageRawDataOutput* weakRawOut = self->rawOut;
        __weak OTVideoFrame* weakVideoFrame = self->videoFrame;
        __weak id<OTVideoCaptureConsumer> weakVideoCaptureConsumer = self.videoCaptureConsumer;
        //
        [rawOut setNewFrameAvailableBlock:^{
            [weakRawOut lockFramebufferForReading];
            // GLubyte is a uint8_t
            GLubyte* outputBytes = [weakRawOut rawBytesForImage];

            // About the video formats used by OTVideoFrame
            // --------------------------------------------
            // Both YUV video formats (I420, NV12) share these properties:
            //
            //  - 8 bit Y plane at full resolution
            //  - 8 bit 2x2 subsampled U and V samples (each 1/4 the pixels of the Y plane)
            //      --> 12 bits per pixel
            //  - I420 stores U and V in two separate planes (three planes in total),
            //    NV12 interleaves them in one plane (two planes in total)
            //
            // The BGRA output of rawOut is packed into a single plane instead,
            // so only one pointer is added below.
            //
            // Further reading: www.fourcc.org/yuv.php
            //
            [weakVideoFrame clearPlanes];
            [weakVideoFrame.planes addPointer: outputBytes];
            [weakVideoCaptureConsumer consumeFrame: weakVideoFrame];
            [weakRawOut unlockFramebufferAfterReading];
        }];
        [videoCamera addTarget:self.view];
        [videoCamera startCameraCapture];
    }

    - (void)releaseCapture
    {
        videoCamera.delegate = nil;
        videoCamera = nil;
    }

    - (int32_t) startCapture {
        return 0;
    }

    - (int32_t) stopCapture {
        return 0;
    }

    - (BOOL) isCaptureStarted {
        return YES;
    }

    - (int32_t)captureSettings:(OTVideoFormat*)videoFormat {
        videoFormat.pixelFormat = OTPixelFormatNV12;
        videoFormat.imageWidth = imageWidth;
        videoFormat.imageHeight = imageHeight;
        return 0;
    }

    @end