Tags: cocoa, macos, quicktime, qtkit

What is a 10.6-compatible means of recording video frames to a movie without using the QuickTime API?


I'm updating an application to be 64-bit-compatible, but I'm having a little difficulty with our movie recording code. We have a FireWire camera that feeds YUV frames into our application, which we process and encode out to disk as an MPEG4 movie. Currently, we use the C-based QuickTime API to do this (Image Compression Manager, etc.), but the old QuickTime API is not available to 64-bit applications.

My first attempt was to use QTKit's QTMovie and encode individual frames using -addImage:forDuration:withAttributes:, but that requires the creation of an NSImage for each frame (which is computationally expensive) and it does not do temporal compression, so it doesn't generate the most compact files.

I'd like to use something like QTKit Capture's QTCaptureMovieFileOutput, but I can't figure out how to feed it raw frames that aren't associated with a QTCaptureInput. We can't use our camera directly with QTKit Capture because we need to manually control its gain, exposure, etc.

On Lion, we now have the AVAssetWriter class in AVFoundation, which lets you do this, but I still have to target Snow Leopard for the time being, so I'm trying to find a solution that works there as well.

Therefore, is there a way to do non-QuickTime frame-by-frame recording of video that is more efficient than QTMovie's -addImage:forDuration:withAttributes: and produces file sizes comparable to what the older QuickTime API could achieve?


Solution

  • In the end, I decided to go with the approach suggested by TiansHUo and use libavcodec for the video compression. Based on Martin's instructions, I downloaded the FFmpeg source and built a 64-bit-compatible version of the necessary libraries using

    ./configure --disable-gpl --arch=x86_64 --cpu=core2 --enable-shared --disable-amd3dnow --enable-memalign-hack --cc=llvm-gcc
    make
    sudo make install
    

    This creates LGPL shared libraries for 64-bit Core 2 processors on the Mac. At first, I couldn't get the library to run without crashing when MMX optimizations were enabled, so I had disabled them, which slowed down encoding somewhat. After some experimentation, I found that the configuration options above produce a 64-bit build with MMX optimizations enabled that is stable on the Mac, and it encodes much faster than the MMX-disabled build.

    Note that if you use these shared libraries, you should make sure you follow the LGPL compliance instructions on FFmpeg's site to the letter.

    In order to get these shared libraries to function properly when bundled with my Mac application, I needed to use install_name_tool to adjust their internal search paths to point to their new location in the Frameworks directory within the application bundle:

    install_name_tool -id @executable_path/../Frameworks/libavutil.51.9.1.dylib libavutil.51.9.1.dylib
    
    install_name_tool -id @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavcodec.53.7.0.dylib
    install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavcodec.53.7.0.dylib
    
    install_name_tool -id @executable_path/../Frameworks/libavformat.53.4.0.dylib libavformat.53.4.0.dylib
    install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libavformat.53.4.0.dylib
    install_name_tool -change /usr/local/lib/libavcodec.dylib @executable_path/../Frameworks/libavcodec.53.7.0.dylib libavformat.53.4.0.dylib
    
    install_name_tool -id @executable_path/../Frameworks/libswscale.2.0.0.dylib libswscale.2.0.0.dylib
    install_name_tool -change /usr/local/lib/libavutil.dylib @executable_path/../Frameworks/libavutil.51.9.1.dylib libswscale.2.0.0.dylib
    

    Your specific paths may vary. This adjustment lets them work from within the application bundle without having to install them in /usr/local/lib on the user's system.
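
    If you want to confirm that the install names were rewritten correctly, otool will list a dylib's own install name followed by the paths of the libraries it links against; after the commands above, those entries should point at @executable_path/../Frameworks/ rather than /usr/local/lib. For example (the version numbers here match the libraries above, but yours may differ):

    otool -L libavformat.53.4.0.dylib
    # should now report entries such as:
    #   @executable_path/../Frameworks/libavformat.53.4.0.dylib
    #   @executable_path/../Frameworks/libavcodec.53.7.0.dylib
    #   @executable_path/../Frameworks/libavutil.51.9.1.dylib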

    I then had my Xcode project link against these libraries, and I created a separate class to handle the video encoding. This class takes in raw video frames (in BGRA format) through the videoFrameToEncode property and encodes them to the file named by movieFileName as MPEG4 video in a QuickTime (.mov) container. The code is as follows:

    SPVideoRecorder.h

    #import <Foundation/Foundation.h>
    
    #include "libavcodec/avcodec.h"
    #include "libavformat/avformat.h"
    #include "libswscale/swscale.h"
    
    uint64_t getNanoseconds(void);
    
    @interface SPVideoRecorder : NSObject
    {
        NSString *movieFileName;
        CGFloat framesPerSecond;
        AVCodecContext *codecContext;
        AVStream *videoStream;
        AVOutputFormat *outputFormat;
        AVFormatContext *outputFormatContext;
        AVFrame *videoFrame;
        AVPicture inputRGBAFrame;
    
        uint8_t *pictureBuffer;
        uint8_t *outputBuffer;
        unsigned int outputBufferSize;
        int frameColorCounter;
    
        unsigned char *videoFrameToEncode;
    
        dispatch_queue_t videoRecordingQueue;
        dispatch_semaphore_t frameEncodingSemaphore;
        uint64_t movieStartTime;
    }
    
    @property(readwrite, assign) CGFloat framesPerSecond;
    @property(readwrite, assign) unsigned char *videoFrameToEncode;
    @property(readwrite, copy) NSString *movieFileName;
    
    // Movie recording control
    - (void)startRecordingMovie;
    - (void)encodeNewFrameToMovie;
    - (void)stopRecordingMovie;
    
    
    @end
    

    SPVideoRecorder.m

    #import "SPVideoRecorder.h"
    #include <sys/time.h>
    
    @implementation SPVideoRecorder
    
    uint64_t getNanoseconds(void)
    {
        struct timeval now;
        gettimeofday(&now, NULL);
        return now.tv_sec * NSEC_PER_SEC + now.tv_usec * NSEC_PER_USEC;
    }
    
    #pragma mark -
    #pragma mark Initialization and teardown
    
    - (id)init;
    {
        if (!(self = [super init]))
        {
            return nil;     
        }
    
        /* must be called before using avcodec lib */
        avcodec_init();
    
        /* register all the codecs */
        avcodec_register_all();
        av_register_all();
    
        av_log_set_level( AV_LOG_ERROR );
    
        videoRecordingQueue = dispatch_queue_create("com.sonoplot.videoRecordingQueue", NULL);
        frameEncodingSemaphore = dispatch_semaphore_create(1);
    
        return self;
    }
    
    #pragma mark -
    #pragma mark Movie recording control
    
    - (void)startRecordingMovie;
    {   
        dispatch_async(videoRecordingQueue, ^{
            NSLog(@"Start recording to file: %@", movieFileName);
    
            const char *filename = [movieFileName UTF8String];
    
            // Use a QuickTime (.mov) container so that the resulting movie is readable on the Mac
            outputFormat = av_guess_format("mov", NULL, NULL);
            if (!outputFormat) {
                NSLog(@"Could not set output format");
            }
    
            outputFormatContext = avformat_alloc_context();
            if (!outputFormatContext)
            {
                NSLog(@"avformat_alloc_context Error!");
            }
    
            outputFormatContext->oformat = outputFormat;
            snprintf(outputFormatContext->filename, sizeof(outputFormatContext->filename), "%s", filename);
    
            // Add a video stream to the MP4 file 
            videoStream = av_new_stream(outputFormatContext,0);
            if (!videoStream)
            {
                NSLog(@"av_new_stream Error!");
            }
    
    
            // Use the MPEG4 encoder (other DivX-style encoders aren't compatible with this container, and x264 is GPL-licensed)
            AVCodec *codec = avcodec_find_encoder(CODEC_ID_MPEG4);  
            if (!codec) {
                fprintf(stderr, "codec not found\n");
                exit(1);
            }
    
            codecContext = videoStream->codec;
    
            codecContext->codec_id = codec->id;
            codecContext->codec_type = AVMEDIA_TYPE_VIDEO;
            codecContext->bit_rate = 4800000;
            codecContext->width = 640;
            codecContext->height = 480;
            codecContext->pix_fmt = PIX_FMT_YUV420P;
    //      codecContext->time_base = (AVRational){1,(int)round(framesPerSecond)};
    //      videoStream->time_base = (AVRational){1,(int)round(framesPerSecond)};
            codecContext->time_base = (AVRational){1,200}; // Set it to 200 FPS so that we give a little wiggle room when recording at 50 FPS
            videoStream->time_base = (AVRational){1,200};
    //      codecContext->max_b_frames = 3;
    //      codecContext->b_frame_strategy = 1;
            codecContext->qmin = 1;
            codecContext->qmax = 10;    
    //      codecContext->mb_decision = 2; // -mbd 2
    //      codecContext->me_cmp = 2; // -cmp 2
    //      codecContext->me_sub_cmp = 2; // -subcmp 2
            codecContext->keyint_min = (int)round(framesPerSecond); 
    //      codecContext->flags |= CODEC_FLAG_4MV; // 4mv
    //      codecContext->flags |= CODEC_FLAG_LOOP_FILTER;
            codecContext->i_quant_factor = 0.71;
            codecContext->qcompress = 0.6;
    //      codecContext->max_qdiff = 4;
            codecContext->flags2 |= CODEC_FLAG2_FASTPSKIP;
    
            if(outputFormat->flags & AVFMT_GLOBALHEADER)
            {
                codecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
            }
    
            // Open the codec
            if (avcodec_open(codecContext, codec) < 0) 
            {
                NSLog(@"Couldn't initialize the codec");
                return;
            }
    
            // Open the file for recording
            if (avio_open(&outputFormatContext->pb, outputFormatContext->filename, AVIO_FLAG_WRITE) < 0) 
            { 
                NSLog(@"Couldn't open file");
                return;
            } 
    
            // Start by writing the video header
            if (avformat_write_header(outputFormatContext, NULL) < 0) 
            { 
                NSLog(@"Couldn't write video header");
                return;
            } 
    
            // Set up the video frame and output buffers
            outputBufferSize = 400000;
            outputBuffer = malloc(outputBufferSize);
            int size = codecContext->width * codecContext->height;
    
            int pictureBytes = avpicture_get_size(PIX_FMT_YUV420P, codecContext->width, codecContext->height);
            pictureBuffer = (uint8_t *)av_malloc(pictureBytes);
    
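            // Carve the single allocated buffer into the three planes of a YUV 4:2:0 frame:
            // a full-resolution Y plane, followed by U and V planes at half resolution in each
            // dimension (so each chroma plane is width/2 bytes per line and size/4 bytes in total).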
            videoFrame = avcodec_alloc_frame();
            videoFrame->data[0] = pictureBuffer;
            videoFrame->data[1] = videoFrame->data[0] + size;
            videoFrame->data[2] = videoFrame->data[1] + size / 4;
            videoFrame->linesize[0] = codecContext->width;
            videoFrame->linesize[1] = codecContext->width / 2;
            videoFrame->linesize[2] = codecContext->width / 2;
    
            avpicture_alloc(&inputRGBAFrame, PIX_FMT_BGRA, codecContext->width, codecContext->height);
    
            frameColorCounter = 0;
    
            movieStartTime = getNanoseconds();
        });
    }
    
    - (void)encodeNewFrameToMovie;
    {
    //  NSLog(@"Encode frame");
    
        if (dispatch_semaphore_wait(frameEncodingSemaphore, DISPATCH_TIME_NOW) != 0)
        {
            return;
        }
    
        dispatch_async(videoRecordingQueue, ^{
    //      CFTimeInterval previousTimestamp = CFAbsoluteTimeGetCurrent();
            frameColorCounter++;
    
            if (codecContext == NULL)
            {       
                return;
            }
    
            // Take the input BGRA texture data and convert it to a YUV 4:2:0 planar frame
            avpicture_fill(&inputRGBAFrame, videoFrameToEncode, PIX_FMT_BGRA, codecContext->width, codecContext->height);
            struct SwsContext * img_convert_ctx = sws_getContext(codecContext->width, codecContext->height, PIX_FMT_BGRA, codecContext->width, codecContext->height, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
            sws_scale(img_convert_ctx, (const uint8_t* const *)inputRGBAFrame.data, inputRGBAFrame.linesize, 0, codecContext->height, videoFrame->data, videoFrame->linesize);
            sws_freeContext(img_convert_ctx); // Free the scaler context so it isn't leaked on every frame
    
            // Encode the frame
            int out_size = avcodec_encode_video(codecContext, outputBuffer, outputBufferSize, videoFrame);  
    
            // Generate a packet and insert in the video stream
            if (out_size != 0) 
            {
                AVPacket videoPacket;
                av_init_packet(&videoPacket);
    
                if (codecContext->coded_frame->pts != AV_NOPTS_VALUE) 
                {
                    uint64_t currentFrameTime = getNanoseconds();
    
                    videoPacket.pts = av_rescale_q(((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull/*codecContext->coded_frame->pts*/, AV_TIME_BASE_Q/*codecContext->time_base*/, videoStream->time_base);
    
    //              NSLog(@"Frame time %lld, converted time: %lld", ((uint64_t)currentFrameTime - (uint64_t)movieStartTime) / 1000ull, videoPacket.pts);
                }
    
                if(codecContext->coded_frame->key_frame)
                {
                    videoPacket.flags |= AV_PKT_FLAG_KEY;
                }
                videoPacket.stream_index = videoStream->index;
                videoPacket.data = outputBuffer;
                videoPacket.size = out_size;
    
                int ret = av_write_frame(outputFormatContext, &videoPacket);
                if (ret < 0) 
                { 
                    av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing frame.\n"); 
                    av_free_packet(&videoPacket);
                    return;
                } 
    
                av_free_packet(&videoPacket);
            }
    
    //      CFTimeInterval frameDuration = CFAbsoluteTimeGetCurrent() - previousTimestamp;
    //      NSLog(@"Frame duration: %f ms", frameDuration * 1000.0);
    
            dispatch_semaphore_signal(frameEncodingSemaphore);
        });
    }
    
    - (void)stopRecordingMovie;
    {
        dispatch_async(videoRecordingQueue, ^{
            // Write out the video trailer
            if (av_write_trailer(outputFormatContext) < 0) 
            { 
                av_log(outputFormatContext, AV_LOG_ERROR, "%s","Error while writing trailer.\n"); 
                exit(1); 
            } 
    
            // Close out the file
            if (!(outputFormat->flags & AVFMT_NOFILE)) 
            {
                avio_close(outputFormatContext->pb);
            }
    
            // Free up all movie-related resources
            avcodec_close(codecContext);
            av_free(codecContext);
            codecContext = NULL;
    
        av_free(pictureBuffer); // allocated with av_malloc(), so pair it with av_free()
        free(outputBuffer);
        avpicture_free(&inputRGBAFrame);
    
            av_free(videoFrame);
            av_free(outputFormatContext);
            av_free(videoStream);       
        });
    
    }
    
    #pragma mark -
    #pragma mark Accessors
    
    @synthesize framesPerSecond, videoFrameToEncode, movieFileName;
    
    @end
    

    This works under Lion and Snow Leopard in a 64-bit application. It records at the same bitrate as my previous QuickTime-based approach, with overall lower CPU usage.
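
    For reference, here is a minimal sketch of how this class can be driven from a camera frame callback. The controller class, callback method name, and output path below are hypothetical placeholders; the important parts are setting movieFileName and framesPerSecond before calling -startRecordingMovie, pointing videoFrameToEncode at the current BGRA buffer, and calling -encodeNewFrameToMovie once per incoming frame:

    #import "SPVideoRecorder.h"
    
    // Hypothetical controller that owns the camera and the recorder (sketch only)
    @interface SPCameraRecordingController : NSObject
    {
        SPVideoRecorder *videoRecorder;
    }
    - (void)beginRecording;
    - (void)cameraDidOutputBGRAFrame:(unsigned char *)bgraFrame; // called once per camera frame
    - (void)endRecording;
    @end
    
    @implementation SPCameraRecordingController
    
    - (void)beginRecording;
    {
        videoRecorder = [[SPVideoRecorder alloc] init];
        videoRecorder.movieFileName = @"/tmp/recordedMovie.mp4"; // placeholder path
        videoRecorder.framesPerSecond = 50.0;
        [videoRecorder startRecordingMovie];
    }
    
    - (void)cameraDidOutputBGRAFrame:(unsigned char *)bgraFrame;
    {
        // The buffer must hold a 640 x 480 BGRA frame to match the hardcoded codec context,
        // and it must remain valid until the frame has been encoded, since encoding is asynchronous.
        videoRecorder.videoFrameToEncode = bgraFrame;
        [videoRecorder encodeNewFrameToMovie]; // returns immediately; skips the frame if the encoder is still busy
    }
    
    - (void)endRecording;
    {
        [videoRecorder stopRecordingMovie];
        [videoRecorder release];
        videoRecorder = nil;
    }
    
    @end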

    Hopefully, this will help out someone else in a similar situation.