Tags: ios, video, avfoundation, avassetwriter, avassetexportsession

Simulate AVLayerVideoGravityResizeAspectFill: crop and center video to mimic preview without losing sharpness


Based on this SO post, the code below rotates, centers, and crops a video captured live by the user.

The capture session uses AVCaptureSessionPresetHigh for the preset value, and the preview layer uses AVLayerVideoGravityResizeAspectFill for video gravity. This preview is extremely sharp.

The exported video, however, is not as sharp, presumably because scaling from the back camera's 1920x1080 resolution on the 5S down to 320x568 (the target size for the exported video) throws away pixels and introduces fuzziness.

Assuming there is no way to scale from 1920x1080 to 320x568 without some fuzziness, the question becomes: how to mimic the sharpness of the preview layer?

Somehow Apple is using an algorithm to convert a 1920x1080 video into a crisp-looking preview frame of 320x568.

Is there a way to mimic this with either AVAssetWriter or AVAssetExportSession?

func cropVideo() {
    // Set start time
    let startTime = NSDate().timeIntervalSince1970

    // Create main composition & its tracks
    let mainComposition = AVMutableComposition()
    let compositionVideoTrack = mainComposition.addMutableTrackWithMediaType(AVMediaTypeVideo, preferredTrackID: CMPersistentTrackID(kCMPersistentTrackID_Invalid))
    let compositionAudioTrack = mainComposition.addMutableTrackWithMediaType(AVMediaTypeAudio, preferredTrackID: CMPersistentTrackID(kCMPersistentTrackID_Invalid))

    // Get source video & audio tracks
    let videoPath = getFilePath(curSlice!.getCaptureURL())
    let videoURL = NSURL(fileURLWithPath: videoPath)
    let videoAsset = AVURLAsset(URL: videoURL, options: nil)
    let sourceVideoTrack = videoAsset.tracksWithMediaType(AVMediaTypeVideo)[0]
    let sourceAudioTrack = videoAsset.tracksWithMediaType(AVMediaTypeAudio)[0]
    let videoSize = sourceVideoTrack.naturalSize

    // Get rounded time for video
    let roundedDur = floor(curSlice!.getDur() * 100) / 100
    let videoDur = CMTimeMakeWithSeconds(roundedDur, 100)

    // Add source tracks to composition
    do {
        try compositionVideoTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, videoDur), ofTrack: sourceVideoTrack, atTime: kCMTimeZero)
        try compositionAudioTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, videoDur), ofTrack: sourceAudioTrack, atTime: kCMTimeZero)
    } catch {
        print("Error with insertTimeRange while exporting video: \(error)")
    }

    // Create video composition
    // -- Set video frame
    let outputSize = view.bounds.size
    let videoComposition = AVMutableVideoComposition()
    print("Video composition duration: \(CMTimeGetSeconds(mainComposition.duration))")

    // -- Set parent layer
    let parentLayer = CALayer()
    parentLayer.frame = CGRectMake(0, 0, outputSize.width, outputSize.height)
    parentLayer.contentsGravity = kCAGravityResizeAspectFill

    // -- Set composition props
    videoComposition.renderSize = CGSize(width: outputSize.width, height: outputSize.height)
    videoComposition.frameDuration = CMTimeMake(1, Int32(frameRate))

    // -- Create video composition instruction
    let instruction = AVMutableVideoCompositionInstruction()
    instruction.timeRange = CMTimeRangeMake(kCMTimeZero, videoDur)

    // -- Use layer instruction to match video to output size, mimicking AVLayerVideoGravityResizeAspectFill
    let videoLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: compositionVideoTrack)
    let videoTransform = getResizeAspectFillTransform(videoSize, outputSize: outputSize)
    videoLayerInstruction.setTransform(videoTransform, atTime: kCMTimeZero)

    // -- Add layer instruction
    instruction.layerInstructions = [videoLayerInstruction]
    videoComposition.instructions = [instruction]

    // -- Create video layer
    let videoLayer = CALayer()
    videoLayer.frame = parentLayer.frame

    // -- Add sublayers to parent layer
    parentLayer.addSublayer(videoLayer)

    // -- Set animation tool
    videoComposition.animationTool = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: videoLayer, inLayer: parentLayer)

    // Create exporter
    let outputURL = getFilePath(getUniqueFilename(gMP4File))
    let exporter = AVAssetExportSession(asset: mainComposition, presetName: AVAssetExportPresetHighestQuality)!
    exporter.outputURL = NSURL(fileURLWithPath: outputURL)
    exporter.outputFileType = AVFileTypeMPEG4
    exporter.videoComposition = videoComposition
    exporter.shouldOptimizeForNetworkUse = true
    exporter.canPerformMultiplePassesOverSourceMediaData = true

    // Export to video
    exporter.exportAsynchronouslyWithCompletionHandler({
        // Log status
        let asset = AVAsset(URL: exporter.outputURL!)
        print("Exported slice video. Tracks: \(asset.tracks.count). Duration: \(CMTimeGetSeconds(asset.duration)). Size: \(exporter.estimatedOutputFileLength). Status: \(getExportStatus(exporter)). Output URL: \(exporter.outputURL!). Export time: \( NSDate().timeIntervalSince1970 - startTime).")

        // Tell delegate
        //delegate.didEndExport(exporter)
        self.curSlice!.setOutputURL(exporter.outputURL!.lastPathComponent!)
        gUser.save()
    })
}


// Returns transform, mimicking AVLayerVideoGravityResizeAspectFill, that converts video of <inputSize> to one of <outputSize>
private func getResizeAspectFillTransform(videoSize: CGSize, outputSize: CGSize) -> CGAffineTransform {
    // Compute ratios between video & output sizes
    let widthRatio = outputSize.width / videoSize.width
    let heightRatio = outputSize.height / videoSize.height

    // Set scale to larger of two ratios since goal is to fill output bounds
    let scale = widthRatio >= heightRatio ? widthRatio : heightRatio

    // Compute video size after scaling
    let newWidth = videoSize.width * scale
    let newHeight = videoSize.height * scale

    // Compute translation required to center image after scaling
    // -- Assumes CoreAnimationTool places video frame at (0, 0). Because scale transform is applied first, we must adjust
    // each translation point by scale factor.
    let translateX = (outputSize.width - newWidth) / 2 / scale
    let translateY = (outputSize.height - newHeight) / 2 / scale

    // Set transform to resize video while retaining aspect ratio
    let resizeTransform = CGAffineTransformMakeScale(scale, scale)

    // Apply translation & create final transform
    let finalTransform = CGAffineTransformTranslate(resizeTransform, translateX, translateY)

    // Return final transform
    return finalTransform
}
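
To make the aspect-fill math concrete, here is a quick worked example (my own numbers, not from the original post) plugging a 1920x1080 source and a 320x568 output into the helper above:

// Worked example with assumed sizes: a 1920x1080 source filling a 320x568 output.
let exampleVideoSize = CGSizeMake(1920, 1080)
let exampleOutputSize = CGSizeMake(320, 568)
let widthRatio = exampleOutputSize.width / exampleVideoSize.width     // 320 / 1920 ≈ 0.167
let heightRatio = exampleOutputSize.height / exampleVideoSize.height  // 568 / 1080 ≈ 0.526
let scale = max(widthRatio, heightRatio)                              // 0.526: scale to fill the height
let newWidth = exampleVideoSize.width * scale                         // ≈ 1010, wider than the 320 output
let translateX = (exampleOutputSize.width - newWidth) / 2 / scale     // ≈ -656: centers the frame so the
                                                                      // excess width is cropped evenly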

320x568 video taken with Tim's code: (screenshot)

640x1136 video taken with Tim's code: (screenshot)


Solution

  • Try this. Start a new Single View project in Swift, replace the ViewController with this code and you should be good to go!

    I've set up a previewLayer that is a different size from the output; you can change it at the top of the file.

    I added some basic orientation support, so it outputs slightly different sizes for landscape vs. portrait. You can specify whatever video dimensions you like in here and it should work fine.

    Check out the videoSettings dictionary (line 278-ish) for the codec and size of the output file. You can also add other settings in here, such as key frame intervals, to tweak the output size.

    I added a recording image that shows while it's recording (tap starts, tap ends). You'll need to add an asset named recording to Assets.xcassets (or comment out line 106 where it tries to load it).

    That's pretty much it. Good luck!

    Oh, and it dumps the video into the app's Documents directory, so you'll need to go to Window / Devices and download the container to see it easily. In the TODO there's a spot where you could hook in and copy the file to the photo library (which makes testing WAY easier).

    import UIKit
    import AVFoundation
    
    class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate, AVCaptureAudioDataOutputSampleBufferDelegate {
    
    let CAPTURE_SIZE_LANDSCAPE: CGSize = CGSizeMake(1280, 720)
    let CAPTURE_SIZE_PORTRAIT: CGSize = CGSizeMake(720, 1280)
    
    var recordingImage : UIImageView = UIImageView()
    
    var previewLayer : AVCaptureVideoPreviewLayer?
    
    var audioQueue : dispatch_queue_t?
    var videoQueue : dispatch_queue_t?
    
    let captureSession = AVCaptureSession()
    var assetWriter : AVAssetWriter?
    var assetWriterInputCamera : AVAssetWriterInput?
    var assetWriterInputAudio : AVAssetWriterInput?
    var outputConnection: AVCaptureConnection?
    
    var captureDeviceBack : AVCaptureDevice?
    var captureDeviceFront : AVCaptureDevice?
    var captureDeviceMic : AVCaptureDevice?
    var sessionSetupDone: Bool = false
    
    var isRecordingStarted = false
    //var recordingStartedTime = kCMTimeZero
    var videoOutputURL : NSURL?
    
    var captureSize: CGSize = CGSizeMake(1280, 720)
    var previewFrame: CGRect = CGRectMake(0, 0, 180, 360)
    
    var captureDeviceTrigger = true
    var captureDevice: AVCaptureDevice? {
        get {
            return captureDeviceTrigger ? captureDeviceFront : captureDeviceBack
        }
    }
    
    override func supportedInterfaceOrientations() -> UIInterfaceOrientationMask {
        return UIInterfaceOrientationMask.AllButUpsideDown
    }
    
    override func shouldAutorotate() -> Bool {
        if isRecordingStarted {
            return false
        }
    
        if UIDevice.currentDevice().orientation == UIDeviceOrientation.PortraitUpsideDown {
            return false
        }
    
        if let cameraPreview = self.previewLayer {
            if let connection = cameraPreview.connection {
                if connection.supportsVideoOrientation {
                    switch UIDevice.currentDevice().orientation {
                    case .LandscapeLeft:
                        connection.videoOrientation = .LandscapeRight
                    case .LandscapeRight:
                        connection.videoOrientation = .LandscapeLeft
                    case .Portrait:
                        connection.videoOrientation = .Portrait
                    case .FaceUp:
                        return false
                    case .FaceDown:
                        return false
                    default:
                        break
                    }
                }
            }
        }
    
        return true
    }
    
    override func viewDidLoad() {
        super.viewDidLoad()
    
        setupViewControls()
    
        //self.recordingStartedTime = kCMTimeZero
    
        // Setup capture session related logic
        videoQueue = dispatch_queue_create("video_write_queue", DISPATCH_QUEUE_SERIAL)
        audioQueue = dispatch_queue_create("audio_write_queue", DISPATCH_QUEUE_SERIAL)
    
        setupCaptureDevices()
        pre_start()
    }
    
    //MARK: UI methods
    func setupViewControls() {
    
        // TODO: I have an image (red circle) in an Assets.xcassets. Replace the following with your own image
        recordingImage.frame = CGRect(x: 0, y: 0, width: 50, height: 50)
        recordingImage.image = UIImage(named: "recording")
        recordingImage.hidden = true
        self.view.addSubview(recordingImage)
    
    
        // Setup tap to record and stop
        let tapGesture = UITapGestureRecognizer(target: self, action: "didGetTapped:")
        tapGesture.numberOfTapsRequired = 1
        self.view.addGestureRecognizer(tapGesture)
    
    }
    
    
    
    func didGetTapped(selector: UITapGestureRecognizer) {
        if self.isRecordingStarted {
            self.view.gestureRecognizers![0].enabled = false
            recordingImage.hidden = true
    
            self.stopRecording()
        } else {
            recordingImage.hidden = false
            self.startRecording()
        }
    
        self.isRecordingStarted = !self.isRecordingStarted
    }
    
    func switchCamera(selector: UIButton) {
        self.captureDeviceTrigger = !self.captureDeviceTrigger
    
        pre_start()
    }
    
    //MARK: Video logic
    func setupCaptureDevices() {
        let devices = AVCaptureDevice.devices()
    
        for device in devices {
            if  device.hasMediaType(AVMediaTypeVideo) {
                if device.position == AVCaptureDevicePosition.Front {
                    captureDeviceFront = device as? AVCaptureDevice
                    NSLog("Video Controller: Setup. Front camera is found")
                }
                if device.position == AVCaptureDevicePosition.Back {
                    captureDeviceBack = device as? AVCaptureDevice
                    NSLog("Video Controller: Setup. Back camera is found")
                }
            }
    
            if device.hasMediaType(AVMediaTypeAudio) {
                captureDeviceMic = device as? AVCaptureDevice
                NSLog("Video Controller: Setup. Audio device is found")
            }
        }
    }
    
    func alertPermission() {
        let permissionAlert = UIAlertController(title: "No Permission", message: "Please allow access to Camera and Microphone", preferredStyle: UIAlertControllerStyle.Alert)
        permissionAlert.addAction(UIAlertAction(title: "Go to settings", style: .Default, handler: { (action: UIAlertAction!) in
            print("Video Controller: Permission for camera/mic denied. Going to settings")
            UIApplication.sharedApplication().openURL(NSURL(string: UIApplicationOpenSettingsURLString)!)
            print(UIApplicationOpenSettingsURLString)
        }))
        presentViewController(permissionAlert, animated: true, completion: nil)
    }
    
    func pre_start() {
        NSLog("Video Controller: pre_start")
        let videoPermission = AVCaptureDevice.authorizationStatusForMediaType(AVMediaTypeVideo)
        let audioPermission = AVCaptureDevice.authorizationStatusForMediaType(AVMediaTypeAudio)
        if  (videoPermission ==  AVAuthorizationStatus.Denied) || (audioPermission ==  AVAuthorizationStatus.Denied) {
            self.alertPermission()
            pre_start()
            return
        }
    
        if (videoPermission == AVAuthorizationStatus.Authorized) {
            self.start()
            return
        }
    
        AVCaptureDevice.requestAccessForMediaType(AVMediaTypeVideo, completionHandler: { (granted :Bool) -> Void in
            self.pre_start()
        })
    }
    
    func start() {
        NSLog("Video Controller: start")
        if captureSession.running {
            captureSession.beginConfiguration()
    
            if let currentInput = captureSession.inputs[0] as? AVCaptureInput {
                captureSession.removeInput(currentInput)
            }
    
            do {
                try captureSession.addInput(AVCaptureDeviceInput(device: captureDevice))
            } catch {
                print("Video Controller: begin session. Error adding video input device")
            }
    
            captureSession.commitConfiguration()
            return
        }
    
        do {
            try captureSession.addInput(AVCaptureDeviceInput(device: captureDevice))
            try captureSession.addInput(AVCaptureDeviceInput(device: captureDeviceMic))
        } catch {
            print("Video Controller: start. error adding device: \(error)")
        }
    
        if let layer = AVCaptureVideoPreviewLayer(session: captureSession) {
            self.previewLayer = layer
            layer.videoGravity = AVLayerVideoGravityResizeAspect
    
            if let layerConnection = layer.connection {
                if UIDevice.currentDevice().orientation == .LandscapeRight {
                    layerConnection.videoOrientation = AVCaptureVideoOrientation.LandscapeLeft
                } else if UIDevice.currentDevice().orientation == .LandscapeLeft {
                    layerConnection.videoOrientation = AVCaptureVideoOrientation.LandscapeRight
                } else if UIDevice.currentDevice().orientation == .Portrait {
                    layerConnection.videoOrientation = AVCaptureVideoOrientation.Portrait
                }
            }
    
            // TODO: Set the output size of the Preview Layer here
            layer.frame = previewFrame
            self.view.layer.insertSublayer(layer, atIndex: 0)
    
        }
    
        let bufferVideoQueue = dispatch_queue_create("sample buffer delegate", DISPATCH_QUEUE_SERIAL)
        let videoOutput = AVCaptureVideoDataOutput()
        videoOutput.setSampleBufferDelegate(self, queue: bufferVideoQueue)
        captureSession.addOutput(videoOutput)
        if let connection = videoOutput.connectionWithMediaType(AVMediaTypeVideo) {
            self.outputConnection = connection
        }
    
        let bufferAudioQueue = dispatch_queue_create("audio buffer delegate", DISPATCH_QUEUE_SERIAL)
        let audioOutput = AVCaptureAudioDataOutput()
        audioOutput.setSampleBufferDelegate(self, queue: bufferAudioQueue)
        captureSession.addOutput(audioOutput)
    
        captureSession.startRunning()
    }
    
    func getAssetWriter() -> AVAssetWriter? {
        NSLog("Video Controller: getAssetWriter")
        let fileManager = NSFileManager.defaultManager()
        let urls = fileManager.URLsForDirectory(.DocumentDirectory, inDomains: .UserDomainMask)
        guard let documentDirectory: NSURL = urls.first else {
            print("Video Controller: getAssetWriter: documentDir Error")
            return nil
        }
    
        let local_video_name = NSUUID().UUIDString + ".mp4"
        self.videoOutputURL = documentDirectory.URLByAppendingPathComponent(local_video_name)
    
        guard let url = self.videoOutputURL else {
            return nil
        }
    
    
        self.assetWriter = try? AVAssetWriter(URL: url, fileType: AVFileTypeMPEG4)
    
        guard let writer = self.assetWriter else {
            return nil
        }
    
        let videoSettings: [String : AnyObject] = [
            AVVideoCodecKey  : AVVideoCodecH264,
            AVVideoWidthKey  : captureSize.width,
            AVVideoHeightKey : captureSize.height,
        ]
    
        assetWriterInputCamera = AVAssetWriterInput(mediaType: AVMediaTypeVideo, outputSettings: videoSettings)
        assetWriterInputCamera?.expectsMediaDataInRealTime = true
        writer.addInput(assetWriterInputCamera!)
    
        let audioSettings : [String : AnyObject] = [
            AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
            AVNumberOfChannelsKey : 2,
            AVSampleRateKey : NSNumber(double: 44100.0)
        ]
    
        assetWriterInputAudio = AVAssetWriterInput(mediaType: AVMediaTypeAudio, outputSettings: audioSettings)
        assetWriterInputAudio?.expectsMediaDataInRealTime = true
        writer.addInput(assetWriterInputAudio!)
    
        return writer
    }
    
    func configurePreset() {
        NSLog("Video Controller: configurePreset")
        if captureSession.canSetSessionPreset(AVCaptureSessionPreset1280x720) {
            captureSession.sessionPreset = AVCaptureSessionPreset1280x720
        } else {
            captureSession.sessionPreset = AVCaptureSessionPreset1920x1080
        }
    }
    
    func startRecording() {
        NSLog("Video Controller: Start recording")
    
        captureSize = UIDeviceOrientationIsLandscape(UIDevice.currentDevice().orientation) ? CAPTURE_SIZE_LANDSCAPE : CAPTURE_SIZE_PORTRAIT
    
        if let connection = self.outputConnection {
    
            if connection.supportsVideoOrientation {
    
                if UIDevice.currentDevice().orientation == .LandscapeRight {
                    connection.videoOrientation = AVCaptureVideoOrientation.LandscapeLeft
                    NSLog("orientation: right")
                } else if UIDevice.currentDevice().orientation == .LandscapeLeft {
                    connection.videoOrientation = AVCaptureVideoOrientation.LandscapeRight
                    NSLog("orientation: left")
                } else {
                    connection.videoOrientation = AVCaptureVideoOrientation.Portrait
                    NSLog("orientation: portrait")
                }
            }
        }
    
        if let writer = getAssetWriter() {
            self.assetWriter = writer
    
            let recordingClock = self.captureSession.masterClock
            writer.startWriting()
            writer.startSessionAtSourceTime(CMClockGetTime(recordingClock))
        }
    
    }
    
    func stopRecording() {
        NSLog("Video Controller: Stop recording")
    
        if let writer = self.assetWriter {
            writer.finishWritingWithCompletionHandler {
                print("Recording finished")
                // TODO: Handle the video file, copy it from the temp directory etc.
            }
        }
    }
    
    //MARK: Implementation for AVCaptureVideoDataOutputSampleBufferDelegate, AVCaptureAudioDataOutputSampleBufferDelegate
    func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
    
        if !self.isRecordingStarted {
            return
        }
    
        if let audio = self.assetWriterInputAudio where connection.audioChannels.count > 0 && audio.readyForMoreMediaData {
    
            dispatch_async(audioQueue!) {
                audio.appendSampleBuffer(sampleBuffer)
            }
            return
        }
    
        if let camera = self.assetWriterInputCamera where camera.readyForMoreMediaData {
            dispatch_async(videoQueue!) {
                camera.appendSampleBuffer(sampleBuffer)
            }
        }
    }
    }
    

    Additional Edit Info

    It seems from our additional conversations in the comments that what you want is to reduce the file size of the output video while keeping the dimensions as high as you can (to retain quality). Remember, the size at which you position a layer on screen is in points, not pixels. You're writing an output file in pixels, so it's not a 1:1 comparison to the iPhone's screen units.
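
    To see the difference in numbers, here is a small sketch (my own illustration, not part of the answer) converting a layer's point size to pixels on the device:

        // Sketch: a 320x568-point layer on a 2x Retina screen (e.g. the 5S) is 640x1136 pixels.
        let previewPointSize = CGSizeMake(320, 568)             // what you set on the layer
        let screenScale = UIScreen.mainScreen().scale           // 2.0 on the iPhone 5S
        let previewPixelSize = CGSizeMake(previewPointSize.width * screenScale,
                                          previewPointSize.height * screenScale)
        // Exporting at 320x568 *pixels* therefore has only a quarter of the preview's pixels,
        // which is part of why it looks softer than the on-screen preview.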

    To reduce the size of the output file, you have two easy options:

    1. Reduce the resolution - but if you go too small, you'll lose quality on playback, especially if the video gets scaled back up. Try 640x360 or 720x480 for the output pixels (see the sketch after this list).
    2. Adjust the compression settings. The iPhone's defaults typically produce higher-quality (and therefore larger) output files.
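
    For option 1, here is a minimal sketch (a hypothetical helper, not part of the code above) that shrinks a capture size to a maximum dimension while preserving aspect ratio; the result can be fed into the width/height keys of videoSettings:

        // Hypothetical helper: shrink a size so its longest edge is at most maxDimension,
        // keeping the aspect ratio. E.g. 1280x720 with maxDimension 640 becomes 640x360.
        func scaledSize(size: CGSize, maxDimension: CGFloat) -> CGSize {
            let longestEdge = max(size.width, size.height)
            guard longestEdge > maxDimension else { return size }
            let scale = maxDimension / longestEdge
            return CGSizeMake(size.width * scale, size.height * scale)
        }

        // e.g. captureSize = scaledSize(captureSize, maxDimension: 640) before building the writer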

    Replace the video settings with these options and see how you go:

        let videoSettings: [String : AnyObject] = [
            AVVideoCodecKey  : AVVideoCodecH264,
            AVVideoWidthKey  : captureSize.width,
            AVVideoHeightKey : captureSize.height,
            AVVideoCompressionPropertiesKey : [
                AVVideoAverageBitRateKey : 2000000,
                AVVideoProfileLevelKey : AVVideoProfileLevelH264Main41,
                AVVideoMaxKeyFrameIntervalKey : 90,
            ]
        ]
    

    The AVVideoCompressionPropertiesKey settings tell AVFoundation how to actually compress the video. The lower the bit rate, the higher the compression (so it streams better and uses less disk space, but at lower quality). The max key frame interval is how often an uncompressed (key) frame is written out; setting it higher (in our ~30 frames per second video, 90 is once every 3 seconds) also reduces quality but decreases size too. You'll find the constants referenced here: https://developer.apple.com/library/prerelease/ios/documentation/AVFoundation/Reference/AVFoundation_Constants/index.html#//apple_ref/doc/constant_group/Video_Settings
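
    As a rough sanity check on those numbers (my own back-of-the-envelope estimate, not from the docs above): the video portion of the file is roughly bit rate × duration, so you can predict the output size before recording:

        // Rough size estimate with assumed values: a 2 Mbit/s stream over a 10-second clip.
        let videoBitRate: Double = 2_000_000                         // bits per second, as set above
        let clipDuration: Double = 10                                 // seconds
        let estimatedVideoBytes = videoBitRate * clipDuration / 8     // ≈ 2.5 MB, before audio and container overhead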