Tags: ios, avfoundation, avplayer, avvideocomposition

AVPlayer plays video composition result incorrectly


I need something simple: play a video while rotating it and applying a CIFilter to it.

First, I create the player item:

AVPlayerItem *playerItem = [AVPlayerItem playerItemWithURL:videoURL];

// DEBUG LOGGING
// firstObject returns nil instead of throwing if the asset has no video track
AVAssetTrack *track = [playerItem.asset tracksWithMediaType:AVMediaTypeVideo].firstObject;
NSLog(@"Natural size is: %@", NSStringFromCGSize(track.naturalSize));
NSLog(@"Preferred track transform is: %@", NSStringFromCGAffineTransform(track.preferredTransform));
NSLog(@"Preferred asset transform is: %@", NSStringFromCGAffineTransform(playerItem.asset.preferredTransform));

Then I need to apply the video composition. Originally, I planned to create an AVVideoComposition with two instructions: an AVVideoCompositionLayerInstruction for the rotation and a second one that applies the CIFilter. However, that throws an exception saying "Expecting video composition to contain only AVCoreImageFilterVideoCompositionInstruction", which means Apple doesn't allow combining those two instructions. As a result, I do both inside the filtering handler. Here is the code:

AVAsset *asset = playerItem.asset;
CGAffineTransform rotation = [self transformForItem:playerItem];

AVVideoComposition *composition = [AVVideoComposition videoCompositionWithAsset:asset applyingCIFiltersWithHandler:^(AVAsynchronousCIImageFilteringRequest * _Nonnull request) {
    // Step 1: get the input frame image (screenshot 1)
    CIImage *sourceImage = request.sourceImage;

    // Step 2: rotate the frame
    CIFilter *transformFilter = [CIFilter filterWithName:@"CIAffineTransform"];
    [transformFilter setValue:sourceImage forKey: kCIInputImageKey];
    [transformFilter setValue: [NSValue valueWithCGAffineTransform: rotation] forKey: kCIInputTransformKey];
    sourceImage = transformFilter.outputImage;
    CGRect extent = sourceImage.extent;
    CGAffineTransform translation = CGAffineTransformMakeTranslation(-extent.origin.x, -extent.origin.y);
    [transformFilter setValue:sourceImage forKey: kCIInputImageKey];
    [transformFilter setValue: [NSValue valueWithCGAffineTransform: translation] forKey: kCIInputTransformKey];
    sourceImage = transformFilter.outputImage;

    // Step 3: apply the custom filter chosen by the user
    extent = sourceImage.extent;
    sourceImage = [sourceImage imageByClampingToExtent];
    [filter setValue:sourceImage forKey:kCIInputImageKey];
    sourceImage = filter.outputImage;
    sourceImage = [sourceImage imageByCroppingToRect:extent];
    
    // Step 4: finish processing the frame (screenshot 2)
    [request finishWithImage:sourceImage context:nil];
}];

playerItem.videoComposition = composition;

The screenshots I took while debugging show that the image is rotated successfully and the filter is applied (in this example it was an identity filter, which doesn't change the image). Here are screenshot 1 and screenshot 2, taken at the points marked in the comments above:

[Image: screenshot 1, the input frame]

[Image: screenshot 2, the rotated and filtered frame]

As you can see, the rotation is successful, and the extent of the resulting frame is also correct.

The problem starts when I try to play this video in a player. Here is what I get:

[Image: the video as shown in the player]

It seems that all the frames are scaled and shifted down. The green area is empty frame data; when I clamp to extent to make the frame infinite in size, it shows border pixels instead of green. I suspect that the player still takes some old, pre-rotation size info from the AVPlayerItem. That's why I logged the sizes and transforms in the first code snippet above. Here are the logs:

Natural size is: {1920, 1080}
Preferred track transform is: [0, 1, -1, 0, 1080, 0]
Preferred asset transform is: [1, 0, 0, 1, 0, 0]

The player is set up like this:

layer.videoGravity = AVLayerVideoGravityResizeAspectFill;
layer.needsDisplayOnBoundsChange = YES;

PLEASE NOTE the most important thing: this only happens with videos that the app itself recorded with the camera in landscape orientation on an iPhone (6s) and previously saved to device storage. Videos that the app records in portrait mode are totally fine. (By the way, the portrait videos produce exactly the same size and transform logs as the landscape videos, which is strange; maybe the iPhone puts the rotation info in the video and fixes it.) So the zooming and shifting looks like a combination of "aspect fill" and old, pre-rotation resolution info. The portrait video frames are also shown only partially, because they are scaled to fill a player area with a different aspect ratio, but that is expected behavior.

Let me know your thoughts on this, and if you know a better way to accomplish what I need, I'd be glad to hear it.


Solution

  • UPDATE: It turns out there is an easier way to "change" the AVPlayerItem video dimensions during playback: set the renderSize property of the video composition (this can be done using the AVMutableVideoComposition class).

    MY OLD ANSWER BELOW:


    After a lot of debugging I understood the problem and found a solution. My initial guess that AVPlayer still treats the video as having its original size was correct. The image below explains what was happening:

    [Image: diagram of what was happening]

    As for the solution, I couldn't find a way to change the video size inside AVAsset or AVPlayerItem. So I manipulated the video to fit the size and scale that AVPlayer was expecting; when it is then played in a player with the correct aspect ratio and the flag to scale and fill the player area, everything looks good. Here is the graphical explanation:

    [Image: graphical explanation of the solution]

    And here is the additional code that needs to be inserted into the applyingCIFiltersWithHandler block from the question:

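    // ... after Step 3 in the question's code above
    // NOTE: originalExtent is not defined in the question's code; it is assumed here to be
    // request.sourceImage.extent, saved in Step 1 before the frame is rotated.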
    
    // make the frame the same aspect ratio as the original input frame
    // by adding empty spaces at the top and the bottom of the extent rectangle
    CGFloat newHeight = originalExtent.size.height * originalExtent.size.height / extent.size.height;
    CGFloat inset = (extent.size.height - newHeight) / 2;
    extent = CGRectInset(extent, 0, inset);
    sourceImage = [sourceImage imageByCroppingToRect:extent];
    
    // scale down to the original frame size
    CGFloat scale = originalExtent.size.height / newHeight;
    CGAffineTransform scaleTransform = CGAffineTransformMakeScale(scale, scale);
    [transformFilter setValue:sourceImage forKey: kCIInputImageKey];
    [transformFilter setValue: [NSValue valueWithCGAffineTransform: scaleTransform] forKey: kCIInputTransformKey];
    sourceImage = transformFilter.outputImage;
    
    // translate the frame so that its origin starts at (0, 0)
    // (named originTranslation because `translation` is already declared in Step 2 of the same block)
    CGAffineTransform originTranslation = CGAffineTransformMakeTranslation(0, -inset * scale);
    [transformFilter setValue:sourceImage forKey: kCIInputImageKey];
    [transformFilter setValue: [NSValue valueWithCGAffineTransform: originTranslation] forKey: kCIInputTransformKey];
    sourceImage = transformFilter.outputImage;
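
The arithmetic in the snippet above can be traced with a small plain-C sketch. It only follows the `extent` rectangle through the inset, scale and translate steps; it does not model Core Image itself (in particular, whether cropping to a larger rectangle really enlarges the image's extent). `Rect` and `fit_to_original` are hypothetical names, and the sizes in the usage note are assumed for illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical plain-C stand-in for CGRect. */
typedef struct { double x, y, width, height; } Rect;

/* Follows the rectangle through the same three steps as the snippet:
   inset vertically, scale about the origin, translate vertically. */
Rect fit_to_original(Rect original, Rect extent) {
    assert(original.height > 0 && extent.height > 0);

    /* same formulas as in the Objective-C code */
    double newHeight = original.height * original.height / extent.height;
    double inset = (extent.height - newHeight) / 2;

    /* CGRectInset(extent, 0, inset): a negative inset grows the rect */
    Rect r = { extent.x, extent.y + inset,
               extent.width, extent.height - 2 * inset };

    /* uniform scale about the origin */
    double scale = original.height / newHeight;
    r.x *= scale; r.y *= scale; r.width *= scale; r.height *= scale;

    /* translation by (0, -inset * scale) */
    r.y += -inset * scale;
    return r;
}
```

Assuming an original extent of {0, 0, 1080, 1920} and a rotated extent of {0, 0, 1920, 1080}, the intermediate values are newHeight = 3413.33, inset = -1166.67 and scale = 0.5625, and the rectangle ends up at {0, 0, 1080, 1920}, which is the size the player was expecting.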