ffmpeg, h.264, video-encoding, libx264, video-compression

What ffmpeg arguments will approximate Zoom recording quality?


I've been recording screen-sharing presentations with QuickTime on my Mac. It encodes H.264 video at ~60 fps and produces a MOV file of around 2.2 GB for a 1-hour presentation. I want to compress it with ffmpeg, and I've been using libx264 for that. Here are my arguments:

ffmpeg -i "$inputFile" -vcodec libx264 -crf 32 -vf "scale=${width}:-2,fps=24" -c:a aac -b:a 128k -preset veryslow -profile:v high -tune stillimage -f mp4 "$outputFile"

I rescale the video to a width of 1600 px to save space, and I convert the recording to 24 fps because I see no need for the full ~60 fps. The content is mostly static images while I talk over my screen. This results in a file of about 100 MB with the -profile:v high argument; without it, the file is around 160 MB.

Zoom recordings at much larger resolutions (4K etc.), on the other hand, are around 80 MB per hour. Does anyone know which options I can use to approximate this file size and quality? I know Zoom uses lower-quality audio, which might explain some of the difference.
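To put these file sizes in perspective, it can help to convert them to average bitrates. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal megabytes (1 MB = 8000 kbit), exactly one hour of material, and an audio track that actually averages its -b:a target. The helper names avg_kbps and audio_mb are made up for this illustration.

```shell
# Average total bitrate in kbit/s for a file of $1 decimal megabytes
# lasting $2 seconds (integer arithmetic, rounds down).
avg_kbps() {
    echo $(( $1 * 8000 / $2 ))
}

# Approximate size in MB of an audio track encoded at $1 kbit/s
# lasting $2 seconds (integer arithmetic, rounds down).
audio_mb() {
    echo $(( $1 * $2 / 8000 ))
}

# Example calls for a one-hour (3600 s) recording:
#   avg_kbps 2200 3600   # QuickTime original, prints 4888
#   avg_kbps 100 3600    # my current output,  prints 222
#   avg_kbps 80 3600     # Zoom-sized target,  prints 177
#   audio_mb 128 3600    # 128k AAC track,     prints 57
```

If those assumptions hold, the 128k audio track alone would account for roughly 57 MB of the ~100 MB output, so an 80 MB target leaves very little room for video unless the audio bitrate is also lowered.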

But if I raise -crf above 32, quality degrades too much. I am not sure how Zoom achieves its video quality at high resolutions such as 1080p and 4K with a file size of ~80 MB, while I can't match it at 1600 px width.

Edit: It occurred to me that I probably don't need all 24 fps for a screen share of static content. So I reduced it to 5 fps, and that seems to work well for my use case. I wonder if this is what Zoom does?
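For reference, the 5 fps variant might look like the sketch below. It reuses the flags from my command above and only makes the width and frame rate parameters. The function name encode_low_fps and the file names in the usage comment are placeholders, not anything ffmpeg defines.

```shell
# Re-encode a screen recording at a reduced, constant frame rate.
# $1 = input file, $2 = output file, $3 = target width in pixels,
# $4 = output frames per second
encode_low_fps() {
    # scale=W:-2 keeps the aspect ratio and rounds the height to an even
    # number, which libx264 needs for the usual 4:2:0 pixel format.
    ffmpeg -i "$1" \
        -vcodec libx264 -crf 32 \
        -vf "scale=$3:-2,fps=$4" \
        -c:a aac -b:a 128k \
        -preset veryslow -profile:v high -tune stillimage \
        -f mp4 "$2"
}

# Example call (hypothetical file names):
#   encode_low_fps recording.mov recording-5fps.mp4 1600 5
```

The fps filter still emits frames at a fixed rate, so long stretches of an unchanged slide are encoded as repeated frames; they compress very well, but they are not dropped.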


Solution

  • I found out how to do variable-frame-rate encoding with ffmpeg while preserving audio and video synchronization.

    I used the arguments -vf mpdecimate -vsync vfr to remove duplicate frames. A screen recording contains a lot of duplicate frames, so removing them results in a much smaller file. I also use -crf 36, which I found still produces good results now that I encode at native resolution.

    This gives a very good native-resolution encoding with a slightly larger file size than Zoom's, but with better quality for both audio and video. I am happy with it.

    I created a tool to automate the conversion: https://github.com/stanimirivanovde/general-tools/tree/master/ffmpeg-encoding

    I also experimented with x265, but I didn't find it superior. Encoding was much slower than with x264, which is a no-go for me. I tried increasing the CRF to 40, but this resulted in poor text quality.
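Put together, the duplicate-frame approach from this answer might look roughly like the sketch below. Only -vf mpdecimate, -vsync vfr and -crf 36 come from the answer itself. The preset and audio settings are carried over from the command in the question as an assumption, and the function name encode_vfr is a placeholder. The linked tool may use different flags. Newer FFmpeg releases deprecate -vsync in favor of -fps_mode, so -fps_mode vfr may be needed there instead.

```shell
# Drop near-duplicate frames and encode at native resolution with a
# variable frame rate.
# $1 = input file, $2 = output file
encode_vfr() {
    # mpdecimate discards frames that barely differ from the previous one;
    # -vsync vfr keeps the surviving frames' timestamps instead of
    # duplicating frames to restore a constant rate, so audio stays in sync.
    ffmpeg -i "$1" \
        -vf mpdecimate -vsync vfr \
        -c:v libx264 -crf 36 -preset veryslow \
        -c:a aac -b:a 128k \
        "$2"
}

# Example call (hypothetical file names):
#   encode_vfr recording.mov recording-vfr.mp4
```

No scale or fps filter is applied here, so the output keeps the source resolution and only stores frames where the screen actually changed.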