My original clip was 22:47 long. I captured the video to AVI with the Ut Video lossless codec at 29.97 fps, with 16-bit unsigned PCM audio. I am using VirtualDub with the VHScrCap driver for capture. VirtualDub, MPC and PotPlayer all play the captured file too fast: the audio pitch is right for the first 3-4 minutes but too high for the rest of the video. The captured file is 19:06 long (confirmed by MediaInfo), shorter than the original 22:47. The cause of the problem seems to be that I am losing more frames when capturing large HD frames.
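To put a rough number on the loss, the two durations quoted above can be compared directly. This is only a back-of-the-envelope sketch: the helper name `to_seconds` is made up for illustration, and it assumes the whole shortfall comes from dropped frames.

```shell
# Convert a mm:ss duration to seconds (hypothetical helper, not part of any tool).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

original=$(to_seconds 22:47)   # length of the source clip
captured=$(to_seconds 19:06)   # length of the captured AVI, per MediaInfo

# Fraction of the original running time that survived the capture.
kept=$(awk -v c="$captured" -v o="$original" 'BEGIN { printf "%.3f", c / o }')
echo "original=${original}s captured=${captured}s kept=${kept}"
```

With these figures the capture keeps about 84% of the original running time, so roughly one frame in six would be missing overall if the assumption holds.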
Regular encoding
Encoding captured clip to mp4:
ffmpeg -ss 3.25 -i input.avi -map 0:0 -map 0:1 -threads 0 -c:v libx264 -profile:v main \
-preset:v medium -level 3.1 -x264opts crf=26.0 -aspect 16:9 -t 1112.69 \
-y -f mp4 -vf "crop=1432:808:4:46, hqdn3d=1.5:1.5:6:6, \
scale=1216:684, pad=1280:720:32:18" -c:a ac3 -ac 2 -ar 48000 -b:a 160k \
output.mp4
The output is 18:32 long and the frame rate is still 29.97. The audio pitch is OK for the first 2 minutes and far too high for the rest of the video.
Trying to correct
I tried to correct it in three steps: (1) encoding a video stream slowed down to 23.976 fps and extracting a WAV audio stream, (2) slowing down the audio, which lowers speed and pitch together, and (3) remuxing video and audio.
(1) The video is slowed down and the audio extracted in a single ffmpeg call:
ffmpeg -ss 3.25 -i input.avi -threads 0 \
-c:v libx264 -profile:v main -preset:v medium -level 3.1 -x264opts crf=26.0 \
-aspect 16:9 -t 1390.862 -an -y -f mp4 -r 24000/1001 \
-vf "crop=1432:808:4:46, hqdn3d=1.5:1.5:6:6, scale=1216:684, pad=1280:720:32:18, \
setpts=1.25*PTS" video_out.mp4 \
-t 1112.69 -y -vn -f wav audio_out.wav
(2) The WAV audio stream is then slowed down with sox, which also lowers its pitch:
sox --norm audio_out.wav audio_out-24.wav speed 0.8
(3) The two streams are then remuxed with:
ffmpeg -i video_out.mp4 -i audio_out-24.wav -map 0:0 -map 1:0 -c:v copy \
-c:a ac3 -ac 2 -af aresample=resampler=soxr -ar 48000 -b:a 160k \
final_output.mp4
This time the video duration (23:10) is closer to the original, and the pitch is OK for the whole video except the first 2-3 minutes, where it is (predictably) too low.
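The numbers used in the three steps are all the same constant factor in disguise, which is why this approach cannot fix both parts of the clip at once. A quick check using only the values from the commands above (assuming 29.97 fps means 30000/1001, the NTSC counterpart of the 24000/1001 passed to -r):

```shell
# Slow-down factor implied by going from 30000/1001 fps to 24000/1001 fps.
factor=$(awk 'BEGIN { printf "%.2f", (30000 / 1001) / (24000 / 1001) }')

# Duration after setpts=1.25*PTS; this is where the -t 1390.862 of step (1) comes from.
stretched=$(awk -v f="$factor" 'BEGIN { printf "%.4f", 1112.69 * f }')

# The sox "speed" value of step (2) is the inverse of the setpts factor.
sox_speed=$(awk -v f="$factor" 'BEGIN { printf "%.1f", 1 / f }')

echo "factor=$factor stretched=${stretched}s sox_speed=$sox_speed"
```

Since one factor is applied to the whole file, the opening minutes that were captured at the right speed get slowed down along with everything else.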
I have a sense that (1) the capture log and ffprobe give frame-by-frame information that shows the 'instantaneous' real frame rate, and (2) that information is not used by ffmpeg when encoding, but presumably could be used to restore the correct frame rate by inserting duplicate or interpolated frames. I suspect I could get the information from (1), but have no clue how to do (2).
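For (1), something along these lines might expose the local frame rate. It is a sketch under a big assumption: that the timestamps reflect the real capture times. An AVI normally stores a constant frame rate, so the capture log may be the only place where the real timing survives. `frame_times` simply wraps ffprobe; `local_fps` is a hypothetical helper that works on any list of timestamps in seconds, one per line.

```shell
# Print one timestamp (in seconds) per video frame of the file given as $1.
# Only defined here, not called: it needs ffprobe and a real input file.
frame_times() {
    ffprobe -v error -select_streams v:0 \
        -show_entries frame=best_effort_timestamp_time -of csv=p=0 "$1"
}

# Read timestamps on stdin and print "window_start frames_per_second"
# for consecutive windows of $1 seconds (hypothetical helper).
# The last window is usually only partly filled, so its rate reads low.
local_fps() {
    awk -v w="$1" '
        { i = int($1 / w); n[i]++; if (i > max) max = i }
        END { for (i = 0; i <= max; i++) printf "%d %.2f\n", i * w, n[i] / w }
    '
}
```

Usage would be something like `frame_times input.avi | local_fps 10`; windows where the rate falls well below 29.97 would mark the stretches where frames went missing.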
If someone familiar with this type of issue could give me some advice and point me in the right direction, I would really appreciate it.
Well, if anyone is interested, here is where I stand.
I am not sure if this is THE answer, but it is my answer for now. I found out that trying to correct and improve a poorly captured video is not a very good idea. This is what I am now trying to do to avoid loss of frames during capture and obtain a good quality video. Note: an easy way to find out if the capture is good is to watch the number of inserted frames versus total frames captured (I use VirtualDub to capture, and those numbers are displayed in real time). Try to get zero inserted frames.
Given those precautions, I can capture these videos with virtually no lost frames, and playback is then smooth.
For further study: I have been wondering if trading a lower frame rate for a higher definition could be a good trade-off. For example, capturing at 20 fps instead of 23.976, and then finding a way to add frames later in a way that does not shock the eye. (I assume that should be done with AviSynth's ConvertFPS() function, not ffmpeg.) I have not experimented with this method yet.
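To get a feel for what "adding frames" means in the simplest case, here is a sketch of plain duplication: each output frame shows whichever source frame is on screen at that moment. This is not what ConvertFPS() does (as far as I know it blends neighbouring frames, while ChangeFPS() is the duplicating variant); the helper name `dup_map` is made up for illustration.

```shell
# For the first $3 output frames, print the index of the source frame to show
# when going from $1 fps to $2 fps by duplication only (hypothetical helper).
# Rates must be plain numbers such as 23.976, not fractions like 24000/1001.
dup_map() {
    awk -v src="$1" -v dst="$2" -v count="$3" 'BEGIN {
        for (k = 0; k < count; k++) {
            # Output frame k is shown at time k/dst; the source frame on
            # screen at that time has index floor(k * src / dst).
            printf "%s%d", (k ? " " : ""), int(k * src / dst + 1e-9)
        }
        printf "\n"
    }'
}

dup_map 20 23.976 12
```

A source index that appears twice in the printed list is a duplicated frame. Going from 20 to 23.976 fps that happens for about one output frame in six, and that regular repeat is what tends to show up as judder.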