For my application I'm adapting the TensorFlow streaming example from the ffmpeg-python GitHub repository.
It decodes each frame of your input, lets you process it in Python, and encodes it again.
To optimize things I add an fps filter at the input to halve the frame rate, so I only have to process half the frames, and then interpolate frames in the encoder with minterpolate to restore the original fps.
My decoder looks like this:
import subprocess

import ffmpeg

def decoder(in_filename):
    args = (
        ffmpeg
        .input(in_filename)
        .filter('fps', fps=30/2)
        # ... more filters in between ...
        .output('pipe:', format='rawvideo', pix_fmt='rgb24')
        .compile()
    )
    return subprocess.Popen(args, stdout=subprocess.PIPE)
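For context, I read raw frames from the decoder's stdout roughly like this (the numpy reshaping follows the ffmpeg-python TensorFlow example; width and height come from probing the input beforehand):

import numpy as np

def read_frame(decoder_process, width, height):
    # Each rgb24 frame on the decoder's stdout is width * height * 3 bytes.
    frame_size = width * height * 3
    in_bytes = decoder_process.stdout.read(frame_size)
    if len(in_bytes) == 0:
        return None  # end of stream
    return np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3])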
And my encoder, which receives the frames after they have been processed in Python:
def encoder(out_filename, width, height):
    args = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
        .filter('minterpolate', fps=30)
        .filter('fps', fps=30)
        .output(out_filename, pix_fmt='rgb24')
        .overwrite_output()
        .compile()
    )
    return subprocess.Popen(args, stdin=subprocess.PIPE)
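Each processed frame then gets written back to the encoder's stdin, roughly like this (write_frame is just my own helper name):

def write_frame(encoder_process, frame):
    # frame is a (height, width, 3) uint8 numpy array in rgb24 order.
    encoder_process.stdin.write(frame.astype(np.uint8).tobytes())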
After that, I horizontally stack the original input with the processed video:
subprocess.run("ffmpeg -i {} -i {} -filter_complex hstack=inputs=2 -pix_fmt yuv420p -c:v libx264 {}".format(in_filename, out_filename, "out.mp4"))
Here's the problem: the "processed" video plays faster and ends before the original. It's as if both follow the same timestamps but frames were never actually interpolated. What am I doing wrong?
The encoder's .input() must specify a frame rate that matches the decoder's output frame rate (15 frames/s). Currently it uses whatever the default frame rate of the rawvideo demuxer is. With plain FFmpeg that would be the -r 15 input option; I'm not sure what it is in ffmpeg-python, but it could simply be the r=15 keyword argument.
P.S.: .filter('fps', fps=30) is redundant.
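A sketch of the adjusted encoder, assuming ffmpeg-python forwards r=15 as an input option placed before -i (I haven't tested this):

def encoder(out_filename, width, height):
    args = (
        ffmpeg
        # r=15 tells ffmpeg the piped raw frames arrive at 15 fps, matching the decoder.
        .input('pipe:', format='rawvideo', pix_fmt='rgb24',
               s='{}x{}'.format(width, height), r=15)
        .filter('minterpolate', fps=30)  # interpolate back up to 30 fps
        .output(out_filename, pix_fmt='rgb24')
        .overwrite_output()
        .compile()
    )
    return subprocess.Popen(args, stdin=subprocess.PIPE)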