python ffmpeg video-processing ffmpeg-python

FFmpeg-Python audio getting dropped in final video

I'm trying to place a video on top of a background image, but the output video is generated without audio. Is there any way to keep the original audio?

def ConvertVideo(source,background,start,end,dest):
    stream = ffmpeg.input(source)
    strea1 = ffmpeg.input(background)
    duration = end - start
    stream = stream.trim(start=start,duration=duration).filter('setpts', 'PTS-STARTPTS')
    stream = stream.crop(100,0,1080,720)
    stream = ffmpeg.overlay(strea1,stream,x=0,y=180)
    stream = stream.output(dest)

Does anyone know why audio gets dropped? Is there any workaround to this problem?

Solution

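The trim, crop and overlay filters in the question operate on the video stream only, so the output ends up containing just the filtered video. We have to add the audio stream to the output explicitly.
Something like: ffmpeg.output(overlaid_vid_stream, audio_stream, dest).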

Start by creating a synthetic input video and a background image with the FFmpeg CLI (only to make the posted solution reproducible and self-contained).

Create input video with audio:

 ffmpeg -y -f lavfi -i testsrc=size=128x72:rate=1 -f lavfi -i sine=frequency=400 -f lavfi -i sine=frequency=1000 -filter_complex "[1:a][2:a]amix=inputs=2" -vcodec libx264 -g 10 -crf 17 -pix_fmt yuv420p -acodec aac -ar 22050 -t 50 in.mp4

Create yellow background image:

 ffmpeg -y -f lavfi -i color=yellow:size=128x72 -frames:v 1 background.png

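The example uses a 128x72 frame, one tenth of your input size (assuming your input is 1280x720), so the crop and overlay values below are the values from your code divided by 10.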
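The arithmetic behind the scaled-down values can be sketched as follows. The helper name scale_down is made up for illustration, and the 1280x720 source size is the same assumption as above.

```python
def scale_down(values, divisor):
    """Divide each pixel value by `divisor` using integer division."""
    return tuple(v // divisor for v in values)

# Values from the question, written for an assumed 1280x720 source.
question_crop = (100, 0, 1080, 720)  # x, y, width, height
question_overlay = (0, 180)          # x, y

# Values used with the 128x72 synthetic test video.
example_crop = scale_down(question_crop, 10)
example_overlay = scale_down(question_overlay, 10)

print(example_crop, example_overlay)  # (10, 0, 108, 72) (0, 18)
```

When you run the solution on your real video, use the original values from your code instead of the scaled ones.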

The following code sample applies trim, setpts, crop and overlay filters, and adds the source audio:

import ffmpeg

source = 'in.mp4'
background = 'background.png'
dest = 'out.mp4'
end = 40
start = 10

duration = end - start

vid_stream = ffmpeg.input(source).video  # Source video stream
audio_stream = ffmpeg.input(source).audio  # Source audio stream
vid_background = ffmpeg.input(background).video  # Background video stream

trimed_vid_stream = vid_stream.trim(start=start, duration=duration).filter('setpts', 'PTS-STARTPTS').crop(10, 0, 108, 72)  # Source video stream after trimming and cropping
overlaid_vid_stream = ffmpeg.overlay(vid_background, trimed_vid_stream, x=0, y=18)  # Video stream overlay of background and trimed_vid_stream

trimed_audio_stream = audio_stream.filter('atrim', start=start, duration=duration).filter('asetpts', 'PTS-STARTPTS')  # Trimming the audio
output_video_and_audio = ffmpeg.output(overlaid_vid_stream, trimed_audio_stream, dest)  # Output - video applies overlaid_vid_stream, and audio applies trimmed source audio

output_video_and_audio.overwrite_output().run() # Execute FFmpeg

I changed your variable names to be more meaningful (naming everything stream makes the code difficult to follow).
I also removed the function wrapper and set the arguments to specific values (only to make the posted solution reproducible).
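If you want the function form from your question back, the same filter graph can be wrapped roughly like this. This is a sketch, not tested against your files: the name convert_video is my own, and the crop and overlay values are the ones from your question, which assume a 1280x720 source.

```python
def convert_video(source, background, start, end, dest):
    """Overlay a trimmed, cropped clip on a background image and keep its audio."""
    import ffmpeg  # ffmpeg-python; imported here so the module loads without it

    duration = end - start
    src = ffmpeg.input(source)
    bg = ffmpeg.input(background)

    # Video branch: trim, reset timestamps, crop, then overlay on the background.
    video = (
        src.video
        .trim(start=start, duration=duration)
        .filter('setpts', 'PTS-STARTPTS')
        .crop(100, 0, 1080, 720)  # x, y, width, height (values from the question)
    )
    video = ffmpeg.overlay(bg.video, video, x=0, y=180)

    # Audio branch: trim to the same time range and reset timestamps.
    audio = (
        src.audio
        .filter('atrim', start=start, duration=duration)
        .filter('asetpts', 'PTS-STARTPTS')
    )

    # Pass both streams to the output so the audio is mapped explicitly.
    ffmpeg.output(video, audio, dest).overwrite_output().run()
```

A call would look like convert_video('in.mp4', 'background.png', 10, 40, 'out.mp4'), with the geometry adjusted to your actual frame size.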

The solution applies the following main stages:

Create references to the video and the audio streams:

 vid_stream = ffmpeg.input(source).video  # Source video stream
 audio_stream = ffmpeg.input(source).audio  # Source audio stream

Define video filters:

 vid_background = ffmpeg.input(background).video  # Background video stream
 trimed_vid_stream = vid_stream.trim(start=start, duration=duration).filter('setpts', 'PTS-STARTPTS').crop(10, 0, 108, 72)  # Source video stream after trimming and cropping
 overlaid_vid_stream = ffmpeg.overlay(vid_background, trimed_vid_stream, x=0, y=18)  # Video stream overlay of background and trimed_vid_stream

Define audio filters (trimming the audio):

 trimed_audio_stream = audio_stream.filter('atrim', start=start, duration=duration).filter('asetpts', 'PTS-STARTPTS')  # Trimming the audio

Define the output to include the "filtered" video and the trimmed audio:

 output_video_and_audio = ffmpeg.output(overlaid_vid_stream, trimed_audio_stream, dest)  # Output - video applies overlaid_vid_stream, and audio applies trimmed source audio

Execute FFmpeg:

 output_video_and_audio.overwrite_output().run() # Execute FFmpeg
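To see why the explicit audio stream matters, it can help to look at a roughly equivalent FFmpeg command line. The sketch below builds one by hand for the 128x72 example. The function name build_ffmpeg_args and the labels [fg], [v] and [a] are my own choices; ffmpeg-python generates its own labels, and output_video_and_audio.get_args() shows the arguments it actually produces.

```python
def build_ffmpeg_args(source, background, start, end, dest):
    """Build a hand-written FFmpeg argument list approximating the graph above."""
    duration = end - start
    filter_graph = ";".join([
        # Video: trim, reset timestamps, crop (the crop filter takes w:h:x:y).
        f"[0:v]trim=start={start}:duration={duration},"
        "setpts=PTS-STARTPTS,crop=108:72:10:0[fg]",
        # Overlay the cropped clip on the background image.
        "[1:v][fg]overlay=x=0:y=18[v]",
        # Audio: trim to the same range and reset timestamps.
        f"[0:a]atrim=start={start}:duration={duration},asetpts=PTS-STARTPTS[a]",
    ])
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-i", background,
        "-filter_complex", filter_graph,
        "-map", "[v]",  # filtered video
        "-map", "[a]",  # trimmed audio; without this map the output has no audio
        dest,
    ]

args = build_ffmpeg_args('in.mp4', 'background.png', 10, 40, 'out.mp4')
print(" ".join(args))
```

The list can be passed to subprocess.run(args, check=True) if you prefer calling the CLI directly; that requires ffmpeg on the PATH.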

Sample output frame:

The output video includes the beeping audio from the source.
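Instead of listening to the file, you can check for the audio stream programmatically. The sketch below inspects the dictionary that ffmpeg.probe returns; the helper name stream_types and the sample dictionary are made up for illustration and contain only the field used here.

```python
def stream_types(probe_info):
    """Return the codec_type of every stream in an ffprobe result dictionary."""
    return [s.get('codec_type') for s in probe_info.get('streams', [])]

# Stand-in for ffmpeg.probe('out.mp4'), trimmed to the field used here.
sample_probe = {'streams': [{'codec_type': 'video'}, {'codec_type': 'audio'}]}

print('audio' in stream_types(sample_probe))  # True
```

With the real file, the call would be stream_types(ffmpeg.probe('out.mp4')), which needs ffprobe to be installed alongside ffmpeg.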