Merge 2 video files with their audio in a single file side by side


I need to merge 2 video files along with their audio in a single file, side by side.

I can merge the 2 video files into a single file, but the result uses the audio from the first video file only, whereas I need the audio of the second file merged in as well.

Here's how I'm trying to do it:

ffmpeg.exe -i input1.webm -vf "[in] scale=iw/2:ih/2, pad=2*iw:ih [left]; movie=input2.webm, scale=iw/2:ih/2 [right]; [left][right] overlay=main_w/2:0 [out]" -b:v 768k ouput.webm

I have tried various approaches with amerge, but without success. As I am new to FFmpeg, I am not sure how to achieve this.

EDIT

Below is the FFmpeg command I used to merge both files, as suggested by @occvtech, but it still does not merge the second audio stream.

ffmpeg.exe -i 3.mp4 -i 4.mp4 -filter_complex "[0:v] scale=iw/2:ih/2,pad=2*iw:ih[left];[1:v]scale=iw/2:ih/2[right];[left][right]overlay=main_w/2:0 [out]" -map [out] -map 0:a -map 1:a -b:v 768k o5.mp4

Below is the console output:

ffmpeg version N-72276-gf99fed7 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 4.9.2 (GCC)
  configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-av
isynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enab
le-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --
enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-l
ibilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enab
le-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --en
able-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --ena
ble-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc
 --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enabl
e-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-lzma --ena
ble-decklink --enable-zlib
  libavutil      54. 23.101 / 54. 23.101
  libavcodec     56. 39.101 / 56. 39.101
  libavformat    56. 33.101 / 56. 33.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 16.101 /  5. 16.101
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '3.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.23.106
  Duration: 00:00:13.52, start: 0.023220, bitrate: 968 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yu
v420p, 640x480, 889 kb/s, 10 fps, 10 tbr, 10240 tbn, 20 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp,
 76 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '4.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.23.106
  Duration: 00:00:14.92, start: 0.023220, bitrate: 1049 kb/s
    Stream #1:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yu
v420p, 640x480, 971 kb/s, 10 fps, 10 tbr, 10240 tbn, 20 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #1:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp,
 75 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
[libx264 @ 040d20e0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 040d20e0] profile High, level 2.1
[libx264 @ 040d20e0] 264 - core 146 r2538 121396c - H.264/MPEG-4 AVC codec - Cop
yleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deb
lock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 m
e_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chro
ma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 i
nterlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=10 scenec
ut=40 intra_refresh=0 rc_lookahead=40 rc=abr mbtree=1 bitrate=768 ratetol=1.0 qc
omp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'o5.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.33.101
    Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 640x24
0, q=-1--1, 768 kb/s, 10 fps, 10240 tbn, 10 tbc (default)
    Metadata:
      encoder         : Lavc56.39.101 libx264
    Stream #0:1(und): Audio: aac (libvo_aacenc) ([64][0][0][0] / 0x0040), 44100
Hz, mono, s16, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc56.39.101 libvo_aacenc
Stream mapping:
  Stream #0:0 (h264) -> scale (graph 0)
  Stream #1:0 (h264) -> scale (graph 0)
  overlay (graph 0) -> Stream #0:0 (libx264)
  Stream #1:1 -> #0:1 (aac (native) -> aac (libvo_aacenc))
Press [q] to stop, [?] for help
frame=   27 fps= 26 q=0.0 size=       0kB time=00:00:03.00 bitrate=   0.1kbits/s
frame=   46 fps= 28 q=0.0 size=       0kB time=00:00:04.90 bitrate=   0.1kbits/s
frame=   80 fps= 37 q=16.0 size=     303kB time=00:00:08.29 bitrate= 298.8kbits/
frame=  117 fps= 43 q=15.0 size=     717kB time=00:00:11.99 bitrate= 489.9kbits/
Past duration 0.767570 too large
frame=  137 fps= 39 q=-1.0 Lsize=    1488kB time=00:00:14.94 bitrate= 815.8kbits
/s dup=0 drop=12
video:1247kB audio:234kB subtitle:0kB other streams:0kB global headers:0kB muxin
g overhead: 0.504372%
[libx264 @ 040d20e0] frame I:1     Avg QP:10.60  size: 30577
[libx264 @ 040d20e0] frame P:36    Avg QP:10.78  size: 20384
[libx264 @ 040d20e0] frame B:100   Avg QP:15.41  size:  5116
[libx264 @ 040d20e0] consecutive B-frames:  0.7%  0.0% 17.5% 81.8%
[libx264 @ 040d20e0] mb I  I16..4: 13.2% 49.8% 37.0%
[libx264 @ 040d20e0] mb P  I16..4:  0.8% 10.7%  1.9%  P16..4: 17.7% 30.4% 29.0%
 0.0%  0.0%    skip: 9.6%
[libx264 @ 040d20e0] mb B  I16..4:  0.0%  1.1%  0.0%  B16..8: 27.0% 20.2% 10.1%
 direct:15.3%  skip:26.3%  L0:35.6% L1:31.1% BI:33.3%
[libx264 @ 040d20e0] final ratefactor: 13.38
[libx264 @ 040d20e0] 8x8 transform intra:77.9% inter:38.4%
[libx264 @ 040d20e0] coded y,uvDC,uvAC intra: 96.3% 97.8% 97.5% inter: 46.4% 68.
0% 41.1%
[libx264 @ 040d20e0] i16 v,h,dc,p: 44%  4%  7% 44%
[libx264 @ 040d20e0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 41% 10% 19%  3%  6%  5%  5%
 6%  5%
[libx264 @ 040d20e0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 16%  9%  5% 12% 10%  9%
 7%  6%
[libx264 @ 040d20e0] i8c dc,h,v,p: 66% 12% 12% 10%
[libx264 @ 040d20e0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 040d20e0] ref P L0: 46.3% 12.0% 28.7% 13.0%
[libx264 @ 040d20e0] ref B L0: 81.1% 16.0%  2.9%
[libx264 @ 040d20e0] ref B L1: 93.2%  6.8%
[libx264 @ 040d20e0] kb/s:745.10

Solution

  • You need to add the amerge audio filter to combine both audio streams into one:

    ffmpeg -i input0.mp4 -i input1.mp4 -filter_complex \
    "[0:v]scale=iw/2:-1,setpts=PTS-STARTPTS[left]; \
     [1:v]scale=iw/2:-1,setpts=PTS-STARTPTS[right]; \
     [left][right]hstack[v]; \
     [0:a][1:a]amerge=inputs=2[a]" \
    -map "[v]" -map "[a]" output.mp4
    
    • This will make a stereo output with the audio from input0.mp4 in the left channel, and the audio from input1.mp4 in the right channel. You did not specify how you want the resulting channels arranged, but if you prefer a mono output then add -ac 1 before the output file name.

    • hstack can replace pad + overlay. hstack is simpler and likely faster.
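
The mono variant mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: each input has exactly one video and one audio stream, and both videos have the same height, which hstack requires. The variable name FILTER, the helper name side_by_side_mono, and the file names in the example call are placeholders, not part of the original answer.

```shell
#!/bin/sh
# Sketch: side-by-side video with both audio tracks downmixed to mono.
# Assumptions: each input has one video and one audio stream, and both
# videos share the same height (hstack requires equal heights).

# The filtergraph is kept in a variable so the command stays readable.
# Inside double quotes, backslash-newline joins the lines into one string.
FILTER="[0:v]scale=iw/2:-1,setpts=PTS-STARTPTS[left];\
[1:v]scale=iw/2:-1,setpts=PTS-STARTPTS[right];\
[left][right]hstack[v];\
[0:a][1:a]amerge=inputs=2[a]"

# side_by_side_mono LEFT RIGHT OUTPUT   (hypothetical helper name)
# -ac 1 is placed before the output file name so it applies to the output,
# downmixing the two merged channels to a single mono channel.
side_by_side_mono() {
    ffmpeg -i "$1" -i "$2" -filter_complex "$FILTER" \
        -map "[v]" -map "[a]" -ac 1 "$3"
}

# Example call (placeholder file names):
# side_by_side_mono input0.mp4 input1.mp4 output.mp4
```

Dropping -ac 1 from the helper gives the stereo layout described above, with the first input on the left channel and the second on the right.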