I'm trying to merge an MP4 file and an MP3 file with ffmpeg. The MP4 is 9.800 s long and the MP3 is 58.540 s, so I'm using the -shortest option. Command:
ffmpeg -i video.mp4 -i audio.mp3 -c:v libx264 -c:a aac -strict experimental -shortest output.mp4
The resulting output.mp4 has a duration of 9.846 s. Where is my mistake? Why is the output longer than the source video (9.846 s vs. 9.800 s)?
Source mp4 MediaInfo:
General
Complete name : F:\video test\video.mp4
Format : MPEG-4
Format profile : Base Media
Codec ID : iso5 (iso5/dash)
File size : 3.19 MiB
Duration : 9 s 800 ms
Overall bit rate : 2 732 kb/s
Encoded date : UTC 2017-11-24 20:53:53
Tagged date : UTC 2017-11-24 20:53:53
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : [email protected]
Format settings : CABAC / 4 Ref Frames
Format settings, CABAC : Yes
Format settings, ReFrames : 4 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 9 s 800 ms
Bit rate : 2 729 kb/s
Maximum bit rate : 3 766 kb/s
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 25.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.118
Stream size : 3.19 MiB (100%)
Writing library : x264 core 146
Encoding settings : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=25 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=23.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
Tagged date : UTC 2017-11-24 20:53:53
Source mp3 Mediainfo:
General
Complete name : F:\video test\audio.mp3
Format : MPEG Audio
File size : 1.19 MiB
Duration : 58 s 540 ms
Overall bit rate mode : Variable
Overall bit rate : 170 kb/s
Writing library : LAME3.99r
Audio
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 3
Format settings : Joint stereo / MS Stereo
Duration : 58 s 540 ms
Bit rate mode : Variable
Bit rate : 170 kb/s
Minimum bit rate : 32.0 kb/s
Channel(s) : 2 channels
Sampling rate : 44.1 kHz
Frame rate : 38.281 FPS (1152 SPF)
Compression mode : Lossy
Stream size : 1.19 MiB (100%)
Writing library : LAME3.99r
Encoding settings : -m j -V 2 -q 0 -lowpass 18.5 --vbr-new -b 32
Console output:
ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 7.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
Metadata:
major_brand : iso5
minor_version : 1
compatible_brands: iso5dash
creation_time : 2017-11-24T20:53:53.000000Z
Duration: 00:00:09.80, start: 0.000000, bitrate: 2732 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 2259 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
handler_name : VideoHandler
Input #1, mp3, from 'audio.mp3':
Duration: 00:00:58.54, start: 0.025057, bitrate: 170 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, stereo, s16p, 170 kb/s
Metadata:
encoder : LAME3.99r
Side data:
replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown,
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Stream #1:0 -> #0:1 (mp3 (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 00000000005ab440] using SAR=1/1
[libx264 @ 00000000005ab440] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 00000000005ab440] profile High, level 3.1
[libx264 @ 00000000005ab440] 264 - core 152 r2851 ba24899 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
Metadata:
major_brand : iso5
minor_version : 1
compatible_brands: iso5dash
encoder : Lavf57.83.100
Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc57.107.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s
Metadata:
encoder : Lavc57.107.100 aac
Side data:
replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown,
frame= 54 fps=0.0 q=28.0 size= 0kB time=00:00:00.04 bitrate= 8.3kbits/s speed=0.0927x
frame= 80 fps= 80 q=28.0 size= 0kB time=00:00:01.09 bitrate= 0.4kbits/s speed=1.09x
frame= 98 fps= 65 q=28.0 size= 256kB time=00:00:01.83 bitrate=1143.5kbits/s speed=1.21x
frame= 119 fps= 59 q=28.0 size= 512kB time=00:00:02.67 bitrate=1570.9kbits/s speed=1.32x
frame= 144 fps= 56 q=28.0 size= 768kB time=00:00:03.66 bitrate=1715.0kbits/s speed=1.42x
frame= 167 fps= 52 q=28.0 size= 1024kB time=00:00:04.57 bitrate=1833.9kbits/s speed=1.44x
frame= 190 fps= 51 q=28.0 size= 1280kB time=00:00:05.50 bitrate=1905.5kbits/s speed=1.47x
frame= 218 fps= 51 q=28.0 size= 1792kB time=00:00:06.64 bitrate=2210.6kbits/s speed=1.56x
frame= 242 fps= 50 q=28.0 size= 2048kB time=00:00:07.56 bitrate=2216.4kbits/s speed=1.58x
frame= 245 fps= 41 q=-1.0 Lsize= 3045kB time=00:00:09.82 bitrate=2539.6kbits/s speed=1.65x
video:2880kB audio:156kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.298058%
[libx264 @ 00000000005ab440] frame I:14 Avg QP:20.01 size: 39750
[libx264 @ 00000000005ab440] frame P:106 Avg QP:23.85 size: 14578
[libx264 @ 00000000005ab440] frame B:125 Avg QP:24.63 size: 6770
[libx264 @ 00000000005ab440] consecutive B-frames: 22.9% 22.0% 15.9% 39.2%
[libx264 @ 00000000005ab440] mb I I16..4: 16.7% 80.3% 3.0%
[libx264 @ 00000000005ab440] mb P I16..4: 10.2% 36.2% 1.1% P16..4: 25.0% 7.9% 2.5% 0.0% 0.0% skip:17.1%
[libx264 @ 00000000005ab440] mb B I16..4: 2.3% 5.8% 0.2% B16..8: 31.4% 6.5% 0.9% direct: 3.7% skip:49.2% L0:51.8% L1:44.5% BI: 3.7%
[libx264 @ 00000000005ab440] 8x8 transform intra:76.1% inter:86.3%
[libx264 @ 00000000005ab440] coded y,uvDC,uvAC intra: 38.3% 52.1% 9.0% inter: 12.3% 20.1% 0.2%
[libx264 @ 00000000005ab440] i16 v,h,dc,p: 30% 28% 9% 33%
[libx264 @ 00000000005ab440] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 36% 23% 19% 3% 3% 4% 4% 4% 4%
[libx264 @ 00000000005ab440] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 33% 21% 14% 5% 7% 7% 6% 5% 3%
[libx264 @ 00000000005ab440] i8c dc,h,v,p: 45% 24% 25% 6%
[libx264 @ 00000000005ab440] Weighted P-Frames: Y:13.2% UV:6.6%
[libx264 @ 00000000005ab440] ref P L0: 71.7% 12.5% 12.9% 2.7% 0.2%
[libx264 @ 00000000005ab440] ref B L0: 92.8% 6.3% 0.9%
[libx264 @ 00000000005ab440] ref B L1: 98.3% 1.7%
[libx264 @ 00000000005ab440] kb/s:2406.56
[aac @ 00000000005adde0] Qavg: 511.420
The ffprobe -show_packets output is too big to post here, so I uploaded it to pastebin: https://pastebin.com/TYSMdceS
The quick answer is that FFmpeg's AAC encoder writes an extra priming packet at the start of the audio stream, beginning at about -0.0232 s (one 1024-sample frame at 44.1 kHz). That adds to your duration.
I will try to give a more detailed answer later if that would help.
You can check this yourself with ffprobe -show_packets output.mp4.
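If the packet dump is too long to read by eye, it can be summarized per stream. The following is a minimal sketch, assuming ffprobe's default text output (`[PACKET]` ... `[/PACKET]` blocks of `key=value` lines); the function name `summarize_packets` and the file name `packets.txt` are hypothetical, chosen only for illustration.

```python
def summarize_packets(ffprobe_text):
    """Return {codec_type: (first_pts_time, last_end_time)} in seconds.

    Expects the default text output of `ffprobe -show_packets`.
    """
    spans = {}
    current = None  # fields of the packet currently being read
    for raw in ffprobe_text.splitlines():
        line = raw.strip()
        if line == "[PACKET]":
            current = {}
        elif line == "[/PACKET]" and current is not None:
            try:
                pts = float(current["pts_time"])
                dur = float(current.get("duration_time", "0"))
            except (KeyError, ValueError):
                current = None
                continue  # skip packets without usable timestamps (e.g. "N/A")
            kind = current.get("codec_type", "unknown")
            first, end = spans.get(kind, (pts, pts + dur))
            spans[kind] = (min(first, pts), max(end, pts + dur))
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return spans


if __name__ == "__main__":
    # Hypothetical usage: ffprobe -show_packets output.mp4 > packets.txt
    with open("packets.txt", encoding="utf-8") as fh:
        for kind, (first, end) in summarize_packets(fh.read()).items():
            print(f"{kind}: starts {first:.6f} s, ends {end:.6f} s")
```

A stream whose first pts_time is negative, or whose end time exceeds the other stream's, is the one stretching the container duration.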
I looked into the packet dump you shared. Your video packets look like this:
dts: -0.08 | pts: 0.0
dts: -0.04 | pts: 0.12
dts: 0.0 | pts: 0.04
dts: 0.04 | pts: 0.08
dts: 0.08 | pts: 0.24
...
dts: 9.64 | pts: 9.76
dts: 9.68 | pts: 9.72
The pts values jump back and forth probably because you have B-frames, for example in an I B B P order: packets are stored in decode order (dts) but displayed in presentation order (pts).
Your video stream is 25 fps, so one frame lasts 0.04 s.
That makes your video 9.76 + 0.04 (frame duration) = 9.8 s.
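The same arithmetic can be written out as a short sketch. It assumes a constant frame rate and uses only the pts values quoted above; `video_duration` is a hypothetical helper name, not an FFmpeg function.

```python
from fractions import Fraction


def video_duration(pts_seconds, fps):
    """Span from the earliest pts to the end of the frame with the latest pts."""
    frame_duration = Fraction(1, fps)          # 1/25 s = 0.04 s at 25 fps
    pts = [Fraction(p) for p in pts_seconds]   # exact decimal arithmetic
    return max(pts) + frame_duration - min(pts)


# pts values quoted from the packet dump (decode order, so not sorted)
pts_from_dump = ["0.0", "0.12", "0.04", "0.08", "0.24", "9.76", "9.72"]

print(float(video_duration(pts_from_dump, 25)))  # 9.8
```

This agrees with the encoder log, which reports 245 frames: 245 / 25 = 9.8 s.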
Your original audio is longer than the video, so it is truncated such that the last audio packet ends at 9.80 s or later.
Your audio packets look like
pts: -0.023220 (AAC priming data)
pts: 0.0
pts: 0.023220
...
pts: 9.775601 | duration: 0.023220
pts: 9.798821 | duration: 0.023175
Your last audio packet has to end at 9.80 s or later. That's why the packet starting at 9.798821 s is accepted.
So the duration of the audio muxed into the AV stream is
0.02322 (priming packet) + 9.798821 + 0.023175 (duration) = 9.845216 s
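As a rough check of these numbers, here is a sketch that assumes AAC frames of 1024 samples at 44100 Hz (consistent with skip_samples=1024 and the 0.023220 s packet duration in the dump). Counting whole frames needed to cover 9.8 s is my simplified model of the truncation described above, not FFmpeg's actual code path.

```python
import math
from fractions import Fraction

SAMPLE_RATE = 44100       # Hz, from the ffmpeg / MediaInfo output
AAC_FRAME_SAMPLES = 1024  # assumed samples per AAC frame

frame = Fraction(AAC_FRAME_SAMPLES, SAMPLE_RATE)  # about 0.023220 s


def frames_to_cover(target_seconds):
    """Number of whole audio frames needed to reach target_seconds."""
    return math.ceil(Fraction(target_seconds) / frame)


n = frames_to_cover("9.8")   # frames with pts >= 0 needed to cover the video
last_pts = (n - 1) * frame   # pts of the last of those frames

# Sum from the answer, using the values reported in the packet dump
total = Fraction("0.02322") + Fraction("9.798821") + Fraction("0.023175")

print(n, round(float(last_pts), 6), float(total))
```

Under these assumptions 423 frames are needed, the last one starting at about 9.798821 s, which matches the final audio packet in the dump.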
I am not sure where the extra 0.001 s comes from; someone else may be able to comment. I do see skip-samples side data at the beginning:
[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
discard_padding=0
skip_reason=0
discard_reason=0
[/SIDE_DATA]
I hope this helps.