
Why does each audio part get louder in FFmpeg when I mix them into one track?


I am trying to dub an audio track. I have the original audio track and I want to put translated audio parts on top of it.

translated audio 100% vol: --p1--- ---p2-- -----p3--- --p4--

original audio 5% vol: -----------------------------------------

Here is my FFmpeg command with filter_complex

ffmpeg -i video_wpmXlZF4XiE.opus -i 989-audio.mp3 -i 989-audio.mp3 -i 989-audio.mp3 -i 989-audio.mp3 \
-filter_complex "\
[0:a]loudnorm=I=-14:TP=-2:LRA=7, volume=0.05[original]; \
[1:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=5000|5000, volume=1.0[sent1]; \
[2:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=10000|10000, volume=1.0[sent2]; \
[3:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=20000|20000, volume=1.0[sent3]; \
[4:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=30000|30000, volume=1.0[sent4]; \
[original][sent1][sent2][sent3][sent4]amix=inputs=5:duration=longest[out]" \
-map "[out]" output.mp3

The clips I put on top of the original audio track are all the same file (-i 989-audio.mp3). I did this on purpose to show the problem. Here are the audio levels of the final generated track:

[image: audio levels of the generated track]

As you can see, the first and second parts differ only slightly, but the third and fourth have a much higher volume level (note that the audio is the same in all four). Why does this happen, and how can I work around this odd behaviour?


Solution

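  • The amix filter does not, by default, sum its inputs directly. It scales their volume according to the number of inputs still active at that instant, as per the scheme described in this answer. In your command each input stays active until its file ends, so as the earlier parts finish, fewer inputs remain and the later parts are attenuated less, which is why they come out louder. You can avoid this adjustment by adding the option normalize=0 to amix.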
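
As a sketch of that fix, here is the command from the question with normalize=0 appended to the amix options and nothing else changed. The filter graph is held in a shell variable named filter (a name chosen here only for readability), and the trailing error message is likewise an addition for illustration. This assumes an FFmpeg build recent enough to have the normalize option (4.4 or newer, as far as I know).

```shell
# Same filter graph as in the question; only the amix line differs
# (normalize=0 added so the inputs are summed without per-input rescaling).
filter="\
[0:a]loudnorm=I=-14:TP=-2:LRA=7, volume=0.05[original]; \
[1:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=5000|5000, volume=1.0[sent1]; \
[2:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=10000|10000, volume=1.0[sent2]; \
[3:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=20000|20000, volume=1.0[sent3]; \
[4:a]loudnorm=I=-14:TP=-2:LRA=7, adelay=30000|30000, volume=1.0[sent4]; \
[original][sent1][sent2][sent3][sent4]amix=inputs=5:duration=longest:normalize=0[out]"

# Input file names are the ones from the question.
ffmpeg -i video_wpmXlZF4XiE.opus -i 989-audio.mp3 -i 989-audio.mp3 -i 989-audio.mp3 -i 989-audio.mp3 \
  -filter_complex "$filter" \
  -map "[out]" output.mp3 \
  || echo "ffmpeg failed (missing inputs, or a build without amix normalize?)" >&2
```

With normalization off, the output is a plain sum of the inputs, so overlapping loud parts could clip. Here the original track is already reduced to 5% and the translated parts do not overlap each other, so the headroom left by TP=-2 is probably enough, but it is worth checking the result by ear.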