I am using the following command to mute parts of the audio file with ffmpeg:
ffmpeg.exe -y -i "C:\temp\inputfile3.wav" -filter_complex_script "C:\temp\filter_complex_cmds.txt" "C:\temp\outputfile3.wav"
Inside filter_complex_cmds.txt I have:
[0]aformat=sample_fmts=fltp:sample_rates=44100:channel_layouts=stereo,volume='if(between(t,0.95,1.05),0*(t-0.95) + 0,1)':eval=frame,volume='if(between(t,1.15,1.25),0*(t-1.15) + 0,1)':eval=frame,volume='if(between(t,1.41,1.49),0*(t-1.41) + 0,1)':eval=frame,volume='if(between(t,2.10,2.35),0*(t-2.10) + 0,1)':eval=frame,volume='if(between(t,2.75,2.85),0*(t-2.75) + 0,1)':eval=frame, etc. x1000...
You get the idea: many, many calls to reduce the volume at specific times. However, when I look at outputfile3.wav, the volume is not reduced from 0.950000 to 1.050000 seconds, but from 0.952018 to 1.068118 seconds (i.e. off by 2 and 18 milliseconds respectively), and not from 1.150000 to 1.250000 seconds but from 1.160998 to 1.253878 seconds, etc. It can sometimes be off by as many as 20 milliseconds.
Can anyone tell me what's going on and what to do to make it precise?
Also, although I can reserve it for a separate question, I would like to find fade-out/fade-in commands that make the change to and from silence a bit smoother by ramping the volume down and up over the course of 20 milliseconds around my "volume 0" times.
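Audio sample data is framed, and the volume filter operates on whole frames: with eval=frame the expression is evaluated once per frame, and the result is applied to every sample in that frame. The frame size depends on the source. For LPCM audio, which is what a WAV typically contains, the frame size is 1024 samples. Since your sample rate is 44100 after aformat, frame X's timestamp will be X * 1024/44100 seconds, so the volume can only change at multiples of roughly 23.2 milliseconds.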
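To make the arithmetic concrete, here is a minimal Python sketch (not part of ffmpeg) that predicts which stretch of audio ends up muted. It assumes that t is the frame's start timestamp and that a frame is muted as a whole whenever between(t, start, end) is true for that timestamp. The function name muted_span is made up for illustration.

```python
import math
from fractions import Fraction

def muted_span(start, end, frame_size=1024, sample_rate=44100):
    """Predict the (first, last) muted time in seconds for one mute window.

    Assumption: the volume expression is evaluated once per frame at the
    frame's start timestamp, and the result applies to the whole frame.
    Returns None if no frame start falls inside [start, end].
    """
    frame_dur = Fraction(frame_size, sample_rate)
    # Parse via str() so 0.95 becomes exactly 95/100 rather than a binary float.
    s, e = Fraction(str(start)), Fraction(str(end))
    first = math.ceil(s / frame_dur)   # first frame whose timestamp is >= start
    last = math.floor(e / frame_dur)   # last frame whose timestamp is <= end
    if first > last:
        return None
    # Muting begins at the first frame's start and lasts to the last frame's end.
    return float(first * frame_dur), float((last + 1) * frame_dur)

for window in [(0.95, 1.05), (1.15, 1.25)]:
    lo, hi = muted_span(*window)
    print(f"{window}: muted from {lo:.6f} to {hi:.6f}")
```

Under those assumptions the two windows come out as 0.952018 to 1.068118 and 1.160998 to 1.253878, which are the values measured in the question.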
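You can reframe the data to get more precise targeting by using asetnsamples, i.e. aformat=...,asetnsamples=44,volume=... This makes each frame 44 samples long, or just under 1 millisecond. Since sample rate/1000 is not an integer (44100/1000 = 44.1), a small imprecision of up to about 1 millisecond will remain.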
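As a rough check of how much imprecision is left, the sketch below compares the boundary errors for 1024-sample and 44-sample frames on the first mute window from the question. It uses the same assumption as before (the expression is evaluated at each frame's start timestamp and applied to the whole frame); boundary_errors_ms is a hypothetical helper name, not an ffmpeg facility.

```python
import math
from fractions import Fraction

SAMPLE_RATE = 44100

def boundary_errors_ms(start, end, frame_size, sample_rate=SAMPLE_RATE):
    """Return (start_error, end_error) in milliseconds for one mute window.

    Assumption: a frame is muted as a whole when its start timestamp lies
    in [start, end], so both edges snap to frame boundaries.
    """
    frame_dur = Fraction(frame_size, sample_rate)
    s, e = Fraction(str(start)), Fraction(str(end))
    first = math.ceil(s / frame_dur)       # first muted frame
    last = math.floor(e / frame_dur)       # last muted frame
    start_err = first * frame_dur - s      # muting begins this much late
    end_err = (last + 1) * frame_dur - e   # muting ends this much late
    return float(start_err * 1000), float(end_err * 1000)

for n in (1024, 44):
    s_err, e_err = boundary_errors_ms(0.95, 1.05, n)
    print(f"frame size {n}: start off by {s_err:.3f} ms, end off by {e_err:.3f} ms")
```

For 1024-sample frames this gives about 2.018 ms and 18.118 ms, the offsets reported in the question; for 44-sample frames both errors drop below 1 ms (about 0.839 ms and 0.612 ms).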