Tags: bash, audio, ffmpeg, signal-processing, sox

Sync two audio files


I have 2 audio files:

  • correct.wav (duration 3:07)
  • incorrect.wav (duration 3:10)

[Image: 2 audio files]

They are almost the same, but were generated with different sound fonts.

The problem: the second file lags behind the first by a few seconds.

How can I sync the second file with the first one? Maybe there is some command-line tool that could detect where the first loud sound appears in each file, compare correct.wav with incorrect.wav, and trim the end of incorrect.wav to match.

I know I can do it manually, but I need an automated solution for a lot of files.

Here are the approximate solutions I found:

1) Use this Python script to detect the sync offset: https://github.com/jeorgen/align-videos-by-sound. It is not perfect, though; it does not detect the offset in 100% of cases.

2) Use sox for cutting, trimming, comparing, and detecting sound durations (code excerpt):

# Read each file's duration in seconds from sox's stat output (printed on stderr).
length1=$(sox correct.wav -n stat 2>&1 | sed -n 's#^Length (seconds):[^0-9]*\([0-9.]*\)$#\1#p')
length2=$(sox incorrect.wav -n stat 2>&1 | sed -n 's#^Length (seconds):[^0-9]*\([0-9.]*\)$#\1#p')
if [[ "$length1" == "$length2" ]]; then
    echo "Everything OK: $length1 = $length2"
else
    echo "Fatal error: Not the same final files"
fi

diff=$(echo "$length2 - $length1" | bc -l)
echo "difference = $diff"
echo "webm $length1 not greater than fluid2 $length2"
# Note: this pads the end of correct.wav and writes the result over incorrect.wav.
sox correct.wav incorrect.wav pad 0 "$diff"

Comment on UltrasoundJelly's answer: here is the result I get with your code:

[Image: Result]

Here is the result I need:

[Image: Need result]


Solution

  • Here's one solution:

    • Use ffmpeg to find the leading silence in each file
    • If the new file has a longer leading silence, trim the difference with sox
    • If the new file has a shorter leading silence, pad the start with sox
    • Trim the new file to the same length as the original with sox
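As a small worked example of the pad-or-trim decision in the steps above, the sketch below takes the two leading-silence durations (in seconds) and prints which action applies to the new file and by how much. The function name `sync_action` and the sample values are hypothetical and only illustrate the arithmetic; measuring the silence itself is left to ffmpeg.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide how to shift the new file so that its leading
# silence matches the reference file.
#   $1 = leading silence of the reference file, in seconds
#   $2 = leading silence of the new file, in seconds
sync_action() {
  awk -v ref="$1" -v new="$2" 'BEGIN {
    d = ref - new
    if (d > 0)      printf "pad %.3f\n", d     # new file starts too early
    else if (d < 0) printf "trim %.3f\n", -d   # new file starts too late
    else            print "none 0.000"         # already aligned
  }'
}

sync_action 0.5 2.0   # new file has 1.5 s more leading silence
sync_action 2.0 0.5   # new file has 1.5 s less leading silence
```

The printed amount is the number of seconds to pad onto, or trim from, the start of the new file.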

    Bash Script:

    #!/usr/bin/env bash
    # Usage: pass the reference file first, then the file to fix.
    FILEONE=$1   # reference file
    FILETWO=$2   # file to be shifted and trimmed
    MINSILENCE=0.1
    THRESH="-50dB"

    # Duration of the first silence ffmpeg reports; assumed to be the leading silence.
    first_silence() {
      ffmpeg -i "$1" -af "silencedetect=noise=$THRESH:d=$MINSILENCE" -f null - 2>&1 |
        grep -m 1 silence_duration | awk '{print $NF}'
    }
    S1=$(first_silence "$FILEONE")
    S2=$(first_silence "$FILETWO")
    if [ -z "$S1" ]; then echo "no starting silence found in $FILEONE"; exit 1; fi
    if [ -z "$S2" ]; then echo "no starting silence found in $FILETWO"; exit 1; fi

    DIFF=$(echo "$S1-$S2" | bc -l)
    ISPOS=$(echo "$DIFF>0" | bc -l)   # 1 if the reference has the longer leading silence
    DIFF=${DIFF#-}                    # absolute value of the difference
    BASE="${FILETWO%.*}"
    if [ "$ISPOS" -eq 1 ]
    then
      echo "$1>$2 ... padding $2"
      # Generate silence with the same sample rate and channel count as the new file.
      SAMPRATE=$(sox --i -r "$FILETWO")
      CHANNELS=$(sox --i -c "$FILETWO")
      sox -n -r "$SAMPRATE" -c "$CHANNELS" silence.wav trim 0.0 "$DIFF"
      sox silence.wav "$FILETWO" "$BASE.shift.wav"
      rm silence.wav
      SHIFTED="$BASE.shift.wav"
    else
      echo "$1<$2 ... trimming $2"
      sox "$FILETWO" "$BASE.trim.wav" trim "$DIFF"
      SHIFTED="$BASE.trim.wav"
    fi

    # Compare durations (in seconds) and cut any excess from the end of the shifted file.
    length1=$(sox --i -D "$FILEONE")
    length2=$(sox --i -D "$SHIFTED")

    if (( $(echo "$length2 > $length1" | bc -l) )); then
        diff=$(echo "$length2 - $length1" | bc -l)
        echo "difference = $diff"
        sox "$SHIFTED" finished.wav trim 0 "-$diff"
    fi
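Since the question asks for an automated solution for many files, here is a minimal sketch of a batch driver. It rests on several assumptions that are not in the original answer: the script above is saved as `./sync.sh`, the reference files and the files to fix live in two directories under matching names, and the function name `sync_all` and the `.synced.wav` suffix are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical batch driver: run the sync script on every pair of files
# that share a name in the two directories.
#   $1 = directory with the reference files
#   $2 = directory with the files to fix
#   $3 = command to run per pair (assumed default: ./sync.sh)
sync_all() {
  local ref_dir=$1 new_dir=$2 cmd=${3:-./sync.sh}
  local ref new name
  for ref in "$ref_dir"/*.wav; do
    [ -e "$ref" ] || continue          # the glob matched nothing
    name=$(basename "$ref")
    new="$new_dir/$name"
    if [ ! -f "$new" ]; then
      echo "skipping $name: no counterpart in $new_dir" >&2
      continue
    fi
    if ! "$cmd" "$ref" "$new"; then
      echo "sync failed for $name" >&2
      continue
    fi
    # The script writes its final result to the fixed name finished.wav,
    # so move it aside before the next pair overwrites it.
    if [ -f finished.wav ]; then
      mv finished.wav "$new_dir/${name%.*}.synced.wav"
    fi
  done
}

# Example call (directory names are placeholders):
#   sync_all correct_dir incorrect_dir
```

Pairs for which the script fails, for example because no leading silence was detected, are reported on stderr and skipped, so one bad file does not stop the whole run.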