Search code examples
audioffmpegsignal-processing

Is there a way to (automatically) detect if the channels of a stereo video/audio are out of phase and canceling each other?


Background

I've encountered an issue while attempting to convert an audio downloaded from a youtube video into a mono .wav format using ffmpeg. Despite using typical conversion commands (code below), the resulting audio is either silent or corrupted, indicating potential stereo channel cancellation. I tried to load the audio into Python and calculate phase and cross-correlation, but this step uses a lot of memory, especially for a 6-hour-long audio.

Goal

I'm seeking a method, preferably using ffmpeg, to detect if stereo channels are canceling each other out due to phase misalignment. Additionally, I aim to automate the process of phase inversion before converting to mono to ensure the integrity of the resulting audio.

Question:

How can I automatically detect if stereo channels are canceling each other and subsequently invert the phase to ensure successful conversion to mono using ffmpeg? Any insights, solutions, or alternative approaches would be greatly appreciated.

Steps Taken:

  1. Downloaded audio from YouTube using yt-dlp.
  2. Attempted audio conversion to mono using ffmpeg, resulting in silent or corrupted output.
  3. Successfully converted audio to mono by manually inverting the phase of one channel prior to conversion.

Download audio from youtube

yt-dlp -f bestaudio https://www.youtube.com/watch?v=s3QB_rJzH08 -o input.webm

Audio conversion

The resulting file is silent or corrupted

ffmpeg -i input.webm  -ac 1 -ar 16000 -c:a pcm_s16le output.wav

Audio conversion to mono .wav with phase inversion

The resulting file has audio

ffmpeg -i input.webm -af "aeval=val(0)|-val(1)" -ac 1 -ar 16000 -c:a pcm_s16le output.wav

Solution

  • FFmpeg has a phase difference detection filter.

    ffmpeg -i input.webm -af "aphasemeter=video=0:phasing=1:angle=90:duration=1:tolerance=0.001,ametadata=print:file=aphase.log" -vn -f null -

    The file aphase.log will contain a per-frame log of phase difference in the format

    frame:17302 pts:19931904 pts_time:451.971
    lavfi.aphasemeter.phase=0.224391
    

    The phase value can range from -1 to +1. -1 indicates the channels are completely out of phase and 1 means channels are in phase.

    If the channels are out of phase, ffmpeg will also print a summary reading in the console log.