I've encountered an issue while attempting to convert audio downloaded from a YouTube video into a mono .wav file using ffmpeg. Despite using typical conversion commands (below), the resulting audio is either silent or corrupted, which suggests the stereo channels are canceling each other out.
I tried to load the audio into Python and calculate phase and cross-correlation, but this step uses a lot of memory, especially for a 6-hour-long audio.
I'm seeking a method, preferably using ffmpeg, to detect if stereo channels are canceling each other out due to phase misalignment. Additionally, I aim to automate the process of phase inversion before converting to mono to ensure the integrity of the resulting audio.
How can I automatically detect if stereo channels are canceling each other and subsequently invert the phase to ensure successful conversion to mono using ffmpeg? Any insights, solutions, or alternative approaches would be greatly appreciated.
The audio was downloaded with yt-dlp and converted with ffmpeg, resulting in silent or corrupted output.
yt-dlp -f bestaudio https://www.youtube.com/watch?v=s3QB_rJzH08 -o input.webm
With this command, the resulting file is silent or corrupted:
ffmpeg -i input.webm -ac 1 -ar 16000 -c:a pcm_s16le output.wav
With the second channel inverted before the downmix, the resulting file has audio:
ffmpeg -i input.webm -af "aeval=val(0)|-val(1)" -ac 1 -ar 16000 -c:a pcm_s16le output.wav
FFmpeg has a phase difference detection filter, aphasemeter:
ffmpeg -i input.webm -af "aphasemeter=video=0:phasing=1:angle=90:duration=1:tolerance=0.001,ametadata=print:file=aphase.log" -vn -f null -
The file aphase.log will contain a per-frame log of phase difference in the format
frame:17302 pts:19931904 pts_time:451.971
lavfi.aphasemeter.phase=0.224391
The phase value can range from -1 to +1. A value of -1 indicates the channels are completely out of phase, and +1 means the channels are in phase.
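To see why a reading near -1 leads to a silent mono file, here is a small Python sketch. It uses a plain normalized correlation as a stand-in for the phase reading (an assumption for illustration, not necessarily the exact formula aphasemeter uses) and a simple averaging downmix. The function names are hypothetical.

```python
import math

def correlation(left, right):
    """Normalized correlation of two equal-length sample lists, in -1..+1."""
    dot = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return dot / energy if energy else 0.0

def downmix(left, right):
    """Plain mono downmix: average the two channels sample by sample."""
    return [(l + r) / 2 for l, r in zip(left, right)]

# A short 440 Hz tone sampled at 16 kHz, and a polarity-inverted copy of it.
left = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
inverted = [-s for s in left]

# Identical channels correlate at about +1, inverted channels at about -1.
print(correlation(left, left))
print(correlation(left, inverted))

# Averaging a channel with its inverted copy cancels every sample.
print(max(abs(s) for s in downmix(left, inverted)))

# Flipping the second channel back before the downmix restores the signal.
restored = downmix(left, [-s for s in inverted])
print(max(abs(s) for s in restored))
```

This is the same effect the aeval=val(0)|-val(1) filter works around: it negates the second channel so the downmix adds the channels instead of subtracting them.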
If the channels are out of phase, ffmpeg will also print a summary reading in the console log.
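To automate the decision, one option is to stream through aphase.log and average the phase readings, then pick the conversion command accordingly. The sketch below is a minimal example under some assumptions: the function names are hypothetical, the log lines look like the sample above, and a mean phase below 0 is treated as "out of phase" (that threshold is a guess and may need tuning for your material). It reads the log line by line, so memory use stays small even for a 6-hour file.

```python
import re

# Matches only the per-frame phase key, not other aphasemeter metadata keys.
PHASE_RE = re.compile(r"lavfi\.aphasemeter\.phase=(-?\d+(?:\.\d+)?)")

def read_phases(lines):
    """Yield phase values from an iterable of log lines (e.g. an open file)."""
    for line in lines:
        match = PHASE_RE.search(line)
        if match:
            yield float(match.group(1))

def mean_phase(lines):
    """Running mean of the phase values; None if the log has no readings."""
    total, count = 0.0, 0
    for value in read_phases(lines):
        total += value
        count += 1
    return total / count if count else None

def build_command(src, dst, invert):
    """Build the ffmpeg argument list, inverting channel 1 when requested."""
    cmd = ["ffmpeg", "-i", src]
    if invert:
        cmd += ["-af", "aeval=val(0)|-val(1)"]
    cmd += ["-ac", "1", "-ar", "16000", "-c:a", "pcm_s16le", dst]
    return cmd

# Sample lines taken from the log format shown above.
sample = [
    "frame:17302 pts:19931904 pts_time:451.971",
    "lavfi.aphasemeter.phase=0.224391",
]
phase = mean_phase(sample)
invert = phase is not None and phase < 0  # assumed threshold
print(" ".join(build_command("input.webm", "output.wav", invert)))
```

For a real run, pass an open file instead of the sample list, for example mean_phase(open("aphase.log")), and hand the resulting list to subprocess.run. A single average can hide sections that are only partly out of phase, so checking the share of frames below the threshold may be more robust.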