I am using the FMOD library to apply FFT to an audio stream, providing me with a constantly updating fixed number of frequency bins. Each bin represents an equal frequency range, with a value between 0 and 1 to represent the intensity of this range from the processed audio. FMOD documentation states that these values can be represented in decibels, where val is the value between 0 and 1:
Decibels = 10.0f * (float)log10(val) * 2.0f
I am attempting to make an automated strobe-like beat detecting visualisation. So far, I test at a constant interval to see whether a particular frequency bin's intensity value surpasses a specified boundary - if this is the case, the strobe flashes. Although a pretty crude way of doing this, it works fairly effectively for my requirements.
However, this specified boundary only works effectively when the system/music player's volumes are maximum. When I reduce either volume, the strobe sensitivity is reduced and becomes either very inaccurate or stops flashing completely. I assume that I need to normalise the data in some way so analysis is performed independent of volume, though by scaling the data by 1/value of largest bin
the largest value is always maxed out. This surpasses the specified boundary permanently, causing the strobe to flash indefinitely. I can't think how else this can be achieved and have been on a mental block for days - any help or a point in the right direction would be greatly appreciated!
Normalise over a a longer scale. You'll need something like an envelope follower with a long release time.
If you search for 'compressor' source code, or automatic gain control something will definitely turn up.
But broadly in pseudo C++, and working on your incoming audio (the time-domain signal before the FFT):
auto instant_level = std::abs(signal);
peak_level *= 0.99f;
peak_level = peak_level > instant_level ? peak_level : instant_level;
Now peak_level decays slowly over time. And you can use this to calculate a gain factor to normalize your incoming audio. Adjust the 0.99f as required for a sensible decay time and for the correct sample rate.
There's also a Signal Processing stack exchange site where you'll get quicker answers to these kinds of questions (although occasionally with an almost incomprehensible piece of algebra attached :) )