I'm analyzing audio signals on Android. I first tried this with the microphone and it worked. Now I'm trying to apply an FFT to MP3 data that comes from the onWaveFormDataCapture method of Visualizer.OnDataCaptureListener, which is linked to a MediaPlayer. The method receives a byte array, byte[] waveform, and when I apply the FFT to this data I get what looks like spectral leakage or overlap.
public void onWaveFormDataCapture(Visualizer visualizer, byte[] waveform, int samplingRate)
I tried to convert the data to the -1..1 range using the code below in a for loop:
// waveform varies in range of -128..+127
raw[i] = (double) waveform[i];
// change it to range -1..1
raw[i] /= 128.0;
Then I copy raw into the FFT buffer:
fftre[i] = raw[i];
fftim[i] = 0;
Then I call the FFT function:
fft.fft(fftre, fftim); // in: audio signal, out: fft data
As a final step, I convert the results into magnitudes in dB and draw the frequencies on screen:
// Ignore the first fft data which is DC component
for (i = 1, j = 0; i < waveform.length / 2; i++, j++)
{
magnitude = (fftre[i] * fftre[i] + fftim[i] * fftim[i]);
magnitudes[j] = 20.0 * Math.log10(Math.sqrt(magnitude) + 1e-5); // [dB]
}
When I play a sweep signal from 20 Hz to 20 kHz, I don't see what I see with the microphone. Instead of a single moving line, it draws several symmetric lines that move apart or come together. There is also a weaker mirrored signal at the other end of the visualizer. The same code, dividing by 32768 instead of 128, works very well on microphone input from AudioRecord.
What am I doing wrong? (And yes, I know there is a direct FFT output.)
The input audio is 8-bit unsigned mono. The line raw[i] = (double) waveform[i] causes an unintentional unsigned-to-signed conversion, because a Java byte is signed. Since the samples are biased to a DC level of approximately 128, a small sine wave is turned into a high-amplitude, square-like wave as the signal crosses the 127/-128 boundary. That produces a bunch of spurious harmonics, which are the "symmetric lines coming and going" you described.
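To make the wrap-around concrete, here is a minimal sketch in plain Java (no Android classes). The class and method names are hypothetical and only for illustration. It reads the same byte once as a signed value, which is what the plain cast does, and once as an unsigned value.

```java
final class SignedWrapDemo {
    // Reads the byte the way a plain cast does in Java: signed, -128..127.
    static int asSigned(byte b) {
        return b;
    }

    // Reads the same bits as an unsigned sample, 0..255.
    static int asUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // Sample levels of a small signal sitting on the ~128 DC bias.
        for (int level = 125; level <= 131; level++) {
            byte b = (byte) level;
            System.out.println(level + " -> signed " + asSigned(b)
                    + ", unsigned " + asUnsigned(b));
        }
    }
}
```

Levels up to 127 read the same either way, but 128 and above come out as -128, -127, and so on when read as signed, so a signal only a few steps wide jumps across almost the full range.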
Solution
Change the cast to (double) (waveform[i] & 0xFF) so that the converted value lies in the range 0..255 instead of -128..127.
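As a rough sketch of how the fix could fit into the conversion loop, the helper below masks each byte and then scales it. The class name WaveformNormalizer and the method toNormalized are hypothetical. Subtracting 128 before dividing is an assumption added here to land in roughly -1..1; with the mask alone the values fall in 0..2, and the leftover offset only affects the DC bin, which the question's loop already skips.

```java
final class WaveformNormalizer {
    // Converts 8-bit unsigned PCM samples to doubles in about -1..1.
    static double[] toNormalized(byte[] waveform) {
        double[] raw = new double[waveform.length];
        for (int i = 0; i < waveform.length; i++) {
            // Mask to undo Java's sign extension: 0..255.
            int unsigned = waveform[i] & 0xFF;
            // Assumption: remove the 128 bias, then scale by 128.
            raw[i] = (unsigned - 128) / 128.0;
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0x80, (byte) 0x00, (byte) 0xFF};
        for (double v : toNormalized(sample)) {
            System.out.println(v);
        }
    }
}
```

The returned array can be copied into fftre as before, with fftim set to zero.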