I am currently attempting to do something very similar to (or maybe the same as) the following question:
Getting variable frequency ranges with androids visualizer class
However, the selected answer has a few bugs, and since I'm not a DSP/audio expert at all, I'm learning as I go.
My goal is to break an FFT that I'm getting from the Android Visualizer class into frequency bands. Specifically, the bands delimited by the targetEndpoints values in the code below.
I have the following code, at the top of my class:
private val targetEndpoints = listOf(0f, 400f, 900f, 1500f, 2300f, 3400f, 5000f, 7300f, 12000f)
private const val CAPTURE_SIZE = 1024
And then, in the method where I'm trying to get the frequency bands for the current track in MediaPlayer:
val mp = mediaPlayer!!
val audioSessionId = mp.getAudioSessionId()
val visualizer: Visualizer = Visualizer(audioSessionId)
val captureSizeRange = Visualizer.getCaptureSizeRange().let { it[0]..it[1] }
val captureSize = CAPTURE_SIZE.coerceIn(captureSizeRange)
val captureRate: Int = Visualizer.getMaxCaptureRate()
val isWaveFormRequested: Boolean = false
val isFFTRequested: Boolean = true
val frequencyOrdinalRanges: List<IntProgression> =
    targetEndpoints.zipWithNext { a, b ->
        val startOrdinal = 1 + (captureSize * a / samplingRate).toInt()
        val endOrdinal = (captureSize * b / samplingRate).toInt()
        startOrdinal downTo endOrdinal
    }
Now this is the point where things are getting a little murky for me because, like I said, I am no Audio expert.
frequencyOrdinalRanges is a List of IntProgressions, each of which goes 1 -> 0.
For the audio file that I'm using:
captureSize = 1024
samplingRate = 44100000
With those numbers and my frequency bands, startOrdinal is pretty much guaranteed to always be 1 and endOrdinal to always be 0, so my frequencyOrdinalRanges looks like this:
[1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1]
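The arithmetic behind that list can be checked in isolation. Below is a minimal sketch, assuming the captureSize and samplingRate values quoted above; ordinalBounds is a hypothetical helper written only for this illustration and is not part of my class.

```kotlin
// Hypothetical helper: applies the same two formulas as above to a single band.
// samplingRate is passed exactly as received, i.e. 44100000 for this track.
fun ordinalBounds(captureSize: Int, low: Float, high: Float, samplingRate: Int): Pair<Int, Int> {
    // captureSize * low is tiny compared to samplingRate, so toInt() truncates to 0.
    val startOrdinal = 1 + (captureSize * low / samplingRate).toInt()
    val endOrdinal = (captureSize * high / samplingRate).toInt()
    return startOrdinal to endOrdinal
}
```

Even for the widest band, ordinalBounds(1024, 7300f, 12000f, 44100000), the quotient 1024 * 12000 / 44100000 is about 0.28, which truncates to 0, so the pair comes out as (1, 0).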
Then I've got a listener with a capture rate of 20000 millihertz:
visualizer.setDataCaptureListener(listener, captureRate, isWaveFormRequested, isFFTRequested)
The values for the above call are as follows:
captureRate = 20000 // in millihertz
isWaveFormRequested = false
isFFTRequested = true
The onFftDataCapture of the listener object looks as follows:
override fun onFftDataCapture(visualizer: Visualizer, bytes: ByteArray, samplingRate: Int) {
    var output = DoubleArray(frequencyOrdinalRanges.size)
    for ((i, ordinalRange) in frequencyOrdinalRanges.withIndex()) {
        var logMagnitudeSum = 0.0
        for (k in ordinalRange) {
            val fftIndex = k * 2
            val currentByte = bytes[fftIndex].toDouble()
            val nextByte = bytes[fftIndex + 1].toDouble()
            val hypot = Math.hypot(currentByte, nextByte)
            val logHypot = Math.log10(hypot)
            logMagnitudeSum += logHypot
        }
        val result = (logMagnitudeSum / (ordinalRange.last - ordinalRange.first + 1)).toDouble()
        output[i] = result
    }
    // do something else with output
}
Now the problem I'm facing with onFftDataCapture is this line:
val hypot = Math.hypot(currentByte, nextByte)
It often evaluates to 0, which makes the following line evaluate to -Infinity and ultimately gives me an array full of Infinity values that I can't do anything with.
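To make the failure mode concrete, here is a minimal sketch of how a single zero-magnitude bin poisons a whole band. The names logMagnitude and averageLogMagnitude are hypothetical helpers for illustration only; they mirror the hypot/log10 steps of the listener above but are not part of it.

```kotlin
import kotlin.math.hypot
import kotlin.math.log10

// Hypothetical helper: log-magnitude of one FFT bin from its real and imaginary bytes.
// When both bytes are 0, hypot is 0.0 and log10(0.0) is -Infinity.
fun logMagnitude(re: Byte, im: Byte): Double =
    log10(hypot(re.toDouble(), im.toDouble()))

// Hypothetical helper: mean log-magnitude over a band given as (real, imaginary) pairs.
// A single -Infinity term makes the whole mean -Infinity.
fun averageLogMagnitude(bins: List<Pair<Byte, Byte>>): Double =
    bins.map { (re, im) -> logMagnitude(re, im) }.average()
```

For example, a band containing the bins (3, 4) and (0, 0) averages to -Infinity, even though the first bin alone has a perfectly finite log-magnitude of log10(5).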
This leads me to believe that I am doing something very wrong, but I am not sure what or how to fix it.
This answer looks more or less like what I am trying to do, but then again, I am no expert in audio analysis, so all the finer details totally escape me.
The way to extract 10-band equalization information from mp3 format
Can someone tell me what I am doing wrong, or what I am missing?
The problem with my code was quite silly: I was using samplingRate in millihertz, but the formula expects the sampling rate to be in hertz. Dividing samplingRate by 1000 fixed the problem.
The range computation changes to:
val samplingRateInHz = samplingRate / 1000
val frequencyOrdinalRanges: List<IntRange> =
    targetEndpoints.zipWithNext { a, b ->
        val startOrdinal = 1 + (captureSize * a / samplingRateInHz).toInt()
        val endOrdinal = (captureSize * b / samplingRateInHz).toInt()
        startOrdinal..endOrdinal
    }
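As a sanity check, the corrected computation can be wrapped in a standalone function and run against the numbers from the question. This is only a sketch: bandRanges and its parameter names are hypothetical, and the final startOrdinal..endOrdinal expression is assumed from the List<IntRange> type.

```kotlin
// Hypothetical helper: maps band edges in Hz to inclusive FFT bin ranges.
// samplingRateMilliHz is the value as received (in millihertz), e.g. 44100000.
fun bandRanges(endpointsHz: List<Float>, captureSize: Int, samplingRateMilliHz: Int): List<IntRange> {
    val samplingRateInHz = samplingRateMilliHz / 1000 // millihertz -> hertz
    return endpointsHz.zipWithNext { a, b ->
        val startOrdinal = 1 + (captureSize * a / samplingRateInHz).toInt()
        val endOrdinal = (captureSize * b / samplingRateInHz).toInt()
        startOrdinal..endOrdinal // ascending, so last - first + 1 is the bin count
    }
}
```

With the endpoints from the question, captureSize = 1024 and a sampling rate of 44100000 millihertz, this gives the ranges 1..9, 10..20, 21..34, 35..53, 54..78, 79..116, 117..169 and 170..278. The divisor ordinalRange.last - ordinalRange.first + 1 in the listener is now the number of bins in each band, whereas for 1 downTo 0 it evaluated to 0.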