Retrieving frequency ranges from android visualizer FFT


I am currently attempting to do something very similar (or maybe the same) as the following question:

Getting variable frequency ranges with androids visualizer class

However, the selected answer has a few bugs, and I'm not a DSP/audio expert at all, so I'm learning as I go.

My goal is to split the FFT that I'm getting from the Android Visualizer class into frequency bands. Specifically, these bands:

  1. 0Hz - 400Hz
  2. 400Hz - 900Hz
  3. 900Hz - 1500Hz
  4. 1500Hz - 2300Hz
  5. 2300Hz - 3400Hz
  6. 3400Hz - 5000Hz
  7. 5000Hz - 7300Hz
  8. 7300Hz - 12000Hz

I have the following code, at the top of my class:

private val targetEndpoints = listOf(0f, 400f, 900f, 1500f, 2300f, 3400f, 5000f, 7300f, 12000f)
private const val CAPTURE_SIZE = 1024

and then, in the method where I'm trying to get the frequency bands for the current track in MediaPlayer:

    val mp = mediaPlayer!!
    val audioSessionId = mp.getAudioSessionId()
    val visualizer: Visualizer = Visualizer(audioSessionId)
    val captureSizeRange = Visualizer.getCaptureSizeRange().let { it[0]..it[1] }
    val captureSize = CAPTURE_SIZE.coerceIn(captureSizeRange)
    val captureRate: Int = Visualizer.getMaxCaptureRate()
    val isWaveFormRequested: Boolean = false
    val isFFTRequested: Boolean = true

    visualizer.setCaptureSize(captureSize)

    val frequencyOrdinalRanges: List<IntProgression> =
        targetEndpoints.zipWithNext { a, b ->
          val startOrdinal = 1 + (captureSize * a / samplingRate).toInt()
          val endOrdinal = (captureSize * b / samplingRate).toInt()
          startOrdinal downTo endOrdinal
        }

Now this is the point where things are getting a little murky for me because, like I said, I am no Audio expert.

frequencyOrdinalRanges is a List with IntProgressions that go 1 -> 0

For the audio file that I'm using:

captureSize = 1024
samplingRate = 44100000

Those numbers, combined with my frequency bands, pretty much guarantee that startOrdinal will always be 1 and endOrdinal will always be 0.

So my frequencyOrdinalRanges looks like this:

[1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1, 1 downTo 0 step 1]
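The arithmetic behind these values can be reproduced outside Android. The following is only a minimal sketch: `ordinalPairs` is a hypothetical helper name, and it assumes samplingRate is passed in exactly as reported above (44100000), with no unit conversion.

```kotlin
// Band edges in Hz, copied from the question.
val targetEndpoints = listOf(0f, 400f, 900f, 1500f, 2300f, 3400f, 5000f, 7300f, 12000f)

// Hypothetical helper: same ordinal arithmetic as the question,
// returning (startOrdinal, endOrdinal) pairs instead of progressions.
fun ordinalPairs(captureSize: Int, samplingRate: Int): List<Pair<Int, Int>> =
    targetEndpoints.zipWithNext { a, b ->
        val startOrdinal = 1 + (captureSize * a / samplingRate).toInt()
        val endOrdinal = (captureSize * b / samplingRate).toInt()
        startOrdinal to endOrdinal
    }

fun main() {
    // Even the largest edge gives 1024 * 12000 / 44100000 ≈ 0.28, which truncates to 0,
    // so every band prints "1 downTo 0".
    ordinalPairs(1024, 44_100_000).forEach { (start, end) -> println("$start downTo $end") }
}
```

Since every quotient is below 1, `toInt()` truncates each one to 0, which is why all bands collapse onto the same two ordinals.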

Then I register a listener with a capture rate of 20000 millihertz:

visualizer.setDataCaptureListener(listener, captureRate, isWaveFormRequested, isFFTRequested)

The values for the above call are as follows:

captureRate = 20000 // in millihertz
isWaveFormRequested = false
isFFTRequested = true

The onFftDataCapture of the listener object looks as follows:

override fun onFftDataCapture(visualizer: Visualizer, bytes: ByteArray, samplingRate: Int) {
  val output = DoubleArray(frequencyOrdinalRanges.size)
  for ((i, ordinalRange) in frequencyOrdinalRanges.withIndex()) {
    var logMagnitudeSum = 0.0

    for (k in ordinalRange) {
      // For k >= 1, bytes[2k] and bytes[2k + 1] hold the real and imaginary parts of bin k.
      val fftIndex = k * 2
      val currentByte = bytes[fftIndex].toDouble()
      val nextByte = bytes[fftIndex + 1].toDouble()
      val hypot = Math.hypot(currentByte, nextByte)
      val logHypot = Math.log10(hypot)
      logMagnitudeSum += logHypot
      val result = logMagnitudeSum / (ordinalRange.last - ordinalRange.first + 1)
      output[i] = result
    }
  }
  // do something else with output
}

Now the problem I'm facing with onFftDataCapture is that this line:

val hypot = Math.hypot(currentByte, nextByte)

often evaluates to 0, which makes the next line evaluate to -Infinity and ultimately gives me an array full of infinite values that I can't do anything with.
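Two things can produce infinities here. First, log10(0) is -Infinity, so any silent bin poisons the sum. Second, for a progression such as `1 downTo 0` the divisor `last - first + 1` is 0. The sketch below shows one possible guard and is not the approach from the linked answer. `averageLogMagnitude` and its `floor` parameter are hypothetical names, and clamping the magnitude to 1.0 (so a silent bin contributes 0) is an assumption chosen only for illustration.

```kotlin
import kotlin.math.hypot
import kotlin.math.log10

// Hypothetical helper: average log10 magnitude over FFT bins firstBin..lastBin.
// Assumes the Visualizer layout where, for k >= 1, bytes[2k] is the real part
// and bytes[2k + 1] the imaginary part of bin k.
fun averageLogMagnitude(
    bytes: ByteArray,
    firstBin: Int,
    lastBin: Int,
    floor: Double = 1.0 // assumed clamp; log10(1.0) == 0.0
): Double {
    require(firstBin in 1..lastBin) { "need an ascending, non-empty range starting at bin 1 or later" }
    require(2 * lastBin + 1 < bytes.size) { "lastBin is outside the FFT buffer" }
    var sum = 0.0
    for (k in firstBin..lastBin) {
        val re = bytes[2 * k].toDouble()
        val im = bytes[2 * k + 1].toDouble()
        // Clamp before taking the logarithm so a zero magnitude cannot yield -Infinity.
        sum += log10(hypot(re, im).coerceAtLeast(floor))
    }
    // The range is ascending and non-empty, so the divisor is at least 1.
    return sum / (lastBin - firstBin + 1)
}
```

With this guard an all-zero buffer averages to 0.0 rather than an infinite value, and a descending range is rejected up front instead of silently dividing by zero.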

This leads me to believe that I am doing something very wrong, but I am not sure what or how to fix it.

This answer looks more or less like what I am trying to do, but then again, I am no expert in audio analysis, so all the finer details totally escape me.

The way to extract 10-band equalization information from mp3 format

Can someone tell me what I am doing wrong, or what I am missing?


Solution

  • The problem with my code was quite silly: I was passing samplingRate in millihertz, but the formula expects the sampling rate in hertz.

    Dividing samplingRate by 1000 fixed the problem.

    The range computation changes to:

    // Visualizer reports the sampling rate in millihertz; convert to hertz first.
    val samplingRateInHz = samplingRate / 1000
    val frequencyOrdinalRanges: List<IntRange> =
        targetEndpoints.zipWithNext { a, b ->
          val startOrdinal = 1 + (captureSize * a / samplingRateInHz).toInt()
          val endOrdinal = (captureSize * b / samplingRateInHz).toInt()
          startOrdinal..endOrdinal
        }
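    As a sanity check, the corrected computation can be run on its own with the values from the question (a capture size of 1024 and a reported sampling rate of 44100000 mHz). This is only a sketch: `bandRanges` is a hypothetical wrapper around the snippet above, not part of the Visualizer API.

```kotlin
// Hypothetical wrapper around the corrected computation.
// samplingRateMilliHz is the value as reported by Visualizer (millihertz).
fun bandRanges(endpointsHz: List<Float>, captureSize: Int, samplingRateMilliHz: Int): List<IntRange> {
    val samplingRateInHz = samplingRateMilliHz / 1000
    return endpointsHz.zipWithNext { a, b ->
        // FFT bin k sits at k * samplingRate / captureSize Hz, so the bin for f Hz is f * captureSize / samplingRate.
        val startOrdinal = 1 + (captureSize * a / samplingRateInHz).toInt()
        val endOrdinal = (captureSize * b / samplingRateInHz).toInt()
        startOrdinal..endOrdinal
    }
}

fun main() {
    val endpoints = listOf(0f, 400f, 900f, 1500f, 2300f, 3400f, 5000f, 7300f, 12000f)
    println(bandRanges(endpoints, 1024, 44_100_000))
    // prints [1..9, 10..20, 21..34, 35..53, 54..78, 79..116, 117..169, 170..278]
}
```

    The ranges are now ascending and non-overlapping, and the highest byte index touched by the listener (2 * 278 + 1 = 557) stays inside the 1024-byte FFT buffer.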