c, signal-processing, fixed-point

Fixed-point: out_Q15 = antilog( in_Q25 ) calculation, avoiding overflow?


I am using a third-party fixed-point antilog() function to convert a decibel value to magnitude: out_mag = 10^( in_db/20 ). antilog() takes Q6.25 format as input and produces Q16.15 at the output.

The problem is that antilog() overflows at higher dB values, such as 100 dB: 10^( 100/20 ) = 100000. The largest integer part the Q16.15 format can hold is 2^16-1 = 65535, so 100000 doesn't fit.

Is there a trick to avoid the overflow? Prescale the input value somehow?


Solution

  • I managed to find a solution. It's a bit tricky.

    First, a struct that will hold the output result is needed:

    typedef struct
    {
        q15   v; // Value (within the [MIN, MAX] range for Q16.15).
        int32 s; // Scalefactor.
    } q15_t;
    

    The idea is to return the result as an Output value together with a Scalefactor, where

    Output = 10^y
    Scale  = 2^scalefactor
    

    The final output is Output shifted left by scalefactor bits, that is, Output * 2^scalefactor.
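
    As a minimal sketch of that last step, the value and scalefactor can be recombined in a wider integer. The typedefs for q15 and int32 and the helper name q15_scaled_to_wide are assumptions for illustration; the third-party library may define its types differently.

```c
#include <assert.h>
#include <stdint.h>

// Assumed typedefs: the struct above uses q15 and int32 without defining them.
typedef int32_t q15;   // Q16.15: sign bit, 16 integer bits, 15 fraction bits
typedef int32_t int32;

typedef struct
{
    q15   v; // Value (within the [MIN, MAX] range for Q16.15).
    int32 s; // Scalefactor: the true value is v * 2^s.
} q15_t;

// Hypothetical helper: widen to 64 bits and apply the scalefactor.
// The result still has 15 fraction bits, but is no longer limited to 16 integer bits.
int64_t q15_scaled_to_wide(q15_t x)
{
    assert(x.s >= 0 && x.s < 31);               // keep the product inside 64 bits
    return (int64_t)x.v * ((int64_t)1 << x.s);  // multiply instead of shifting a possibly negative value
}
```

    For example, v = 50000 (stored as 50000 << 15) with s = 1 recombines to 100000 in the same 15-fraction-bit representation.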

    Here is the math.

    The input is a dB value in Q31 format, normalized to [-1, 1], with the scale being 2^scalefactor (so the actual dB value is 2^scalefactor * in). We need to calculate:

    Out = 10^(2^scalefactor * in/20.0)
    
        = 10^(p+y)                  // split the exponent into a sum
        = 10^p          * 10^y      // a sum of exponents is a product of powers
        = 2^scalefactor * 10^y      // choose p so that 10^p is a power of 2, i.e. just a shift
    

    This way we are not limited by the Q16.15 maximum value.

    We already know 2^scalefactor, but need to find y:

    2^scalefactor * in/20.0 = p + y
    
    10^p = 2^scalefactor   =>   p = scalefactor*log10(2)       // take log10 of both sides
    2^scalefactor * in/20.0 = scalefactor*log10(2) + y          // replace p
    y = 2^scalefactor * in/20.0 - scalefactor*log10(2)          // and solve for y
    

    Calculate y, and feed it into antilog.
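
    A rough sketch of computing y in integer arithmetic, following the derivation literally: the exponent is 2^scalefactor * in / 20 (the same exponent as in the Out formula), and p = scalefactor * log10(2) is subtracted from it. The function name antilog_arg_q25 and the constant LOG10_2_Q25 are made up for this example, and the Q31-to-Q6.25 conversion is an assumption about how the input would be passed on to antilog().

```c
#include <assert.h>
#include <stdint.h>

// log10(2) in Q6.25: round(0.30102999566 * 2^25).
#define LOG10_2_Q25 10100891

// Hypothetical helper. in_q31 is the normalized dB input (Q31) and s is the
// scalefactor, so the actual level is 2^s * in dB.
// Returns y = 2^s * in / 20 - s * log10(2) in Q6.25, ready for antilog().
int32_t antilog_arg_q25(int32_t in_q31, int s)
{
    // For s <= 7 the result stays below log10(65536), so 10^y fits Q16.15.
    assert(s >= 0 && s <= 7);
    int64_t db_q31  = (int64_t)in_q31 * ((int64_t)1 << s); // undo the input scaling (dB in Q31)
    int64_t exp_q31 = db_q31 / 20;                          // dB/20, still Q31
    int64_t exp_q25 = exp_q31 / 64;                         // Q31 -> Q25: drop 6 fraction bits
    return (int32_t)(exp_q25 - (int64_t)s * LOG10_2_Q25);   // subtract p = s * log10(2)
}
```

    With 100 dB stored as in = 100/128 and scalefactor = 7, this gives y of about 2.893, so antilog() would return roughly 781.25, and 781.25 * 2^7 = 100000. The integer divisions truncate toward zero, which a production version may want to replace with proper rounding.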

    If the input is 100 dB, the output magnitude should be 100000, which doesn't fit into Q16.15 format. Using the above solution, Output = 50000 (this fits into Q16.15!) and scalefactor = 1, meaning the final output is 50000 shifted left by 1 bit. This gives 100000 as the final result. Depending on your implementation, you might get the same result as 25000 with scalefactor = 2, etc. The idea is the same.
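
    One possible way to arrive at the Output of 50000 with scalefactor 1 is to pick the smallest shift for which the antilog result still fits. The sketch below assumes the exponent in_db/20 is already available in Q6.25; pick_scalefactor and both constants are hypothetical names, not part of the third-party library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG10_2_Q25 10100891            // round(log10(2) * 2^25)
#define Y_LIMIT_Q25 (16 * LOG10_2_Q25)  // log10(2^16): 10^y must stay below 65536

// Hypothetical helper. exp_q25 is in_db/20 in Q6.25 (5.0 for 100 dB).
// Stores the reduced antilog argument in *y_q25 and returns the number of
// bits the antilog result has to be shifted left afterwards.
int pick_scalefactor(int32_t exp_q25, int32_t *y_q25)
{
    assert(y_q25 != NULL);
    int s = 0;
    int32_t y = exp_q25;
    while (y >= Y_LIMIT_Q25) { // result would not fit into Q16.15 yet
        y -= LOG10_2_Q25;      // subtracting log10(2) halves 10^y ...
        s++;                   // ... so one more left shift is owed at the end
    }
    *y_q25 = y;
    return s;
}
```

    For 100 dB the loop runs once: y drops from 5.0 to about 4.699, 10^4.699 is 50000, and the returned scalefactor is 1. A real implementation may want a small safety margin below the limit to allow for rounding inside antilog().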