
Fixed-point FIR Filter Output Size


I have a 64-tap FIR filter whose output format I am having trouble understanding. The filter has been implemented using (signed) fixed-point math. In {B,F} format, where B is the word length, and F is the fraction length, the filter inputs are {16,0}, and the coefficients are {16,17}. The heart of the filter is as follows:

for (i = 0 ; i < 32 ; i++) {
    accumulator += coefficients[i] *
        (input[(inputIndex + 64 - i) % 64] +
        input[(inputIndex + 1 + i) % 64]);
}

Each iteration of the for loop produces an output whose format is given by:

{16,17} * ( {16,0} + {16,0} ) = {16,17} * {17,0}
                              = {33,17}

using the rules of fixed-point arithmetic. As there are 32 iterations, it is necessary to add 6 additional bits to the size of the accumulator to prevent overflow. The six bits come from using the (MATLAB) formula:

floor(log2(32)) + 1

as per this document. According to my reasoning, this should result in an output of format {39,17}. Why then does MATLAB report the filter output size as {34,17}? Furthermore, if I want the filter output to be the same format as the input, am I correct in thinking that I need to right-shift by (in the {39,17} case) 22 bits?


Solution

  • This looks fine:

    {16,17} * ( {16,0} + {16,0} ) = {16,17} * {17,0}
                                  = {33,17}
    

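    With 32 iterations, the sum can grow by 5 additional bits (not 6): adding 32 = 2^5 worst-case terms multiplies the magnitude by at most 2^5, so the growth is ceil(log2(32)) = 5. The formula floor(log2(32)) + 1 overcounts by one when the number of terms is an exact power of two. That makes the accumulator {38,17}. MATLAB's {34,17} couldn't be right for all possible inputs. Is it considering particular inputs or the general case?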

    The format of the input {16,0} is an integer with no fraction. So to achieve the same scale as the input, you merely want to shift the fraction out: a right shift of 17, since the accumulator has 17 fraction bits. This truncates. Consider adding 0x10000 (2^16, i.e. 1/2 at that scale) before shifting, a form of rounding.

    If you actually want the result to fit the 16-bit word of the input, you shift right by 22 (possibly adding 0x200000 first to round). That is 5 bits more than the fraction length, so it introduces a scale factor of 1/32 in the transfer function (about 30 dB of attenuation). Fine if that's what the problem demands.
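
Here is a minimal C sketch to check the numbers above. It is only an illustration, not the asker's code or what MATLAB does internally. The helper names (`signed_bits`, `worst_case_accumulator`, `round_shift`) are made up for this example. It assumes 16-bit signed coefficients and inputs, the folded 32-iteration loop from the question, and that `>>` on a negative `int64_t` is an arithmetic shift (true on the common compilers, but implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold v as a two's-complement signed value. */
static int signed_bits(int64_t v)
{
    /* For negative v, ~v (== -v - 1) has the same number of magnitude bits. */
    uint64_t m = (v < 0) ? (uint64_t)~v : (uint64_t)v;
    int bits = 1;                       /* the sign bit */
    while (m != 0) {
        bits++;
        m >>= 1;
    }
    return bits;
}

/* Largest accumulator magnitude for the folded loop: every coefficient and
   every input sits at the most negative 16-bit value, so each product is
   (-2^15) * (-2^16) = 2^31. */
static int64_t worst_case_accumulator(int folded_taps)
{
    const int64_t coeff = INT16_MIN;                /* 16-bit coefficient     */
    const int64_t pair  = 2 * (int64_t)INT16_MIN;   /* input + input, {17,0}  */
    return (int64_t)folded_taps * coeff * pair;
}

/* Right shift with rounding to nearest (ties go toward +infinity).
   Assumption: >> on a negative int64_t shifts arithmetically. */
static int64_t round_shift(int64_t acc, int shift)
{
    assert(shift > 0 && shift < 63);
    return (acc + ((int64_t)1 << (shift - 1))) >> shift;
}
```

With these helpers, `signed_bits(worst_case_accumulator(1))` is 33 and `signed_bits(worst_case_accumulator(32))` is 38, which matches {33,17} per product and {38,17} for the sum. `round_shift(acc, 17)` drops the 17 fraction bits and keeps the input's integer scale, while `round_shift(acc, 22)` always fits in 16 bits but comes out scaled down by 32.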