I’m writing a Radix-2 DIT FFT algorithm in VHDL, which requires some fractional multiplication of input data by Twiddle Factor (TF). I use Fixed Point arithmetic’s to achieve that, with every word being 16 bit long, where 1 bit is a sign bit and the rest is distributed between integer and fraction. Therefore my dilemma:
I have no idea, in what range my input data will be, so if I just decide that 4 bits go to integer and the rest 11 bits to fraction, in case I get integer numbers higher than 4 bits = 15 decimal, I’m screwed. The same applies if I do 50/50, like 7 bits to integer and the rest to fraction. If I get numbers, which are very small, I’m screwed because of truncation or rounding, i.e:
Let’s assume I have an integer "3"(0000 0011) on input and TF of "0.7071" ( 0.10110101 - 8 bit), and let’s assume, for simplicity, my data is 8 bit long, therefore:
3x0.7071 = 2.1213
3x0.7071 = 0000 0010 . 0001 1111 = 2.12109375 (for 16 bits).
Here comes the trick - I need to up/down round or truncate 16 bits to 8 bits, therefore, I get 0000 0010, i.e 2 - the error is way too high.
My questions are:
I will appreciate any ideas or information on any known methods.
You don't need to know the fixed-point format of your input. You can safely treat it as normalized -1 to 1 range or full integer-range.
The reason is that your output will have the same format as the input. Or, more likely for FFT, a known relationship like 3 bits increase, which would the output has 3 more integer bits than the input.
It is the core user's burden to know where the decimal point will end up, you have to document the change to dynamic range of course.