Search code examples
formatverilogfixed-point

How to perform division of two Q15 values in Verilog , with out using '/' (division) Operator?


As division operation (/) is expensive in case of FPGA ? Is it possible to perform division of two Q15 format numbers(16 bit fixed point number) with basic shift operations?

Could someone help me by providing some example?

Thanks in advance!


Solution

  • Fixed-point arithmetic is just integer arithmetic with a bit of scaling thrown in. Q15 is a purely fractional format stored as a signed 16-bit integer with scale factor of 215, able to represent values in the interval [-1, 1). Clearly, division only makes sense in Q15 when the divisor's magnitude exceeds the dividend's magnitude, as otherwise the quotient's magnitude exceeds the representable range.

    Before embarking on a custom Verilog implementation of fixed-point division, you would want to check your FPGA vendor's library offerings as a fixed-point library including pipeline division is often available. There are also opens source projects that may be relevant, such as this one.

    When using integer division operators for fixed-point division, we need to adjust for the fact that the division will remove the scale factor, i.e (a * 2scale) / (b * 2scale) = (a/b), while the correct fixed-point result is (a/b * 2scale). This is easily fixed by pre-multiplying the dividend by 2scale, as in the following C implementation:

    int16_t div_q15 (int16_t dividend, int16_t divisor)
    {
         return (int16_t)(((int32_t)dividend << 15) / (int32_t)divisor);
    }
    

    Wikipedia gives a reasonable overwiew on how to implement binary division on a bit-by-bit basis using add, subtract, and shift operations. These methods are closely related to the longhand division taught in grade school. For FPGAs, the use of the non-restoring method if often preferred, as pointed out by this paper, for example:

    Nikolay Sorokin, "Implementation of high-speed fixed-point dividers on FPGA". Journal of Computer Science & Technology, Vol. 6, No. 1, April 2006, pp. 8-11.

    Here is C code that shows how the non-restoring method may be used for the division of 16-bit two's-complement operands:

    /* bit-wise non-restoring two's complement division */
    void int16_div (int16_t dividend, int16_t divisor, int16_t *quot, int16_t *rem)
    {
        const int operand_bits = (int) (sizeof (int16_t) * CHAR_BIT);
        uint16_t d = (uint16_t)divisor;
        uint16_t nd = 0 - d; /* -divisor */
        uint16_t r, q = 0; /* remainder, quotient */
        uint32_t dd = (uint32_t)d << operand_bits; /* expanded divisor */
        uint32_t pp = dividend; /* partial remainder */
        int i;
    
        for (i = operand_bits - 1; i >= 0; i--) {
            if ((int32_t)(pp ^ dd) < 0) {
                q = (q << 1) + 0;   /* record quotient bit -1 (as 0) */
                pp = (pp << 1) + dd;
            } else {
                q = (q << 1) + 1;   /* record quotient bit +1 (as 1) */
                pp = (pp << 1) - dd;
            }
        }
        /* convert quotient from digit set {-1,1} to plain two's complement */
        q = (q << 1) + 1;
    
        /* remainder is upper half of partial remainder */
        r = (uint16_t)(pp >> operand_bits);
    
        /* fix up cases where we worked past a partial remainder of zero */
        if (r == d) { /* remainder equal to divisor */
            q = q + 1;
            r = 0;
        } else if (r == nd) { /* remainder equal to -divisor */
            q = q - 1;
            r = 0;
        }
    
        /* for truncating division, remainder must have same sign as dividend */
        if (r && ((int16_t)(dividend ^ r) < 0)) {
            if ((int16_t)q < 0) {
                q = q + 1;
                r = r - d;
            } else {
                q = q - 1;
                r = r + d;
            }
        }
        *quot = (int16_t)q;
        *rem  = (int16_t)r;
    }
    

    Note that there are multiple ways of dealing with the various special cases that arise in non-restoring division. For example, one frequently sees code that detects a zero partial remainder pp and exits the loop over the quotient bits early in this case. Here I assume that an FPGA implementation would unroll the loop completely to create a pipelined implementation, in which case early termination is not helpful. Instead, a final correction is applied to those quotients that are affected by ignoring a partial remainder of zero.

    In order to create a Q15 division from the above, we have to make just a single change: incorporating the up-scaling of the dividend. Instead of:

    uint32_t pp = dividend; /* partial remainder */
    

    we now use this:

    uint32_t pp = dividend << 15; /* partial remainder; incorporate Q15 scaling */
    

    The resulting C code (sorry, I won't provide read-to-use Verilog code) including the test framework is:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <limits.h>
    #include <math.h>
    
    /* bit-wise non-restoring two's complement division */
    void q15_div (int16_t dividend, int16_t divisor, int16_t *quot, int16_t *rem)
    {
        const int operand_bits = (int) (sizeof (int16_t) * CHAR_BIT);
        uint16_t d = (uint16_t)divisor;
        uint16_t nd = 0 - d; /* -divisor */
        uint16_t r, q = 0; /* remainder, quotient */
        uint32_t dd = (uint32_t)d << operand_bits; /* expanded divisor */
        uint32_t pp = dividend << 15; /* partial remainder, incorporate Q15 scaling */
        int i;
    
        for (i = operand_bits - 1; i >= 0; i--) {
            if ((int32_t)(pp ^ dd) < 0) {
                q = (q << 1) + 0;   /* record quotient bit -1 (as 0) */
                pp = (pp << 1) + dd;
            } else {
                q = (q << 1) + 1;   /* record quotient bit +1 (as 1) */
                pp = (pp << 1) - dd;
            }
        }
        /* convert quotient from digit set {-1,1} to plain two's complement */
        q = (q << 1) + 1;
    
        /* remainder is upper half of partial remainder */
        r = (uint16_t)(pp >> operand_bits);
    
        /* fix up cases where we worked past a partial remainder of zero */
        if (r == d) { /* remainder equal to divisor */
            q = q + 1;
            r = 0;
        } else if (r == nd) { /* remainder equal to -divisor */
            q = q - 1;
            r = 0;
        }
    
        /* for truncating division, remainder must have same sign as dividend */
        if (r && ((int16_t)(dividend ^ r) < 0)) {
            if ((int16_t)q < 0) {
                q = q + 1;
                r = r - d;
            } else {
                q = q - 1;
                r = r + d;
            }
        }
        *quot = (int16_t)q;
        *rem  = (int16_t)r;
    }
    
    int main (void)
    {
        uint16_t dividend, divisor, ref_q, res_q, res_r;
        double quot, fxscale = (1 << 15);
    
        dividend = 0;
        do {
            printf ("\r%04x", dividend);
            divisor = 1;
            do {
                quot = trunc (fxscale * (int16_t)dividend / (int16_t)divisor);
                /* Q15 can only represent numbers in [-1, 1) */
                if ((quot >= -1.0) && (quot < 1.0)) {
                    ref_q = (int16_t)((((int32_t)(int16_t)dividend) << 15) / 
                                      ((int32_t)(int16_t)divisor));
                    q15_div ((int16_t)dividend, (int16_t)divisor, 
                             (int16_t *)&res_q, (int16_t *)&res_r);
                    if (res_q != ref_q) {
                        printf ("!r dividend=%04x (%f) divisor=%04x (%f)  res=%04x (%f)  ref=%04x (%f)\n", 
                                dividend, (int16_t)dividend / fxscale, 
                                divisor, (int16_t)divisor / fxscale, 
                                res_q, (int16_t)res_q / fxscale, 
                                ref_q, (int16_t)ref_q / fxscale);
                    }
                }
                divisor++;
            } while (divisor);
            dividend++;
        } while (dividend);
    
        return EXIT_SUCCESS;
    }