Search code examples
time-complexitycrcpolynomialspayloadhamming-distance

Computation of 64 bit CRC polynomial performance


I found the following page in the web: https://users.ece.cmu.edu/~koopman/crc/crc64.html

It lists the performance of a handful of 64 bit CRC polynomials. The optimal payload for a hamming distance of 3 is listed as 18446744073709551551 bit. A polynomial providing that HD 3 payload is 0xd6c9e91aca649ad4 (Koopman notation).

On the same website there is also some basic "HDLen" C code that can compute the performance of any polynomial (https://users.ece.cmu.edu/~koopman/crc/hdlen.html). I checked that code and the HD 3 optimized loop is very simple, similar to this:

Poly_t accum = cPoly;   
Length_t len = 0;

while(accum != cTopBitSet)
{
    accum = (accum & 1) ? (accum >> 1) ^ cPoly) : (accum >> 1);
    len++;  
}

18446744073709551551 is a huge number. It is almost the full range of a 64 bit integral. Even that simple loop would run centuries on the most powerful CPU core available.

It also appears to me that this loop can not be parallelized since each iteration depends from the previous iteration.

It is claimed that payload is optimal amongst all possible 64 bit polynomials which means that all possible 64 bit polynomials would have been checked for their individual HD 3 performance. This task can be parallelized, still the huge number of candidate polynomials seems to be undoable.

I can't see a way to even compute a single (good) polynomial's (HD 3) performance. Not to mention all possible 64 bit wide polynomials.

So I wonder: How has the number been found? What kind of code or method (in contrast to the simple HDLen software) was used to find the mentioned optimal HD 3 payload?


Solution

  • It is a primitive polynomial, where it can be shown that the HD=3 length of any primitive polynomial over GF(2) is 2n-(n+1), where n is the degree of the polynomial.

    It can be shown pretty quickly whether a polynomial over a finite field is primitive or not.

    Also, it is possible to compute the CRC of a very sparse codeword of n bits in O(log n) time instead of O(n) time. Here is an example in C, demonstrating the case mentioned for the provided CRC:

    #include <stdio.h>
    #include <stdint.h>
    
    // Jones' 64-bit primitive polynomial (the constant excludes the x^64 term):
    // 1 + x^3 + x^5 + x^7 + x^8 + x^10 + x^12 + x^13 + x^16 + x^19 + x^22 + x^23 +
    // x^26 + x^28 + x^31 + x^32 + x^34 + x^36 + x^37 + x^41 + x^44 + x^46 + x^47 +
    // x^48 + x^49 + x^52 + x^55 + x^56 + x^58 + x^59 + x^61 + x^63 + x^64
    #define POLY 0xad93d23594c935a9
    #define HIGH 0x8000000000000000         // high bit set
    
    // Return polynomial a times polynomial b modulo p (POLY). a must be non-zero.
    static uint64_t multmodp(uint64_t a, uint64_t b) {
        uint64_t prod = 0;
        for (;;) {
            if (a & 1) {
                prod ^= b;
                if (a == 1)
                    break;
            }
            a >>= 1;
            b = b & HIGH ? (b << 1) ^ POLY : b << 1;
        }
        return prod;
    }
    
    // x2n_table[n] is x^2^n mod p.
    static uint64_t x2n_table[64];
    
    // Initialize x2n_table[].
    static void x2n_table_init(void) {
        uint64_t p = 2;                 // first entry is x^2^0 == x^1
        x2n_table[0] = p;
        for (size_t n = 1; n < 64; n++)
            x2n_table[n] = p = multmodp(p, p);
    }
    
    // Compute x^n modulo p. This takes O(log n) time.
    static uint64_t xtonmodp(uintmax_t n) {
        uint64_t x = 1;
        int k = 0;
        for (;;) {
            if (n & 1)
                x = multmodp(x2n_table[k], x);
            n >>= 1;
            if (n == 0)
                break;
            k++;
        }
        return x;
    }
    
    // Feed n zero bits into the CRC, taking O(log n) time.
    static uint64_t crc64zeros(uint64_t crc, uint64_t n) {
        return multmodp(xtonmodp(n), crc);
    }
    
    // Feed one one bit into the CRC.
    static uint64_t crc64one(uint64_t crc) {
        return crc & HIGH ? crc << 1 : (crc << 1) ^ POLY;
    }
    
    // Return the CRC-64 of one one bit, followed by n zero bits, followed by one
    // more one bit.
    static uint64_t crc64_one_zeros_one(uint64_t n) {
        return crc64one(crc64zeros(crc64one(0), n));
    }
    
    int main(void) {
        x2n_table_init();
        uint64_t n = -2;    // code word with 2^64 bits: a 1, 2^64-2 0's, and a 1
        printf("%llx\n", crc64_one_zeros_one(n));   // prints 0
        return 0;
    }
    

    That calculation completes in about 7.4 µs on my machine. As opposed to the bit-at-a-time calculation, which would take about 560 years on my machine.