Search code examples
c++carmneonintrinsics

Arm NEON and poly8_t and poly16_t


I've been looking into neon optimisation with intrinsics recently and I have come across the poly8_t and poly16_t data types. I'm then left wondering what on earth they are.

I've searched all across the net but so far have been unable to find ANY explanation of what they are.

Can anyone explain them to me?

Edit: Thanks for those answers but why, if it is just a different way of multiplying etc, does it have a totally different data type?


Solution

  • Left = regular multiplication, Right = carryless multiplication

            1 1 0 1                              1 1 0 1
         *  1 0 0 1                              1 0 0 1
       ------------        -->              --------------
         (1)1 1 0 1  <-- (1) is carry            1 1 0 1
          0 0 0 0                              0 0 0 0 
        0 0 0 0                              0 0 0 0
      1 1 0 1        +                     1 1 0 1         + GF(2) or XOR
      -------------                        ---------------
      1 1 1 0 1 0 1                        1 1 0 0 1 0 1
    

    Each 1 or 0 in the diagonally descending matrix represents partial product of one source bit from the vector '1101' and one source bit from the other vector '1001'.

    The applications of the right one are in CRC, (ECC) Error Correction Code calculations (Reed Solomon, BCH) and cryptography (elliptic curves, internals of AES).

    Illustrating the connection to polynomial multiplication, the operation above can summarized as

     1101 == x^3 + x^2 + 0 + 1;
     1001 == x^3 + 0   + 0 + 1;
    

    Regular polynomial multiplication being: p(x) * (x^3 + 1) == p(x)*x^3 + p(x) ==

     (x^3 + x^2 + 1)(x^3 + 1) == x^6+x^5+x^3 + x^3+x^2+1 
                              == 1x^6 + 1x^5 + 0x^4 + 2x^3 + 1^x2 + 0x + 1
                              == "1102101"
    

    In GF(2) each coefficient is simply calculated modulo 2, making 1100101b.

    The datatype in GF looks just like uint8_t, uint16_t or perhaps upto 128_t in respect that the datatype for GF(2^8) holds 256 unique bitpatterns. However e.g. the bitpattern '00010001' e.g. has no traditional interpretation. (It's not 17 decimal, but perhaps 123th power of "unity" modulo some other polynomial.) Multiplying this number with the same "unity" modulo the generator polynomial g(x) leads to 124th power and so on. Then the properties (identities) of the finite fields have just interesting applications -- such that one can (remotely) easily calculate what 32-bit number to append to a file to make it's 32-bit CRC to match; or one can use the properties to parallelize crc calculation, or to implement bignum multiplication with Fourier-like transform in Finite fields (Number Theoretic Transform).