AES-NI intrinsics in Cython?


Is there a way to use AES-NI instructions within Cython code?

The closest I could find is a thread where someone accessed SIMD instructions: https://groups.google.com/forum/#!msg/cython-users/nTnyI7A6sMc/a6_GnOOsLuQJ

A thread about AES-NI in Python, "Python support for AES-NI", was never answered.


Solution

  • You should be able to declare the intrinsics as if they were ordinary C functions in Cython. Something like:

    cdef extern from "emmintrin.h": # I'm going off the microsoft documentation for where the headers are
        # define the datatype as an opaque type
        ctypedef struct __m128i:
            pass

        __m128i _mm_set_epi32(int i3, int i2, int i1, int i0)

    cdef extern from "wmmintrin.h":
        # one AES decryption round, not a full decryption;
        # GCC and Clang need the -maes flag to compile this
        __m128i _mm_aesdec_si128(__m128i v, __m128i rkey)

    # then in some Cython function
    def f():
        cdef __m128i v = _mm_set_epi32(1, 2, 3, 4)
        cdef __m128i key = _mm_set_epi32(5, 6, 7, 8)
        cdef __m128i result = _mm_aesdec_si128(v, key)
    

    The next question is "how do I apply this over a bytes array?" First, get a char* that points at the array's buffer. Then iterate over it in steps of 16 with range, being careful not to run off the end.

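    # assuming you already have an __m128i key, and that _mm_set_epi8 and
    # _mm_storeu_si128 (both in emmintrin.h) are declared like the functions above
    cdef __m128i v, result  # cdef declarations are not allowed inside the loop
    # auto conversion; this should be a mutable buffer such as a bytearray,
    # since a bytes object is immutable and must not be written to
    cdef char* array = python_bytes_array
    cdef Py_ssize_t i
    cdef Py_ssize_t n = len(python_bytes_array)

    # you NEED to ensure that the byte array has a length divisible by
    # 16, otherwise you'll run off the end and probably get a segmentation fault.
    if n % 16 != 0:
        raise ValueError("length must be a multiple of 16")
    for i in range(0, n, 16):
        # go over in chunks of 16
        v = _mm_set_epi8(array[i+15],array[i+14],array[i+13],
                # etc... fill in the rest 
                array[i+1], array[i])

        result = _mm_aesdec_si128(v, key)

        # write the 16 result bytes back to the same place; _mm_extract_epi8
        # won't work with a loop variable because its index must be a
        # compile-time constant, so store the whole register instead
        _mm_storeu_si128(<__m128i*>(array + i), result)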
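
It can help to prototype the chunking logic in plain Python before moving it into Cython, since an off-by-one there only raises an exception instead of crashing the interpreter. The sketch below is only an illustration: `process_in_place`, `xor_block` and `BLOCK` are hypothetical names, and `xor_block` is a stand-in for the intrinsic call, not AES.

```python
BLOCK = 16  # number of bytes in one __m128i


def xor_block(block: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in for the per-block intrinsic call (NOT AES)."""
    return bytes(a ^ b for a, b in zip(block, key))


def process_in_place(buf: bytearray, transform_block) -> None:
    """Apply transform_block to each 16-byte chunk of buf, writing back in place."""
    if len(buf) % BLOCK != 0:
        # Refuse partial blocks instead of reading past the end of the buffer.
        raise ValueError("buffer length must be a multiple of 16")
    view = memoryview(buf)
    for i in range(0, len(buf), BLOCK):
        # Slice assignment on a memoryview requires the same length,
        # so a transform that returns the wrong size fails loudly.
        view[i:i + BLOCK] = transform_block(bytes(view[i:i + BLOCK]))


if __name__ == "__main__":
    data = bytearray(range(32))
    key = bytes([0xFF]) * BLOCK
    process_in_place(data, lambda block: xor_block(block, key))
    print(data.hex())
```

The buffer is a `bytearray` rather than `bytes` because it is modified in place; the same applies to whatever object backs the `char*` in the Cython version.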