Search code examples
pythonarraysnumpyproductcustomization

Customized operation on two NumPy arrays with standard scalar product structure


Is there a way to exploit the standard scalar product structure between two arrays in a customized way?

To make it more understandable, I would like to use this type of operation:

arr1 = array([a1, b1])
arr2 = array([a2, b2])
scalar_product = arr1@arr2

-> where scalar_product is equal to: a1 * a2 + b1 * b2

But instead of '*' and '+' between elements, I would like to use some of my own special methods of addition and multiplication, so it would be something like:

some_special_scalar_product(arr1, arr2) = my_sum(my_mult(a1, a2), my_mult(b1, b2))

Extra information:

  • The actual inputs of the arrays are strings, and it has to stay strings (for further context the strings are byte representation of Galois Field elements, though it is not essential to understand what it means to answer my question. Just consider that it does make sense in my case to talk about some special sum and multiplication on strings).
  • The reason why I would like to do that is to use the efficiency of NumPy operations rater than going for some inefficient 'for loops' with my customized methods. So only solutions matching this efficiency criteria (if possible at all) should be considered.
  • If this is not possible, do you have any other suggestion to do this type of operation efficiently? (best ways of storing the strings and accessing them in this situation, etc...)

For full details, I have those classes and functions (last part of 'encode()' method is—as I indicate in comments—where I would like to use the customized dot product):

class BinaryDomains():

    def add(self, x, y):
        return format(int(x, base=2)^int(y, base=2), '08b')  # Use bitwise XOR operator

    def multiply(self, x, y):
        prod = 0
        P_ = 0b10100110                  # P_ = irreducible polynomial without last bit (stay on 8 bits)

        #string -> byte:
        x = int(x, base=2)
        y = int(y, base=2)

        while y != 0:
            if y & 1:                    # y has a d°0:
                prod ^= x                # Add with XOR

            if x & 0b10000000:           # x is d°7:
                x = ((x ^ P_) << 1) ^ 1  # Reduce with XOR, increase degree, turn last bit into 1 (missing part of P_)
            else:
                x <<= 1                  # Increase degree

            y >>= 1                      # Get rid of y0 coefficient, decrease degree

        return format(prod, '08b')

class ReedSolomon():

    def __init__(self, k, n, x):
        """
        Args:
            k (int): dimension of message to transmit
            n (int): size of the bloc to transmit
            x (liste de string de taille n): points xi
        """
        self.f = BinaryDomains()
        self.k = k
        self.n = n
        self.x = x

    def encoding(self, message_original):
        bd = self.f
        lst = []

        for i in range(self.n):          # Put xi's powers in a list
            x = self.x[i]
            xpow = x
            lst.append([x])
            for j in range(2, self.k):
                x = bd.multiply(xpow, x)
                lst[i].append(x)

        A = []

        # **** This is where I would like to use some
        #      customised form of dot product:
        #
        for i in range(self.n): # Encode single input with xi's powers -> multiple coded outputs
            A.append(message_original[0])
            for j in range(1, self.k):
                A[i] = bd.add(A[i],
                              bd.multiply(message_original[j], lst[i][j-1]))

        return A

Once again, I can't change type of method inputs/outputs (even though it seems silly) and can't import anything else than NumPy.


Solution

  • The following is based on companion matrices and some clever linear algebra. I believe it works correctly, but I will leave it to you to find some examples to error check my method.

    import numpy as np
    
    class GF8(object):
    
        def __init__(self):
            poly_coeff = np.array([1,0,1,1,0,0,1,0],dtype = int)
            C = np.block([[np.zeros(7,dtype = int),poly_coeff[0]],[np.eye(7,dtype = int),poly_coeff[1:,None]]])
            self.C = C
            current = np.eye(8,dtype = int)
            _pows = [current]
            for _ in range(7):
                current = current@C%2
                _pows.append(current)
    
            self.pows = np.stack(_pows)
    
        def to_mat(self,x):
            return np.sum([M for i,M in enumerate(self.pows) if (1<<i)&x],axis=0)%2
    
        def to_vec(self,x):
            return np.array([int(bool((1<<i)&x)) for i in range(8)])
    
        def row_rep(self,arr):
            return np.block([self.to_mat(x) for x in arr])
    
        def col_rep(self,arr):
            return np.hstack([self.to_vec(x) for x in arr])
    
        def dot(self,arr_1, arr_2):
            vec_result = self.row_rep(arr_1)@self.col_rep(arr_2)%2
            return sum(b<<i for b,i in enumerate(vec_result))
    

    The following test script

    field = GF8()
    ans = field.dot([0b00000111,0b00000110,0b00000101],[0b10000001,0b10000000,0b01111111])
    print(format(ans,'08b'))
    

    yields the printed answer 00100101.