python numpy vector similarity hamming-distance

Ternary function for Hamming distance, where '2' is wildcard


Let's say I have the following array of vectors x, where the possible values are 0, 1, and 2:

 import numpy as np
 x = np.random.randint(0,3,(10,5), dtype=np.int8)

I want to do a similarity match for all vectors with Hamming distance zero or one, where the rules for matching are:

 1. 0 == 0 and 1 == 1, i.e. the Hamming distance is 0
 2. 2 matches both 1 and 0, i.e. the Hamming distance is 0
 3. otherwise the Hamming distance is 1

That is, I need some arithmetic operation that returns:

0 x 0 = 0
1 x 1 = 0
0 x 1 = 1
1 x 0 = 1
0 x 2 = 0
1 x 2 = 0
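
The table above can be checked mechanically. This is a minimal sketch, assuming the test `a + b == 1` as the arithmetic operation; the helper name `ternary_mismatch` is hypothetical and only used for illustration:

```python
def ternary_mismatch(a, b):
    """Return 1 if a and b mismatch, 0 otherwise; the value 2 is a wildcard."""
    # Among values in {0, 1, 2}, only the pairs (0, 1) and (1, 0) sum to exactly 1.
    return int(a + b == 1)

# The six cases from the table, as (a, b, expected distance).
cases = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1), (0, 2, 0), (1, 2, 0)]
results = [ternary_mismatch(a, b) for a, b, _ in cases]
expected = [e for _, _, e in cases]
```

Here `results` and `expected` should agree for every listed case.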

My output should be the Hamming distance between each vector (row) of x and an arbitrary vector z:

 z = np.random.randint(0,2,5)  # z contains only 0 and 1
 np.sum(np.add(x,z) == 1, axis=1)  # per row, count positions where x + z == 1
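
One way to gain confidence in the sum-based expression is to compare it with the rules written out explicitly. The sketch below assumes a seeded `np.random.default_rng` generator (not in the original snippet) so that the run is reproducible; `dist_sum` and `dist_rule` are illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility (assumption)
x = rng.integers(0, 3, size=(10, 5), dtype=np.int8)  # values in {0, 1, 2}
z = rng.integers(0, 2, size=5, dtype=np.int8)         # values in {0, 1}

# Arithmetic form: a position counts as a mismatch when x + z == 1.
dist_sum = np.sum(np.add(x, z) == 1, axis=1)

# Explicit rule form: a mismatch needs x to be a non-wildcard that differs from z.
dist_rule = np.sum((x != 2) & (x != z), axis=1)

same = np.array_equal(dist_sum, dist_rule)
```

Both forms broadcast `z` across the rows of `x` and give one distance per row.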

Solution

  • int(x+y == 1)
    

    Is there something in this question I'm missing???
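
The scalar expression carries over to arrays unchanged. As a rough sketch, assuming the goal is to keep the rows within distance one of `z` (one possible reading of the question), and using hand-picked example data that is not from the question:

```python
import numpy as np

# Hypothetical example data, chosen so the distances are easy to verify by hand.
x = np.array([[0, 1, 2, 0, 1],
              [0, 1, 0, 0, 1],
              [1, 0, 2, 1, 0]], dtype=np.int8)
z = np.array([0, 1, 1, 0, 1], dtype=np.int8)

# Elementwise: True only where one side is 0 and the other is 1.
mismatch = (x + z) == 1

# Row-wise Hamming distance, with 2 acting as a wildcard.
dist = mismatch.sum(axis=1)

# Indices of the rows whose distance to z is zero or one.
close_rows = np.flatnonzero(dist <= 1)
```

Because a sum of 1 can only come from the pair 0 and 1, the same test also holds if `z` contains wildcards.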