I want to calculate the Hamming distance between very high-dimensional vectors. Each data point is a feature vector. Assume each component f_i is an integer represented in binary form with, say, j bits. There are n = 900 feature components for each data point.
The formula for the Hamming distance between two different vectors is given in the picture below, where j = number of bits.
For example, let n = 10 feature components:
f = [3,4,1,4,5,6,6,7,1,14];
g = [1,3,5,6,7,8,11,3,10,2];
Each element of the array is represented by its 16-bit binary representation using dec2bin(f_i,l).
I tried using dist = sum((f-g).^2,2)* 1/2^l
where l = 16 bits, but this does not make sense because the formula has two summations (one over the components and one over the bits), while this expression only sums over the components.
If I understand correctly, what you want is
sum(bitxor(f,g))/2^l
where l=16
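
To make the difference between the two possible readings concrete, here is a minimal sketch using the f and g from the question. It assumes the components are non-negative integers that fit in l = 16 bits. Note that sum(bitxor(f,g)) weights every differing bit by its place value, whereas the classical Hamming distance simply counts the differing bits. Which of the two matches the formula in the picture is something I cannot tell from the text, so both are shown; the variable names x, bits, hammingBits and weighted are just illustrative.

```matlab
f = [3,4,1,4,5,6,6,7,1,14];
g = [1,3,5,6,7,8,11,3,10,2];
l = 16;                          % bits per component (assumed, as in the question)

% XOR leaves a 1 in every bit position where f_i and g_i differ
x = bitxor(f, g);

% Reading 1: classical Hamming distance, i.e. the number of differing bits.
% dec2bin returns one row of l characters per component.
bits = dec2bin(x, l) == '1';     % n-by-l logical matrix
hammingBits = sum(bits(:));      % 19 for this example

% Reading 2: place-value weighted sum, as in sum(bitxor(f,g))/2^l
weighted = sum(x) / 2^l;         % 71/2^16 for this example
```

Both versions are vectorized over the components, so they should carry over to n = 900 without a loop.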