I’m trying to compare clients, two by two, whose qualities can be defined by binary choices (for example a client uses a product or not).
After much search online, it looks like I’d need to use the Hamming Distance for that, or its equivalent: find the Hamming Weight to the result of an XOR operation between two words.
For a concrete example, Hamming distance between 1001 & 1011:
Calculate the number 1001 XOR 1011= 0010
Hamming weight of 0010 = 1 (numbers of bit set to 1 in 0010)
I need to do that in for words up to 96 bits.
I found some information on
http://people.revoledu.com/kardi/tutorial/Similarity/HammingDistance.html
http://trustedsignal.blogspot.ca/2015/06/xord-play-normalized-hamming-distance.html
and plenty of pieces of code , for example
Hamming weight written only in binary operations?
but only in C, Java , Perl, O, opencl... anything but Excel VBA.
So far, here’s what I manage to put together.
It works, but unfortunately only for words of 30 bits or less, and uses a somewhat crude method: XOR on the two numbers X and Y, then convert to a string that represent the binary number. Then count the length of the string once the 1’s are taken out. I guess there’s a more elegant and efficient way.
Public Function HamDist(x As Long, y As Long, NbBit As Integer)
Dim i As Long, BinStrg As String, bxor As Long
bxor = x Xor y
BinStrg = ""
For i = NbBit To 0 Step -1 ‘going from left to right
If bxor And (2 ^ i) Then
BinStrg = BinStrg + "1" ‘add a 1 to the string
Else
BinStrg = BinStrg + "0"
End If
Next
HamDist = Len(BinStrg) - Len(Replace(BinStrg, "1", "")) ' replace the 1 by nothing and count the length of the resulting string
End Function
Can you help make it works for 96 bit words in VBA for Excel 2010 and below (udf or sub), either by calculating the hamming weight or distance ?
If you store the chain of qualities in String form (e.g. a String consisting only of the letters 'T' and 'F'), this could quite easily be done using a loop.
Function hammingDistance(qualities1 As String, qualities2 As String) As Integer
If Len(qualities1) <> Len(qualities2) Then
hammingDistance = -1
Exit Function
End If
Dim i, result As Integer
result = 0
For i = 1 To Len(qualities1)
If Mid(qualities1, i, 1) <> Mid(qualities2, i, 1) Then result = result + 1
Next
hammingDistance = result
End Function