I am trying to define my own similarity metric, inspired by the Jaccard similarity score. The only extra thing I want on top of Jaccard is for the metric to take the frequency of each label into account. For that purpose I have written this code snippet:
u = [12, 0, 3]
v = [24, 6, 1]
num = 0
den = 0
for i in range(3):
    if u[i] != 0 and v[i] != 0:
        num += u[i] + v[i]
    den += u[i] + v[i]
print(1 - num / den)
So my question is
An approach using NumPy's vectorized operations:
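import numpy as np
arr = np.array([u, v])
s = arr.sum(0)  # per-label totals: u[i] + v[i]
# fraction of the total that belongs to labels that are zero in u or v
(s * (arr == 0).any(0)).sum() / s.sum()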
Output:
0.13043478260869565
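For reuse, the same computation can be wrapped in a function. This is only a minimal sketch: the names `freq_dissimilarity` and `freq_dissimilarity_loop` are made up for illustration, the inputs are assumed to be non-negative counts of equal length, and returning 0.0 when both vectors are all zeros is an assumption, since the original snippet would divide by zero in that case.

```python
import numpy as np


def freq_dissimilarity(u, v):
    """Share of the total count held by labels missing from u or v."""
    arr = np.array([u, v], dtype=float)
    s = arr.sum(axis=0)              # per-label totals u[i] + v[i]
    total = s.sum()
    if total == 0:
        # Assumption: two all-zero vectors are treated as identical.
        return 0.0
    missing = (arr == 0).any(axis=0)  # True where u[i] == 0 or v[i] == 0
    return float(s[missing].sum() / total)


def freq_dissimilarity_loop(u, v):
    """Loop version mirroring the snippet in the question, for cross-checking."""
    num = den = 0
    for a, b in zip(u, v):
        if a != 0 and b != 0:
            num += a + b
        den += a + b
    return 1 - num / den


if __name__ == "__main__":
    u = [12, 0, 3]
    v = [24, 6, 1]
    print(freq_dissimilarity(u, v))       # approximately 0.1304
    print(freq_dissimilarity_loop(u, v))  # approximately 0.1304
```

Both versions agree on the example from the question: the only label missing from one side contributes 6 out of a total of 46. Note that the value is a dissimilarity (0 means every label appears in both vectors), so `1 - freq_dissimilarity(u, v)` gives the similarity.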