python recommendation-engine

A simple way to express the Jaccard score as a mathematical formula


I am unsure how to apply the Jaccard formula as it is explained, for example, on Wikipedia. Consider the following call and its result:

jaccard_score([0, 1, 1], [1, 1, 1])
0.6666666666666666

According to the Jaccard formula, the calculation method is intersection / (size of A + size of B - intersection). So, it should be 2/(3+3-2) = 2/4 = 0.5.

But that is clearly not the value returned. Can anyone show the direct calculation that yields 0.6666666666666666?


Solution

  • You are interpreting the formula incorrectly: "size of A" refers to the cardinality of the set, not the length of the list. The data [0, 1, 1] is a boolean (indicator) representation of a set A over a universe of three elements, where a value of 1 marks an element of the universe that belongs to A. So the "size of A", where A is represented by [0, 1, 1], is 2. Likewise, [1, 1, 1] represents a set B of size 3, and the intersection of A and B has size 2, so the formula gives 2/(2+3-2) = 2/3 ≈ 0.6667.
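
The calculation described above can be checked by hand in plain Python. This is a minimal sketch, not the library's implementation: the helper name `jaccard_from_indicators` is hypothetical, and returning 0.0 when both sets are empty is an assumption made here only to avoid dividing by zero.

```python
def jaccard_from_indicators(a, b):
    """Jaccard index of two sets given as equal-length 0/1 indicator lists."""
    if len(a) != len(b):
        raise ValueError("indicator lists must have the same length")

    # Recover each set: keep the positions (universe elements) flagged with 1.
    set_a = {i for i, flag in enumerate(a) if flag == 1}
    set_b = {i for i, flag in enumerate(b) if flag == 1}

    intersection = len(set_a & set_b)
    # |A ∪ B| = |A| + |B| - |A ∩ B|, using set cardinalities, not list lengths.
    union = len(set_a) + len(set_b) - intersection

    # Assumption for this sketch: two empty sets give 0.0.
    return intersection / union if union else 0.0


# A = {1, 2} has size 2, B = {0, 1, 2} has size 3, the intersection has size 2.
print(jaccard_from_indicators([0, 1, 1], [1, 1, 1]))  # 2 / (2 + 3 - 2) = 0.6666666666666666
```

Using 3 for the size of A (the list length) instead of 2 (the number of ones) is what produces the 0.5 in the question.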