Jaccard Coefficient in R


I'm having trouble understanding what seems to be a pretty easy calculation.

I understand the Jaccard coefficient is |intersection(A,B)| / |union(A,B)|, so why does this return 1?

> sets::gset_similarity(c("1","2"), c("1","2","3"), "Jaccard")
[1] 1

Isn't it 2/3?


Solution

  • It works if you pass actual set objects rather than plain character vectors (see ?sets::set):

    library(sets)  # provides set(), as.set() and gset_similarity()
    gset_similarity(set("1","2"), set("1","2","3"), "Jaccard")
    #[1] 0.6666667
    

    Or, if you have existing vectors and need to convert them:

    gset_similarity(as.set(c("1","2")), as.set(c("1","2","3")), "Jaccard")
    #[1] 0.6666667
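
As a sanity check, the expected 2/3 can also be reproduced with base R alone, without the sets package. This is only a minimal sketch: `jaccard()` is a hypothetical helper name introduced here for illustration, and treating duplicates as a single element and returning NA for two empty inputs are assumptions, not behaviour taken from `gset_similarity`.

```r
# Hand-rolled Jaccard coefficient using only base R set operations.
# jaccard() is a hypothetical helper, not a function from any package.
jaccard <- function(a, b) {
  a <- unique(a)                          # treat inputs as sets: drop duplicates
  b <- unique(b)
  u <- union(a, b)
  if (length(u) == 0) return(NA_real_)    # assumption: undefined for two empty sets
  length(intersect(a, b)) / length(u)     # |A ∩ B| / |A ∪ B|
}

x <- c("1", "2")
y <- c("1", "2", "3")
result <- jaccard(x, y)   # 2 shared elements / 3 distinct elements
print(result)
#[1] 0.6666667
```

If this value and the one from `gset_similarity()` on converted sets agree, the earlier result of 1 most likely came from the input type rather than from the formula.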