Search code examples
rigraphsimilarity

Maximum jaccard similarity in igraph


From the igraph documentation: "The Jaccard similarity coefficient of two vertices is the number of common neighbors divided by the number of vertices that are neighbors of at least one of the two vertices being considered." Therefore, it should achieve its maximum value of 1 when two vertices have identical neighbors. Moreover, in a complete graph where all vertices have identical neighbors (i.e. all other vertices), all Jaccard similarities should be 1. However:

library(igraph)
G <- erdos.renyi.game(5, 1, "gnp")  #A complete graph
similarity.jaccard(G, loops = FALSE)
     [,1] [,2] [,3] [,4] [,5]
[1,]  1.0  0.6  0.6  0.6  0.6
[2,]  0.6  1.0  0.6  0.6  0.6
[3,]  0.6  0.6  1.0  0.6  0.6
[4,]  0.6  0.6  0.6  1.0  0.6
[5,]  0.6  0.6  0.6  0.6  1.0

What am I missing here? Why are the Jaccard similarities of these vertices, which each have identical neighbors, 0.6 and not 1?


Solution

  • The key here is the loops argument

    loops: Whether to include vertices themselves in the neighbor sets.

    IOW, loop = TRUE has each vertex pointing to itself as a neighbor, and, in that case, the Jaccard similarity is everywhere 1 for the graph in the question.

    For loop = FALSE: Each pair of vertices has 3 common neighbors (excluding themselves) and 5 vertices that are neighbors of at least one of them. Take, for example, vertex 1 and vertex 2: vertex 1 has {2, 3, 4, 5} as neighbors, and vertex 2 has {1, 3, 4, 5} as neighbors. The numerator of the Jaccard similarity is the cardinality of the intersection of the two sets, |{3, 4, 5}| = 3. The denominator is the cardinality of the union of the two sets, |{1, 2, 3, 4, 5}| = 5.

    For loop = TRUE: Each pair of vertices has 5 common neighbors (including themselves) and 5 vertices that are neighbors of at least one of them. With the same example, vertex 1 and vertex 2: vertex 1 has {1, 2, 3, 4, 5} as neighbors, and vertex 2 has {1, 2, 3, 4, 5} as neighbors. The numerator of the Jaccard similarity is the cardinality of the intersection of the two sets, |{1, 2, 3, 4, 5}| = 5. The denominator is the cardinality of the union of the two sets, |{1, 2, 3, 4, 5}| = 5.