I have a large graph(100000 nodes) and i want to find its cliques of size 5. I use this command for this goal:
cliques(graph, min=5, max=5)
It takes lots of time to calculate this operation. It seems that it first tries to find all of the maximal cliques of the graph and then pick the cliques with size 5; I guess this because of the huge difference of run time between these two commands while both of them are doing a same job:
adjacent.triangles (graph) # takes about 30s
cliques(graph, min=3, max=3) # takes more than an hour
I am looking for a command like adjacent.triangles
to find clique with size 5 efficiently.
Thanks
There is a huge difference between adjacent.triangles()
and cliques()
. adjacent.triangles()
only needs to count the triangles, while cliques()
needs to store them all. This could easily account for the time difference if there are lots of triangles. (Another factor is that the algorithm in cliques()
has be generic and not limited to triangles - it could be the case that adjacent.triangles()
contains some optimizations since we know that we are interested in triangles only).
For what it's worth, cliques()
does not find all the maximal cliques; it starts from 2-cliques (i.e. edges) and then merges them into 3-cliques, 4-cliques etc until it reaches the maximum size that you specified. But again, if you have lots of 3-cliques in your graph, this could easily become a bottleneck as there is one point in the algorithm where all the 3-cliques have to be stored (even if you are not interested in them) since we need them to find the 4-cliques.
You are probably better off with maximal.cliques()
first to get a rough idea of how large the maximal cliques are in your graph. The idea here is that you have a maximal clique of size k, then all its subsets of size 5 are 5-cliques. This means that it is enough to search for the maximal cliques, keep the ones that are at least size 5, and then enumerate all their subsets of size 5. But then you have a different problem as some cliques may be counted more than once.
Update: I have checked the source code for adjacent.triangles
and basically all it does is that it loops over all vertices, and for each vertex v it enumerates all the (u, w) pairs of its neighbors, and checks whether u and w are connected. If so, there is a triangle adjacent on vertex v. This is an O(nd2) operation if you have n vertices and the average degree is d, but it does not generalize to groups of vertices of arbitrary size (because you would need to hardcode k-1 nested for loops in the code for a group of size k).