For example, I got the symmetric adjacency matrix(no self-loop), i.e
A =
[
[0, 1, 1, 0, 0],
[1, 0, 1, 0, 1],
[1, 1, 0, 1, 0],
[0, 0, 1, 0, 1],
[0, 1, 0, 1, 0]]
Then, I got an array of cluster label i, associated with this graph, i.e
cluster = [1,1,2,2,3]
It means that the node 1 and node 2 are in the same group, node 3 and node 4 are in the same group, node 5 are in a independent group.
The question is that how can I get the sum of the edges within groups and between groups,
Within groups: means the edge between nodes that share the same cluster label, for example, node 1 and node 2 are the in the same group, so the sum is 1 for them, the same for node 3 and node 4. For node 5, its 0.
Between groups: means the edge between nodes that share different cluster label, for example, group 1 and group 2, it means that sum the edge of, node 1 node 3, node 1 node 4, node 2 node 3, node 2 node 4. The answer is 2 between group 1 and group 2.
then return a 2d symmetric array contains the result, i.e
[[1,2,1],
[2,1,1],
[1,1,0]]
You can use matrix algebra;
cluster = np.array(cluster)
# create cluster-node adjacency matrix
aux = np.identity(cluster.max(),int)[cluster-1]
# we can now count by multiplying
out = aux.T@A@aux
# fix diagonal (which was counted twice)
np.einsum("ii->i",out)[...] //= 2
out
# array([[1, 2, 1],
# [2, 1, 1],
# [1, 1, 0]])
To speed this up we can replace the matrix product with (1) if nodes are sorted by cluster:
boundaries = np.diff(cluster,prepend=-1).nonzero()[0]
out = np.add.reduceat(np.add.reduceat(A,boundaries,1),boundaries,0)
(2) if not:
nc = cluster.max()
out = np.zeros((nc,nc),int)
np.add.at(out,(cluster[:,None]-1,cluster-1),A)