Search code examples
rplotdata-visualizationlatticenetwork-analysis

What is the most efficient method to plot large Adjacency Matrices (not Graphs)?


I am searching an efficient way to plot large Adjacency Matrices (>1M nodes) in R. The aim is not to plot them as a Network Graph but to plot the Adjacency Matrix itself.

This is quite a common visualization in the field of network analysis and especially community detection but I can't seem to find a way to achieve such a plot in R. The closest that I can find is the function levelplot() from lattice but it doesn't seem to be able to handle matrices of this size.

Below is an example of the desired result from a paper by Reichardt & Bornholdt.
Adjacency matrices can be downloaded from the Stanford Large Network Dataset Collection.

enter image description here


Solution

  • My first bet was to look at the image method from the Matrix package. An example could be something like:

    #####
    # simulate a random matrix
    n <- 1000000L # number of nodes
    set.seed(1)
    rng_i <- sample.int(n, size = 100L * n, replace = TRUE)
    rng_j <- sample.int(n, size = 100L * n, replace = TRUE)
    i <- c(1:n, rng_i, rng_j)
    j <- c(1:n, rng_j, rng_i)
    x <- runif(n * 100L)
    x <- c(rep(1, n), x, x)
    
    keep <- j <= i & c(rep(TRUE, n), tail(i, -n) != tail(j, -n))
    j <- j[keep]
    i <- i[keep]
    x <- x[keep]
    
    # use the image method from Matrix
    library(Matrix)
    mat <- sparseMatrix(i = i, j = j, x = x, symmetric = TRUE)
    image(mat)
    

    This takes a while but turns out to be based on levelplot from lattice. The output is a big blank plot.

    Thinking a bit about, if you have say a 102 x 102 mm picture (~ 4 x 4 inch) and a 1m x 1m matrix, then there is ~ .0001 x .0001 mm per entry of the matrix if we disregard the axis etc. Not knowing much about human perception or images, I am guessing that you would need a very large number of pixels per inch in order to plot these and I am not sure if this will be perceivable unless there are larger clusters of adjacent non-zero entries.

    On the other hand, if you change n <- 10000L then you get:

    enter image description here

    fairly fast. The above also gives an indication as to how hard it will be perceive boxes that are 1/100 x 1/100 smaller. I am guessing one would have to look for the number of adjacent non-zero nodes in a larger neighborhood but I am not aware of a package that will do that.