Search code examples
rmatrixigraphrandom-walk

More efficient way to run a random walk on a transition matrix than the igraph::random_walk()


I'm trying to create a random walker on a specific transition matrix (20,000 * 20,000) and so far I'm using the igraph::random_walk() function from R's package igraph.

The thing with that function is that gets as input a graph and not the transition matrix. That means that you firstly have to convert your transition matrix into a graph, using the following command:

# Transform transition matrix into graph
g <- igraph::graph.adjacency( as.matrix(tm), mode = "directed", weighted = TRUE )

Since my transition matrix is a 20,000*20,000 matrix, the variable tm occupies around 3.1GB and the corresponding graph g occupies 13.3GB. The disadvantage of this approach is that the script full up the whole memory (32GB RAM system) and sometimes kernel (probably) kills the process.

So I was wondering if there is any other package (couldn't find anything) in R that returns a random walk on the transition matrix, without the need for conversion into a graph firstly.


Solution

  • What about implementing it manually?

    library(igraph)
    set.seed(1)
    resample <- function(x, ...) x[sample.int(length(x), ...)]
    n <- 1000
    tm <- matrix(sample(0:1, n^2, prob = c(0.95, 0.05), replace = TRUE), n, n)
    tm <- (tm == 1 | t(tm) == 1) * 1
    diag(tm) <- 0
    
    start <- 23 # Random walk starting vertex
    len <- 10 # Walk length
    path <- c(start, rep(NA, len))
    for(i in 2:(len + 1)) {
      idx <- tm[path[i - 1], ] != 0
      if(any(idx)) {
        path[i] <- resample(which(idx), 1, prob = tm[path[i - 1], idx])
      } else {
        break # Stopping if we get stuck
      }
    }
    path
    #  [1]   23 3434 4908 4600  332 4266 1752 1845 4847 4817 1992