Search code examples
rgraphpermutationigraphadjacency-matrix

Boggle cheat... erm... solutioning with graphs in R


I have seen a few others posts relating to this game, but none of them was centered around the type of algorithm I've opted for, at least not in much details yet. This is also a pretense for me to learn more about graphs (such as with the igraph package). Needless to say, I don't encourage people to cheat in any situation. This is really a learning challenge I set for myself - it's often through those things I learn the most in the end.

My plan involves some prep work besides the obvious collection of French dictionary.

First big step was to construct an igraph that looks like this, illustrating the allowed connections between Boggle letters. (For those unfamiliar with Boggle, you can only create words from directly adjacent letters, including diagonally. and the longer the words, the bigger the rewards).

igraph built from graph.lattice, adding diagonals manually

The next step (which might not be ideal, but couldn't figure out how to achieve this directly from the igraph Package). Anyway, it was to generate all permutations using gtools:

permutations(n=16, r=3) permutations(n=16, r=4)

and then using the igraph::neigbourhood function to "validate" every single permutation to see if they'd be legit on a Boggle game. We see from the numbers below that the larger the "sample" (the longer the words, if you prefer), the more permutations are rejected. So it's a lot of processing power to gain very little additional information. Clearly not optimal. And as r gets above 7, all hell breaks loose (my 8 Gb of Ram are still not enough!)

4 letter permutations - total : 43680 
                        legit : 1764 (4.0%)
6 letter permutations - total : 5765760 
                        legit : 22672 (0.4%) 
and so forth

So now I'd like to find a way to generate those permutations in a more sensical way (maybe they could be called "paths" or "trajectories"), maybe with a tool such as igraph, so that I don't fry my motherboard for having too much fun. Working with graphs is new to me so it may be standing right in my face, but I can't see anything such as "generate all trajectories passing through N adjacent nodes on the graph" or something similar in the Docs. Maybe it exists but it referred to as "Some Guy's algorithm", guy whom I unfortunately have never heard of before.

I'm pretty happy with the results once all that prep work is through. It's reasonably fast and totally accurate. I'm just stuck with the 7-letter words (5 miserable points hehehe). I might put it on GitHub at some point if ppl are interested. I think people who know about graphs enough should be able to point me in the right direction, that's why I don't think putting any coding in lengths would serve any purpose here.

Thanks in advance!

(For sake of completeness, once the "valid permutations" are computed, I run the resulting words against the dictionary entries and set aside the ones that match. I'm using RSQLite and work with chunks of words of increasing lengths; keeping things separate in that way makes the code pretty easy to follow and also makes db searches pretty fast.)


Solution

  • Here's a recursive solution that finds all paths up to length L.

    Using the graph created by this Gist:

    getPaths <- function(v, g, L = 4) {
      paths <- list()
      recurse <- function(g, v, path = NULL) {
        path <- c(v, path)
    
        if (length(path) >= L) {
          return(NULL)
        } else {    
          for (i in neighbors(g, v)) {
            if (!(i %in% path)) {
              paths[[length(paths) + 1]] <<- c(i, path)
              recurse(g, i, path)
            }
          }
        }
      }
      recurse(g, v)
      return(paths)
    }
    
    allPaths <- lapply(V(g), getPaths, g)
    
    # look at the first few paths from vertex 1:
    > head(allPaths[[1]])
    [[1]]
    [1] 2 1
    
    [[2]]
    [1] 3 2 1
    
    [[3]]
    [1] 4 3 2 1
    
    [[4]]
    [1] 6 3 2 1
    
    [[5]]
    [1] 7 3 2 1
    
    [[6]]
    [1] 8 3 2 1
    

    Edit

    Here's a more efficient solution that only keeps the L-length paths.

    getPaths <- function(v, g, L = 4) {
      paths <- list()
    
      recurse <- function(g, v, path = NULL) {
        path <- c(v, path)
    
        if (length(path) >= L) {
          paths[[length(paths) + 1]] <<- rev(path)      
        } else {    
          for (i in neighbors(g, v)) {
            if (!(i %in% path)) recurse(g, i, path)
          }
        }
      }
      recurse(g, v)
      return(paths)
    }
    
    allPaths <- lapply(V(g), getPaths, g, 4)
    
    L4way <- do.call(rbind, lapply(allPaths, function(x) do.call(rbind, x)))
    
    > head(L4way)
         [,1] [,2] [,3] [,4]
    [1,]    1    2    3    4
    [2,]    1    2    3    6
    [3,]    1    2    3    7
    [4,]    1    2    3    8
    [5,]    1    2    5    6
    [6,]    1    2    5    9
    

    Edit #2:

    library(doSNOW)
    library(foreach)
    
    # this is a very parallel problem and can be parallel-ized easily
    cl <- makeCluster(4)
    registerDoSNOW(cl)
    
    allPaths <- foreach(i = 3:16) %:%
      foreach(v = V(g), .packages = c('igraph')) %dopar% getPaths(v, g, i)
    
    stopCluster(cl)
    
    path.list <- list()
    for (i in seq_along(3:16)) {
      path.list[[i]] <- do.call(rbind, lapply(allPaths[[i]],
          function(x) do.call(rbind, x)))
    }
    

    Number of permutations for L-length words:

    > data.frame(length=3:16, nPerms=sapply(path.list, nrow))
       length  nPerms
    1       3     408
    2       4    1764
    3       5    6712
    4       6   22672
    5       7   68272
    6       8  183472
    7       9  436984
    8      10  905776
    9      11 1594648
    10     12 2310264
    11     13 2644520
    12     14 2250192
    13     15 1260672
    14     16  343184
    

    Total permutations

    > sum(sapply(path.list, nrow))
    [1] 12029540