I have seen a few others posts relating to this game, but none of them was centered around the type of algorithm I've opted for, at least not in much details yet. This is also a pretense for me to learn more about graphs (such as with the igraph package). Needless to say, I don't encourage people to cheat in any situation. This is really a learning challenge I set for myself - it's often through those things I learn the most in the end.
My plan involves some prep work besides the obvious collection of French dictionary.
First big step was to construct an igraph that looks like this, illustrating the allowed connections between Boggle letters. (For those unfamiliar with Boggle, you can only create words from directly adjacent letters, including diagonally. and the longer the words, the bigger the rewards).
The next step (which might not be ideal, but couldn't figure out how to achieve this directly from the igraph Package). Anyway, it was to generate all permutations using gtools:
permutations(n=16, r=3)
permutations(n=16, r=4)
and then using the igraph::neigbourhood
function to "validate" every single permutation to see if they'd be legit on a Boggle game. We see from the numbers below that the larger the "sample" (the longer the words, if you prefer), the more permutations are rejected. So it's a lot of processing power to gain very little additional information. Clearly not optimal. And as r gets above 7, all hell breaks loose (my 8 Gb of Ram are still not enough!)
4 letter permutations - total : 43680
legit : 1764 (4.0%)
6 letter permutations - total : 5765760
legit : 22672 (0.4%)
and so forth
So now I'd like to find a way to generate those permutations in a more sensical way (maybe they could be called "paths" or "trajectories"), maybe with a tool such as igraph, so that I don't fry my motherboard for having too much fun. Working with graphs is new to me so it may be standing right in my face, but I can't see anything such as "generate all trajectories passing through N adjacent nodes on the graph" or something similar in the Docs. Maybe it exists but it referred to as "Some Guy's algorithm", guy whom I unfortunately have never heard of before.
I'm pretty happy with the results once all that prep work is through. It's reasonably fast and totally accurate. I'm just stuck with the 7-letter words (5 miserable points hehehe). I might put it on GitHub at some point if ppl are interested. I think people who know about graphs enough should be able to point me in the right direction, that's why I don't think putting any coding in lengths would serve any purpose here.
Thanks in advance!
(For sake of completeness, once the "valid permutations" are computed, I run the resulting words against the dictionary entries and set aside the ones that match. I'm using RSQLite and work with chunks of words of increasing lengths; keeping things separate in that way makes the code pretty easy to follow and also makes db searches pretty fast.)
Here's a recursive solution that finds all paths up to length L
.
Using the graph created by this Gist:
getPaths <- function(v, g, L = 4) {
paths <- list()
recurse <- function(g, v, path = NULL) {
path <- c(v, path)
if (length(path) >= L) {
return(NULL)
} else {
for (i in neighbors(g, v)) {
if (!(i %in% path)) {
paths[[length(paths) + 1]] <<- c(i, path)
recurse(g, i, path)
}
}
}
}
recurse(g, v)
return(paths)
}
allPaths <- lapply(V(g), getPaths, g)
# look at the first few paths from vertex 1:
> head(allPaths[[1]])
[[1]]
[1] 2 1
[[2]]
[1] 3 2 1
[[3]]
[1] 4 3 2 1
[[4]]
[1] 6 3 2 1
[[5]]
[1] 7 3 2 1
[[6]]
[1] 8 3 2 1
Edit
Here's a more efficient solution that only keeps the L-length paths.
getPaths <- function(v, g, L = 4) {
paths <- list()
recurse <- function(g, v, path = NULL) {
path <- c(v, path)
if (length(path) >= L) {
paths[[length(paths) + 1]] <<- rev(path)
} else {
for (i in neighbors(g, v)) {
if (!(i %in% path)) recurse(g, i, path)
}
}
}
recurse(g, v)
return(paths)
}
allPaths <- lapply(V(g), getPaths, g, 4)
L4way <- do.call(rbind, lapply(allPaths, function(x) do.call(rbind, x)))
> head(L4way)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 1 2 3 6
[3,] 1 2 3 7
[4,] 1 2 3 8
[5,] 1 2 5 6
[6,] 1 2 5 9
library(doSNOW)
library(foreach)
# this is a very parallel problem and can be parallel-ized easily
cl <- makeCluster(4)
registerDoSNOW(cl)
allPaths <- foreach(i = 3:16) %:%
foreach(v = V(g), .packages = c('igraph')) %dopar% getPaths(v, g, i)
stopCluster(cl)
path.list <- list()
for (i in seq_along(3:16)) {
path.list[[i]] <- do.call(rbind, lapply(allPaths[[i]],
function(x) do.call(rbind, x)))
}
Number of permutations for L-length words:
> data.frame(length=3:16, nPerms=sapply(path.list, nrow))
length nPerms
1 3 408
2 4 1764
3 5 6712
4 6 22672
5 7 68272
6 8 183472
7 9 436984
8 10 905776
9 11 1594648
10 12 2310264
11 13 2644520
12 14 2250192
13 15 1260672
14 16 343184
Total permutations
> sum(sapply(path.list, nrow))
[1] 12029540