r random sample

How to generate random words from a list of characters?


I would like to generate a list of 50 words from the letters "a b c d" using base R. Every word should have 5 characters, and every word has to differ from all the others in at least 2 positions (letters).

x="a,b,c,d"

and the result should be a list of 50 words:

l=['abcda','cdabd','bdaca'.......]

The expand.grid function in R generates words that differ by only 1 letter.


Solution

  • If you want to sample from a complete set of valid words, you can first construct that set (see X in the code below) and then reuse it as many times as you like by simply running sample(X, 50). Any subset of X still satisfies the constraint, because every pair of words in X already differs in at least 2 positions.

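    # All 4^5 = 1024 five-letter words over a, b, c, d
    v <- do.call(paste0, expand.grid(rep(list(letters[1:4]), 5)))
    # Greedily keep a word only if it is at distance >= 2 from every word kept so far.
    # adist() is the edit distance; for words of equal length, an edit distance >= 2
    # is equivalent to differing in at least 2 positions.
    X <- Reduce(
      function(S, k) {
        if (all(adist(k, S) >= 2)) {
          S <- c(S, k)
        }
        S
      }, v
    )
    # Draw 50 words without replacement from the valid set
    res <- sample(X, 50)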
    

    Another option is to use a while loop, updating the set of selected words in each iteration

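    # Start with one random 5-letter word
    res <- paste0(sample(letters[1:4], 5, replace = TRUE), collapse = "")
    while (length(res) < 50) {
      # Draw a candidate and keep it only if it is at distance >= 2 from all selected words
      k <- paste0(sample(letters[1:4], 5, replace = TRUE), collapse = "")
      if (all(adist(k, res) >= 2)) {
        res <- c(res, k)
      }
    }
    res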
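
To double-check a result against the original requirement, a small base R check can count differing positions directly instead of relying on adist. This is only a sketch: `hamming` and `check_words` are hypothetical helper names that are not part of the answer above, and the seed is arbitrary. It reruns the while-loop approach and verifies the output.

```r
# Hypothetical helper (not from the answer above): number of positions
# at which two equal-length strings differ.
hamming <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical checker: TRUE if `words` has n entries of length `len`
# and every pair differs in at least `min_diff` positions
# (which also rules out duplicates).
check_words <- function(words, n = 50, len = 5, min_diff = 2) {
  if (length(words) != n || any(nchar(words) != len)) return(FALSE)
  pairs <- combn(words, 2)  # 2-row matrix, one column per pair of words
  d <- mapply(hamming, pairs[1, ], pairs[2, ])
  all(d >= min_diff)
}

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
res <- paste0(sample(letters[1:4], 5, replace = TRUE), collapse = "")
while (length(res) < 50) {
  k <- paste0(sample(letters[1:4], 5, replace = TRUE), collapse = "")
  if (all(adist(k, res) >= 2)) res <- c(res, k)
}
check_words(res)
```

The last line should return TRUE for either approach, since both only accept words that are at distance 2 or more from everything already selected.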