
R: how to sample without replacement AND without consecutive same values


I have spent over a day trying to accomplish what seems to be a very simple thing. I have to create 300 'random' sequences in which the numbers 1,2,3 and 4 all appear exactly 12 times, but the same number is never used twice 'in a row'/consecutively.

My best attempts (I guess) were:

  1. have R sample 48 items without replacement, test whether there are consecutive values with rle, then use only the sequences that do not contain consecutive values. Problem: there are almost no random sequences that meet this criterion, so it takes forever.

  2. have R create sequences without consecutive values (see code).

pop <- rep(1:4, 12)
y <- c()
while (length(y) != 48) {
  # top up to 48 items, then drop every value that equals its predecessor
  y <- c(y, sample(pop, 48 - length(y), replace = FALSE))
  y <- y[!c(FALSE, diff(y) == 0)]
}

Problem: this creates sequences with varying numbers of each value. I then tried to use only those sequences with exactly 12 of each value, but that only brought me back to problem 1: takes forever.

There must be some easy way to do this, right? Any help is greatly appreciated!


Solution

  • Maybe using replicate() with a repeat loop is faster. Here is an example with 3 sequences. At roughly 5 seconds per sequence, 300 sequences would take approx. 1490 seconds, i.e. about 25 minutes (not tested).

    set.seed(42)
    seqc <- rep(1:4, each=12)  # starting sequence
    
    system.time(
      res <- replicate(3, {
        repeat {
          seqcs <- sample(seqc, 48, replace=FALSE) 
          if (!any(diff(seqcs) == 0)) break
        }
        seqcs
      })
    )
    #  user  system elapsed 
    # 14.88    0.00   14.90 
    
    res[1:10, ]
    #       [,1] [,2] [,3]
    #  [1,]    4    2    3
    #  [2,]    1    1    4
    #  [3,]    3    2    1
    #  [4,]    1    1    4
    #  [5,]    2    3    1
    #  [6,]    4    1    2
    #  [7,]    3    4    4
    #  [8,]    2    1    1
    #  [9,]    3    4    4
    # [10,]    4    3    2
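
  • If the rejection loop above is still too slow, one possible alternative is to build each sequence one position at a time, never offering the previous value as a candidate, and to restart only when the sole value left is the forbidden one. The sketch below is not from the original answer: make_seq() and its arguments are hypothetical names, and weighting each draw by the remaining counts is an assumption of mine. Note that this is not guaranteed to draw uniformly from all valid sequences, so it may not be suitable if exact uniformity matters.

```r
# Hypothetical helper (not part of the original answer): sequential
# construction with restarts. Every returned sequence contains each value
# exactly `times` times and has no two equal neighbours.
make_seq <- function(values = 1:4, times = 12) {
  k <- length(values)
  n <- k * times
  repeat {
    left <- rep(times, k)   # remaining count of each value
    idx  <- integer(n)      # chosen positions into `values`
    prev <- 0L              # 0 means "no previous value yet"
    dead_end <- FALSE
    for (i in seq_len(n)) {
      w <- left
      if (prev > 0L) w[prev] <- 0   # forbid repeating the previous value
      if (sum(w) == 0) {            # only the forbidden value is left
        dead_end <- TRUE
        break                       # leaves the for loop, then restarts
      }
      # assumption: draw proportionally to the remaining counts
      pick <- sample.int(k, 1L, prob = w)
      idx[i] <- pick
      left[pick] <- left[pick] - 1
      prev <- pick
    }
    if (!dead_end) return(values[idx])
  }
}

set.seed(42)
res2 <- replicate(300, make_seq())   # 48 x 300 matrix, one sequence per column

# sanity checks: 12 of each value, no consecutive duplicates
stopifnot(all(apply(res2, 2, function(s) all(tabulate(s, 4) == 12))))
stopifnot(all(apply(res2, 2, function(s) !any(diff(s) == 0))))
```

    As with res above, each column of res2 is one sequence. Restarts are cheap because a failed attempt costs at most 48 draws, whereas the rejection loop has to wait for a complete shuffle that happens to have no repeats.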