Search code examples
algorithmlanguage-agnosticlazy-evaluationshuffle

Lazy Shuffle Algorithms


I have a large list of elements that I want to iterate in random order. However, I cannot modify the list and I don't want to create a copy of it either, because 1) it is large and 2) it can be expected that the iteration is cancelled early.

List<T> data = ...;
Iterator<T> shuffled = shuffle(data);
while (shuffled.hasNext()) {
    T t = shuffled.next();
    if (System.console().readLine("Do you want %s?", t).startsWith("y")) {
        return t;
    }
}
System.out.println("That's all");
return t;

I am looking for an algorithm were the code above would run in O(n) (and preferably require only O(log n)space), so caching the elements that were produced earlier is not an option. I don't care if the algorithm is biased (as long as it's not obvious).

(I uses pseudo-Java in my question, but you can use other languages if you wish)

Here is the best I got so far.

Iterator<T> shuffle(final List<T> data) {
    int p = data.size();
    while ((data.size() % p) == 0) p = randomPrime();
    return new Iterator<T>() {
        final int prime = p;
        int n = 0, i = 0;
        public boolean hasNext() { return i < data.size(); }
        public T next() {
            i++; n += prime;
            return data.get(n);
        }
    }
}

Iterating all elements in O(n), constant space, but obviously biased as it can produce only data.size() permutations.


Solution

  • The easiest shuffling approaches I know of work with indices. If the List is not an ArrayList, you may end up with a very inefficient algorithm if you try to use one of the below (a LinkedList does have a get by ID, but it's O(n), so you'll end up with O(n^2) time).

    If O(n) space is fine, which I'm assuming it's not, I'd recommend the Fisher-Yates / Knuth shuffle, it's O(n) time and is easy to implement. You can optimise it so you only need to perform a single operation before being able to get the first element, but you'll need to keep track of the rest of the modified list as you go.

    My solution:

    Ok, so this is not very random at all, but I can't see a better way if you want less than O(n) space.

    It takes O(1) space and O(n) time.

    There may be a way to push it up the space usage a little and get more random results, but I haven't figured that out yet.

    It has to do with relative primes. The idea is that, given 2 relative primes a (the generator) and b, when you loop through a % b, 2a % b, 3a % b, 4a % b, ..., you will see every integer 0, 1, 2, ..., b-2, b-1, and this will also happen before seeing any integer twice. Unfortunately I don't have a link to a proof (the wikipedia link may mention or imply it, I didn't check in too much detail).

    I start off by increasing the length until we get a prime, since this implies that any other number will be a relative prime, which is a whole lot less limiting (and just skip any number greater than the original length), then generate a random number, and use this as the generator.

    I'm iterating through and printing out all the values, but it should be easy enough to modify to generate the next one given the current one.

    Note I'm skipping 1 and len-1 with my nextInt, since these will produce 1,2,3,... and ...,3,2,1 respectively, but you can include these, but probably not if the length is below a certain threshold.

    You may also want to generate a random number to multiply the generator by (mod the length) to start from.

    Java code:

    static Random gen = new Random();
    
    static void printShuffle(int len)
    {
      // get first prime >= len
      int newLen = len-1;
      boolean prime;
      do
      {
         newLen++;
         // prime check
         prime = true;
         for (int i = 2; prime && i < len; i++)
            prime &= (newLen % i != 0);
      }
      while (!prime);
    
      long val = gen.nextInt(len-3) + 2;
      long oldVal = val;
      do
      {
         if (val < len)
            System.out.println(val);
         val = (val + oldVal) % newLen;
      }
      while (oldVal != val);
    }