Search code examples
algorithmcomputer-scienceprobabilityshufflepseudocode

Correctness of Fisher-Yates Shuffle Executed Backward


According to wikipedia and also the implementation of Java standard library, shuffling https://en.wikipedia.org/wiki/Fisher–Yates_shuffle (Fisher Yates Shuffling) works like this:

Algo A:

-- To shuffle an array a of n elements (indices 0..n-1):
for i from n−1 downto 1 do
     j ← random integer such that 0 ≤ j ≤ i
     exchange a[j] and a[i]

or equivalently

Algo B:

-- To shuffle an array a of n elements (indices 0..n-1):
for i from 0 to n−2 do
     j ← random integer such that i ≤ j < n
     exchange a[i] and a[j]

My question is for the one below (Algo C):

Algo C:

-- To shuffle an array a of n elements (indices 0..n-1):
for i from 1 to n−1 do
     j ← random integer such that 0 ≤ j ≤ i
     exchange a[i] and a[j]

Algo A and Algo B are just identical. But Algo C is different from Algo A and Algo B (in fact Algo C is Algo A executed in reverse direction)

Is Algo C correct? I am very confused. I did some chi-squared test using contingency table, and it appears that this gives apparently correct uniform order.

My question is whether Algo C is correct? If correct, why is it seen almost no where? Why F-Y shuffle is everywhere presented in one same direction.


Solution

  • Focusing on why it isn't seen more (as the other answer already shows that it is correct).

    The variants are useful as optimizations in various cases:

    • Algo B is useful if you only want a few of the random elements. Think shuffling a deck and handing out a few hands, and then collecting all the cards to reshuffle. With Algo B you don't have to shuffle the hands that weren't dealt.
    • Algo A is useful because you can skip the initialization, https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_%22inside-out%22_algorithm
    • Algo C would be useful if the collection isn't known in advanced, but this seems pretty esoteric. (So you randomly shuffle 5 cards, and then get a new one, take one more step, and then you have 6 shuffled cards etc.)

    Those additional uses mean that Algo B and A will be coded more, and used even in other cases.