algorithm math random permutation inverse

How to generate a pseudo-random involution?

For generating a pseudo-random permutation, the Knuth shuffles can be used. An involution is a self-inverse permutation and I guess, I could adapt the shuffles by forbidding touching an element multiple times. However, I'm not sure whether I could do it efficiently and whether it generates every involution equiprobably.

I'm afraid, an example is needed: On a set {0,1,2}, there are 6 permutation, out of which 4 are involutions. I'm looking for an algorithm generating one of them at random with the same probability.

A correct but very inefficient algorithm would be: Use Knuth shuffle, retry if it's no involution.

Solution

Let's here use a(n) as the number of involutions on a set of size n (as OEIS does). For a given set of size n and a given element in that set, the total number of involutions on that set is a(n). That element must either be unchanged by the involution or be swapped with another element. The number of involutions that leave our element fixed is a(n-1), since those are involutions on the other elements. Therefore a uniform distribution on the involutions must have a probability of a(n-1)/a(n) of keeping that element fixed. If it is to be fixed, just leave that element alone. Otherwise, choose another element that has not yet been examined by our algorithm to swap with our element. We have just decided what happens with one or two elements in the set: keep going and decide what happens with one or two elements at a time.

To do this, we need a list of the counts of involutions for each i <= n, but that is easily done with the recursion formula

a(i) = a(i-1) + (i-1) * a(i-2)

(Note that this formula from OEIS also comes from my algorithm: the first term counts the involutions keeping the first element where it is, and the second term is for the elements that are swapped with it.) If you are working with involutions, this is probably important enough to break out into another function, precompute some smaller values, and cache the function's results for greater speed, as in this code:

# Counts of involutions (self-inverse permutations) for each size
_invo_cnts = [1, 1, 2, 4, 10, 26, 76, 232, 764, 2620, 9496, 35696, 140152]

def invo_count(n):
    """Return the number of involutions of size n and cache the result."""
    for i in range(len(_invo_cnts), n+1):
        _invo_cnts.append(_invo_cnts[i-1] + (i-1) * _invo_cnts[i-2])
    return _invo_cnts[n]

We also need a way to keep track of the elements that have not yet been decided, so we can efficiently choose one of those elements with uniform probability and/or mark an element as decided. We can keep them in a shrinking list, with a marker to the current end of the list. When we decide an element, we move the current element at the end of the list to replace the decided element then reduce the list. With that efficiency, the complexity of this algorithm is O(n), with one random number calculation for each element except perhaps the last. No better order complexity is possible.

Here is code in Python 3.5.2. The code is somewhat complicated by the indirection involved through the list of undecided elements.

from random import randrange

def randinvolution(n):
    """Return a random (uniform) involution of size n."""

    # Set up main variables:
    # -- the result so far as a list
    involution = list(range(n))
    # -- the list of indices of unseen (not yet decided) elements.
    #    unseen[0:cntunseen] are unseen/undecided elements, in any order.
    unseen = list(range(n))
    cntunseen = n

    # Make an involution, progressing one or two elements at a time
    while cntunseen > 1:  # if only one element remains, it must be fixed
        # Decide whether current element (index cntunseen-1) is fixed
        if randrange(invo_count(cntunseen)) < invo_count(cntunseen - 1):
            # Leave the current element as fixed and mark it as seen
            cntunseen -= 1
        else:
            # In involution, swap current element with another not yet seen
            idxother = randrange(cntunseen - 1)
            other = unseen[idxother]
            current = unseen[cntunseen - 1]
            involution[current], involution[other] = (
                involution[other], involution[current])
            # Mark both elements as seen by removing from start of unseen[]
            unseen[idxother] = unseen[cntunseen - 2]
            cntunseen -= 2

    return involution

I did several tests. Here is the code I used to check for validity and uniform distribution:

def isinvolution(p):
    """Flag if a permutation is an involution."""
    return all(p[p[i]] == i for i in range(len(p)))

# test the validity and uniformness of randinvolution()
n = 4
cnt = 10 ** 6
distr = {}
for j in range(cnt):
    inv = tuple(randinvolution(n))
    assert isinvolution(inv)
    distr[inv] = distr.get(inv, 0) + 1
print('In {} attempts, there were {} random involutions produced,'
    ' with the distribution...'.format(cnt, len(distr)))
for x in sorted(distr):
    print(x, str(distr[x]).rjust(2 + len(str(cnt))))

And the results were

In 1000000 attempts, there were 10 random involutions produced, with the distribution...
(0, 1, 2, 3)     99874
(0, 1, 3, 2)    100239
(0, 2, 1, 3)    100118
(0, 3, 2, 1)     99192
(1, 0, 2, 3)     99919
(1, 0, 3, 2)    100304
(2, 1, 0, 3)    100098
(2, 3, 0, 1)    100211
(3, 1, 2, 0)    100091
(3, 2, 1, 0)     99954

That looks pretty uniform to me, as do other results I checked.