Search code examples
algorithmpermutation

Algorithm to generate permutations by order of fewest positional changes


I'm looking for an algorithm to generate or iterate through all permutations of a list of objects such that:

  1. They are generated by fewest to least positional changes from the original. So first all the permutations with a single pair of elements swapped, then all the permutations with only two pairs of elements swapped, etc.
  2. The list generated is complete, so for n objects in a list there should be n! total, unique permutations.
  3. Ideally (but not necessarily) there should be a way of specifying (and generating) a particular permutation without having to generate the full list first and then reference the index.
  4. The speed of the algorithm is not particularly important.

I've looked through all the permutation algorithms that I can find, and none so far have met criteria 1 and 2, let alone 3.

I have an idea how I could write this algorithm myself using recursion, and filtering for duplicates to only get unique permutations. However, if there is any existing algorithm I'd much rather use something proven.


Solution

  • This code answers your requirement #3, which is to compute permutation at index N directly.

    This code relies on the following principle:

    The first permutation is the identity; then the next (n choose 2) permutations just swap two elements; then the next (n choose 3)(subfactorial(3)) permutations derange 3 elements; then the next (n choose 4)(subfactorial(4)) permutations derange 4 elements; etc. To find the Nth permutation, first figure out how many elements it deranges by finding the largest K such that sum[k = 0 ^ K] (n choose k) subfactorial(k) ⩽ N.

    This number K is found by function number_of_derangements_for_permutation_at_index in the code.

    Then, the relevant subset of indices which must be deranged is computed efficiently using more_itertools.nth_combination.

    However, I didn't have a function nth_derangement to find the relevant derangement of the deranged subset of indices. Hence the last step of the algorithm, which computes this derangement, could be optimised if there exists an efficient function to find the nth derangement of a sequence efficiently.

    As a result, this last step takes time proportional to idx_r, where idx_r is the index of the derangement, a number between 0 and factorial(k), where k is the number of elements which are deranged by the returned permutation.

    from sympy import subfactorial
    from math import comb
    from itertools import count, accumulate, pairwise, permutations
    from more_itertools import nth_combination, nth
    
    def number_of_derangements_for_permutation_at_index(n, idx):
        #n = len(seq)
        for k, (low_acc, high_acc) in enumerate(pairwise(accumulate((comb(n,k) * subfactorial(k) for k in count(2)), initial=1)), start=2):
            if low_acc <= idx < high_acc:
                return k, low_acc
    
    def is_derangement(seq, perm):
        return all(i != j for i,j in zip(seq, perm))
    
    def lift_permutation(seq, deranged, permutation):
        result = list(seq)
        for i,j in zip(deranged, permutation):
            result[i] = seq[j]
        return result
    
    # THIS FUNCTION NOT EFFICIENT
    def nth_derangement(seq, idx):
        return nth((p for p in permutations(seq) if is_derangement(seq, p)),
                          idx)
    
    def nth_permutation(seq, idx):
        if idx == 0:
            return list(seq)
        n = len(seq)
        k, acc = number_of_derangements_for_permutation_at_index(n, idx)
        idx_q, idx_r = divmod(idx - acc, subfactorial(k))
        deranged = nth_combination(range(n), k, idx_q)
        derangement = nth_derangement(deranged, idx_r)    # TODO: FIND EFFICIENT VERSION
        return lift_permutation(seq, deranged, derangement)
    

    Testing for correctness on small data:

    print( [''.join(nth_permutation('abcd', i)) for i in range(24)] )
    
    # ['abcd',
    #  'bacd', 'cbad', 'dbca', 'acbd', 'adcb', 'abdc',
    #  'bcad', 'cabd', 'bdca', 'dacb', 'cbda', 'dbac', 'acdb', 'adbc',
    #  'badc', 'bcda', 'bdac', 'cadb', 'cdab', 'cdba', 'dabc', 'dcab', 'dcba']
    

    Testing for speed on medium data:

    from math import factorial
    
    seq = 'abcdefghij'
    n = len(seq)                 # 10
    N = factorial(n) // 2        # 1814400
    
    perm = ''.join(nth_permutation(seq, N))
    
    print(perm)
    # fcjdibaehg