Search code examples
pythonnumpypermutation

Speed up Multiset Permutations


I'm looking to speed up my code that takes ~80 milliseconds for 300 sets to generate multiset_permutations from sympy. Ideally this would take only a few milliseconds; also the more items, the slower it gets.

What can I do to make my code faster? Multi-threading? Or convert to C? Any help here on speeding this up would be greatly appreciated.

import numpy as np
from time import monotonic
from sympy.utilities.iterables import multiset_permutations

milli_time = lambda: int(round(monotonic() * 1000))

start_time = milli_time()

num_indices = 5
num_items = 300
indices = np.array([list(multiset_permutations(list(range(num_indices)))) for _ in range(num_items)])

print(indices)    

[[[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]

 [[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]

 [[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]

 ...

 [[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]

 [[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]

 [[0 1 2 3 4]
  [0 1 2 4 3]
  [0 1 3 2 4]
  ...
  [4 3 1 2 0]
  [4 3 2 0 1]
  [4 3 2 1 0]]]

print('Multiset Perms:', milli_time() - start_time, 'milliseconds')

Multiset Perms: 88 milliseconds

** Code Update to Reduce extra computations by 2/3 **

import itertools
import numpy as np
from time import time, monotonic
from sympy.utilities.iterables import multiset_permutations

milli_time = lambda: int(round(monotonic() * 1000))

start_time = milli_time()

num_colors = 5
color_range = list(range(num_colors))
total_media = 300

def all_perms(elements):
    if len(elements) <= 1:
        yield elements  # Only permutation possible = no permutation
    else:
        # Iteration over the first element in the result permutation:
        for (index, first_elmt) in enumerate(elements):
            other_elmts = elements[:index]+elements[index+1:]
            for permutation in all_perms(other_elmts): 
                yield [first_elmt] + permutation

multiset = list(multiset_permutations(color_range))
# multiset = list(itertools.permutations(color_range))
# multiset = list(all_perms(color_range))

_range = range(total_media)
perm_indices = np.array([multiset for _ in _range])

print('Multiset Perms:', milli_time() - start_time)

Multiset Perms: 34 milliseconds

Solution

  • First of all, you do not need to recompute the permutations.

    Moreover, np.array([multiset for _ in _range]) is expensive because Numpy have to transform multiset total_media times. You can solve that using np.array([multiset]).repeat(total_media, axis=0).

    Finally, sympy is not the fastest implementation to perform such a computation. A faster implementation consists in using itertools instead:

    num_colors = 5
    total_media = 300
    color_range = list(range(num_colors))
    multiset = list(set(itertools.permutations(color_range)))
    perm_indices = np.array([multiset], dtype=np.int32).repeat(total_media, axis=0)
    

    However, this itertools-based implementation do not preserve the order of the permutations. If this is important, you can use np.sort on the Numpy array converted from multiset (with a specific axis and before applying repeat).

    On my machine, this takes about 0.15 ms.