Search code examples
pythonnumpyvectorizationcombinationscomputational-geometry

find all possible combinations of elements of an array. NOT PRODUCT


Ive looked into this question and people keep advising to use np.meshgrid() to find all possible combinations of an array. but the problem is np.meshgrid() does not produce combinations it produces products (similar to itertools.product())

in a combination, elements are non-repeating and unordered

arr = np.arange(5)
r = 3

These are what combinations look like

np.array(
    list(itertools.combinations(arr, r))
)

>>>   [[0, 1, 2],
       [0, 1, 3],
       [0, 1, 4],
       [0, 2, 3],
       [0, 2, 4],
       [0, 3, 4],
       [1, 2, 3],
       [1, 2, 4],
       [1, 3, 4],
       [2, 3, 4]]

following are not combinations

np.array(
    list(itertools.product(arr, arr, arr))
)

>>>   [[0, 0, 0],
       [0, 0, 1],
       [0, 0, 2],
       [0, 0, 3],
       [0, 0, 4],
       [0, 1, 0],
       [0, 1, 1],
       [0, 1, 2],
       ....,
       [4, 3, 2],
       [4, 3, 3],
       [4, 3, 4],
       [4, 4, 0],
       [4, 4, 1],
       [4, 4, 2],
       [4, 4, 3],
       [4, 4, 4]])
np.array(
    np.meshgrid(arr, arr, arr)
).transpose([2, 1, 3, 0]).reshape(-1, r)

>>>   [[0, 0, 0],
       [0, 0, 1],
       [0, 0, 2],
       [0, 0, 3],
       [0, 0, 4],
       [0, 1, 0],
       [0, 1, 1],
       [0, 1, 2],
       ....,
       [4, 3, 2],
       [4, 3, 3],
       [4, 3, 4],
       [4, 4, 0],
       [4, 4, 1],
       [4, 4, 2],
       [4, 4, 3],
       [4, 4, 4]])

for r = 2 i've found a neat way to find combinations

np.array(
    np.triu_indices(len(arr), 1)
).T

>>>   [[0, 1],
       [0, 2],
       [0, 3],
       [0, 4],
       [1, 2],
       [1, 3],
       [1, 4],
       [2, 3],
       [2, 4],
       [3, 4]]

but im having a hard time finding any vectorized methods for r > 2

NOTE: even if my array is not [0, 1, 2, 3, 4] i could use the above answers as indices.

if it helps to imagine,

for r = 2 the required answer is indices of top-right triangle of a square matrix of size len(arr), ignoring the diagonal.

for r = 3 the required answer is indices of top-right-upper tetrahedron (middle one in the image) of a 3d array of size (you guessed it) len(arr), ignoring the 3d equivalent of diagonal.

enter image description here


Solution

  • This is similar to your idea for 3-D:

    n = 5
    r = 3
    a = np.argwhere(np.triu(np.ones((n,)*r),1))
    a[a[:,0]<a[:,1]]
    

    output:

    [[0 1 2]
     [0 1 3]
     [0 1 4]
     [0 2 3]
     [0 2 4]
     [0 3 4]
     [1 2 3]
     [1 2 4]
     [1 3 4]
     [2 3 4]]
    

    for 4-D (and so on) you can expand like this (Not sure of the performance):

    n = 5
    r = 4
    a = np.argwhere(np.triu(np.ones((n,)*r),1))
    a[(a[:,0]<a[:,1]) & (a[:,1]<a[:,2])]
    

    output:

    [[0 1 2 3]
     [0 1 2 4]
     [0 1 3 4]
     [0 2 3 4]
     [1 2 3 4]]
    

    Itertools seems to be faster if that is what you are aiming at:

    def m1(n):
      r = 3
      a = np.argwhere(np.triu(np.ones((n,)*r),1))
      return a[a[:,0]<a[:,1]]
    
    def m2(n):
      r = 3
      return combinations(np.arange(n),r)
    
    in_ = [5,10,100,200]
    

    enter image description here