python numpy python-itertools

Union list of lists without duplicates


I have a list of lists. I need all combinations of those lists, from 2 of N up to N of N, which I generate with itertools.combinations. This gives me tuples of lists, and I need to merge the lists in each tuple into a single list without duplicates.

For example, I have this array:

import numpy as np

a = np.array([[1,4,7],[8,2,5],[8,1,4,6],[8,1,3,5],
              [2,3,4,7],[2,5,6,7],[2,3,4,6,8],[1,3,5,6,7]], dtype=object)  # rows differ in length

I generate all 3-element combinations:

a2 = list(itertools.combinations(a, 3))

a2[:5]
[([1, 4, 7], [8, 2, 5], [8, 1, 4, 6]),
 ([1, 4, 7], [8, 2, 5], [8, 1, 3, 5]),
 ([1, 4, 7], [8, 2, 5], [2, 3, 4, 7]),
 ([1, 4, 7], [8, 2, 5], [2, 5, 6, 7]),
 ([1, 4, 7], [8, 2, 5], [2, 3, 4, 6, 8])]

This list has 56 elements. I need to merge the lists in each element into one list without duplicates. For example, for a2[0] the input is:

([1, 4, 7], [8, 2, 5], [8, 1, 4, 6])

output:

[1, 2, 4, 5, 6, 7, 8]

And so on for all 56 elements. I tried to do it with a set:

arr = list(itertools.combinations(a,3))
for i in arr:
    arrnew[i].append(list(set().union(arr[i][:3])))

But I got an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-75-4049ddb4c0be> in <module>()
      3 arrnew = []
      4 for i in arr:
----> 5     for j in arr[i]:
      6         arrnew[i].append(list(set().union(arr[:n])))

TypeError: list indices must be integers or slices, not tuple

I need a function that does this for combinations of any size N and returns the new combined list, but this error is blocking me.

Is there a way to fix this error, or another way to solve the task?


Solution

  • A small function which solves it:

    import itertools

    def unique_comb(a):
        return list(set(itertools.chain(*a)))  # flatten one level, then drop duplicates
    

    For example:

    unique_comb(([1, 4, 7], [8, 2, 5], [8, 1, 4, 6]))
    

    If you want to pass a single flat list to the function, rather than a tuple of lists, just remove the * (which unpacks the argument so that each inner list is passed to chain separately).
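To illustrate what the * does, here is a small sketch comparing the unpacked call with itertools.chain.from_iterable, which flattens one level without unpacking. The variable names are illustrative only, and sorted is added here as an assumption about the desired output, because a set does not guarantee any particular order.

```python
import itertools

combo = ([1, 4, 7], [8, 2, 5], [8, 1, 4, 6])  # a tuple of lists, like one element of a2
flat = [1, 4, 7, 8, 2, 5]                     # a single flat list

# chain(*combo) is chain([1, 4, 7], [8, 2, 5], [8, 1, 4, 6]): it flattens one level.
from_tuple = sorted(set(itertools.chain(*combo)))

# chain.from_iterable(combo) does the same without unpacking the argument.
from_iterable = sorted(set(itertools.chain.from_iterable(combo)))

# A flat list has no inner lists to unpack, so it can go to set() directly.
from_flat = sorted(set(flat))

print(from_tuple)     # [1, 2, 4, 5, 6, 7, 8]
print(from_iterable)  # [1, 2, 4, 5, 6, 7, 8]
print(from_flat)      # [1, 2, 4, 5, 7, 8]
```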

    If you want to apply it to the entire array in one statement without defining a function:

    a3 = [list(set(itertools.chain(*row))) for row in a2]
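As for the TypeError in the question: `for i in arr` yields the tuples themselves rather than integer positions, so `arr[i]` tries to index a list with a tuple. Iterating over the combinations directly avoids indexing altogether. The sketch below is one possible way to cover every size from 2 to N; the function names combined_unions and all_combined_unions are hypothetical, a plain list of lists is used instead of a NumPy array, and the results are sorted on the assumption that ordered output is wanted.

```python
import itertools

def combined_unions(lists, size):
    """Return the sorted union of every `size`-element combination of `lists`."""
    return [sorted(set().union(*combo))
            for combo in itertools.combinations(lists, size)]

def all_combined_unions(lists):
    """Map each size from 2 to len(lists) to its list of unions."""
    return {size: combined_unions(lists, size)
            for size in range(2, len(lists) + 1)}

a = [[1, 4, 7], [8, 2, 5], [8, 1, 4, 6], [8, 1, 3, 5],
     [2, 3, 4, 7], [2, 5, 6, 7], [2, 3, 4, 6, 8], [1, 3, 5, 6, 7]]

a3 = combined_unions(a, 3)
print(len(a3))  # 56
print(a3[0])    # [1, 2, 4, 5, 6, 7, 8]
```

Calling all_combined_unions(a) returns a dict keyed by combination size, so the 3-element results are available under key 3.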