python list group-by grouping python-itertools

Group tuples inside a list by matching positions of two of its sub-elements

I have a list of tuples as shown below. Each tuple is itself a nested tuple containing 3 sub-elements (also tuples).

[(('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('e', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('a', 'apple'), ('d', 'mango'), ('c', 'peach')),
 (('e', 'apple'), ('d', 'mango'), ('f', 'grapes')),
 (('f', 'grapes'), ('e', 'apple'), ('d', 'mango')),
 (('f', 'peach'), ('e', 'apple'), ('e', 'mango')),
 (('f', 'grapes'), ('c', 'apple'), ('d', 'mango')), 
 (('e', 'apple'), ('f', 'grapes'), ('d', 'mango')),
 (('a', 'apple'), ('c', 'grapes'), ('b', 'mango')),
 ]

I want to group these tuples by the positions of two of their elements, namely apple and mango (which are fixed and known beforehand), inside each tuple.

Desired output:

[
# apple and mango at positions 1 and 2.
[(('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('e', 'apple'), ('b', 'mango'), ('c', 'grapes')),
 (('a', 'apple'), ('d', 'mango'), ('c', 'peach')),
 (('e', 'apple'), ('d', 'mango'), ('f', 'grapes'))],

# apple and mango at positions 2 and 3.
 [(('f', 'grapes'), ('e', 'apple'), ('d', 'mango')),
 (('f', 'peach'), ('e', 'apple'), ('e', 'mango')),
 (('f', 'grapes'), ('c', 'apple'), ('d', 'mango'))], 

# apple and mango at positions 1 and 3.
 [(('e', 'apple'), ('f', 'grapes'), ('d', 'mango')),
 (('a', 'apple'), ('c', 'grapes'), ('b', 'mango'))]
 ]

I tried using Counter and also checked some other examples, but couldn't get close to the desired output. Any help or pointers would be really appreciated.

Solution

My go-to solution for grouping tasks like this is collections.defaultdict. I've written a lengthy answer about grouping things, which you can read here. Picking out the relevant snippets from that answer gives us this piece of code:

import collections

groupdict = collections.defaultdict(list)
for value in your_list_of_tuples:  # input
    group = ???  # group identifier
    groupdict[group].append(value)

result = list(groupdict.values())  # output

All that's left is to find a way to uniquely represent each group with a hashable value, since the group identifier is used as a dictionary key (that is, we need to fill in the group = ??? line).

The easiest solution is probably to extract the apple and mango values from the nested tuples and replace all other values with None:

>>> tup = (('a', 'apple'), ('c', 'grapes'), ('b', 'mango'))
>>> tuple((t[1] if t[1] in {'apple','mango'} else None) for t in tup)
('apple', None, 'mango')
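
As a variation on the same idea, the key could record the positions of the two fruits instead of a None-padded tuple, which lines up with the "positions 1 and 2" labels in the question. This is only a sketch: position_key and its wanted parameter are names made up here for illustration, positions are 0-based, and it assumes each wanted fruit occurs exactly once in every tuple (a missing fruit would raise ValueError).

```python
def position_key(tup, wanted=('apple', 'mango')):
    """Return the 0-based index of each wanted fruit in tup as a hashable tuple.

    Hypothetical helper: assumes every name in `wanted` appears exactly once
    as the second item of one of the pairs in `tup`.
    """
    fruits = [pair[1] for pair in tup]  # e.g. ['apple', 'grapes', 'mango']
    return tuple(fruits.index(name) for name in wanted)


print(position_key((('a', 'apple'), ('c', 'grapes'), ('b', 'mango'))))  # (0, 2)
```

Either key works with the defaultdict pattern, because both are plain tuples and therefore hashable.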

Add that in and we're done:

import collections

groupdict = collections.defaultdict(list)
for value in your_list_of_tuples:
    group = tuple((t[1] if t[1] in {'apple','mango'} else None) for t in value)
    groupdict[group].append(value)

result = list(groupdict.values())

# result:
# [[(('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
#   (('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
#   (('e', 'apple'), ('b', 'mango'), ('c', 'grapes')),
#   (('a', 'apple'), ('d', 'mango'), ('c', 'peach')),
#   (('e', 'apple'), ('d', 'mango'), ('f', 'grapes'))],
#  [(('f', 'grapes'), ('e', 'apple'), ('d', 'mango')),
#   (('f', 'peach'), ('e', 'apple'), ('e', 'mango')),
#   (('f', 'grapes'), ('c', 'apple'), ('d', 'mango'))],
#  [(('e', 'apple'), ('f', 'grapes'), ('d', 'mango')),
#   (('a', 'apple'), ('c', 'grapes'), ('b', 'mango'))]]
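
For anyone who wants to run this end to end, here is a self-contained sketch that wraps the same logic in a function and feeds it the list from the question. The function name group_by_fruit_positions and the wanted parameter are my own choices for illustration, not anything standard.

```python
import collections


def group_by_fruit_positions(rows, wanted=frozenset({'apple', 'mango'})):
    """Group rows by where the wanted fruits sit inside each row.

    Hypothetical wrapper around the defaultdict pattern: the key keeps the
    wanted fruit names in place and replaces everything else with None.
    """
    groupdict = collections.defaultdict(list)
    for row in rows:
        key = tuple(pair[1] if pair[1] in wanted else None for pair in row)
        groupdict[key].append(row)
    return list(groupdict.values())


data = [
    (('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
    (('a', 'apple'), ('b', 'mango'), ('c', 'grapes')),
    (('e', 'apple'), ('b', 'mango'), ('c', 'grapes')),
    (('a', 'apple'), ('d', 'mango'), ('c', 'peach')),
    (('e', 'apple'), ('d', 'mango'), ('f', 'grapes')),
    (('f', 'grapes'), ('e', 'apple'), ('d', 'mango')),
    (('f', 'peach'), ('e', 'apple'), ('e', 'mango')),
    (('f', 'grapes'), ('c', 'apple'), ('d', 'mango')),
    (('e', 'apple'), ('f', 'grapes'), ('d', 'mango')),
    (('a', 'apple'), ('c', 'grapes'), ('b', 'mango')),
]

result = group_by_fruit_positions(data)
print([len(group) for group in result])  # [5, 3, 2]
```

The groups come out in the order in which each pattern is first seen, because dictionaries preserve insertion order in Python 3.7 and later.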