python python-2.7 set unique

Getting unique tuples from a list


I have a list of tuples whose elements are like this:

aa = [('a', 'b'), ('c', 'd'), ('b', 'a')] 

I want to treat ('a', 'b') and ('b', 'a') as the same tuple and keep only one tuple from each such group, so the output should be:

[('a', 'b'), ('c', 'd')]

How can I achieve this efficiently as my list consists of millions of such tuples?


Solution

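  • Convert each tuple to a frozenset, deduplicate by collecting them in a set (frozensets are hashable and ignore element order), then convert back to tuples: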

    In [193]: map(tuple, set(map(frozenset, aa)))  # Python 2; on Python 3, wrap in list(...)
    Out[193]: [('d', 'c'), ('a', 'b')]
    

    Here's a slightly more readable version with a list comprehension:

    In [194]: [tuple(x) for x in set(map(frozenset, aa))]
    Out[194]: [('d', 'c'), ('a', 'b')]
    

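    Note that sets are unordered, so neither the order of the tuples nor the order of the elements within each tuple is preserved (hence ('d', 'c') in the output above). Also, for your particular use case, a list of tuples isn't the best choice of data structure. Consider storing your data as a set of frozensets to begin with: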

    In [477]: set(map(frozenset, aa))
    Out[477]: {frozenset({'a', 'b'}), frozenset({'c', 'd'})}
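
If the output must match the question exactly, keeping the first tuple seen from each group in its original form and order, one possible variation is a single pass with a "seen" set. This is only a sketch: the helper name `unique_unordered` is made up for illustration, and it assumes the elements of every tuple are hashable.

```python
def unique_unordered(pairs):
    """Keep the first tuple of each group whose members hold the same elements in any order."""
    seen = set()      # frozensets of the tuples already emitted
    result = []
    for pair in pairs:
        key = frozenset(pair)   # ('a', 'b') and ('b', 'a') give the same key
        if key not in seen:
            seen.add(key)
            result.append(pair)  # keep the tuple as it first appeared
    return result


aa = [('a', 'b'), ('c', 'd'), ('b', 'a')]
print(unique_unordered(aa))  # [('a', 'b'), ('c', 'd')]
```

This runs in one pass over the list and works on both Python 2.7 and Python 3. One caveat: a frozenset key also collapses repeated elements, so ('a', 'a', 'b') and ('a', 'b') would count as the same group. If that matters, and the elements are sortable, `tuple(sorted(pair))` could serve as the key instead.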