Search code examples
pythonlistsetbig-oisinstance

Python Check If List Is a Mathematical Set Fast


What is the fastest\ most pythonic way to check if a List is a mathematical set in python?

I know the following works:

ListInstance = [1,2,3,4,5,6]
ListIsMathSet = (len(set(ListInstance)) == len(ListInstance) )

Is there a better/faster way to check this?


Solution

  • It's not usually going to be faster, but if the values aren't hashable but they are comparable, and especially if they're already sorted, you can lazily determine if any elements are non-unique:

    def is_unique(items, key=None):
        for k, g in itertools.groupby(sorted(items, key=key), key=key):
            if len(list(itertools.islice(g, 2))) > 1:
                return False
        return True
    

    This will stop as soon as the first duplicate is detected and checks no more than necessary, which may run faster (particularly in the "input already sorted" case). A similar early out based approach can be made using set by iterating as you go to minimize the number of elements hashed and stored in the case where uniqueness is violated quickly, by doing this (adapted from the unique_everseen recipe in itertools):

    def is_unique(iterable):
        seen = set()
        seen_add = seen.add
        for element in iterable:
            if element in seen:
                return False
            seen_add(element)
        return True
    

    Note: Neither of the above solutions is better in the typical case of a small number of hashable inputs where uniqueness is common (or at least, not violated early in the set of inputs). The simple solution you gave is concise, obvious, and performs most of the work at the C layer in CPython, so it has a much lower fixed overhead compared to methods that execute a lot of Python code. But they may be useful for large inputs, already sorted inputs, and/or inputs where uniqueness is uncommon (and therefore the early out behavior saves you some work).