Search code examples
pythonperformancesettuplesany

Use in operator in set of tuples python


I have a problem trying to check if an element is part of a set in Python. (My set contains about 600K tuples of string.)

I'm searching for a solution that use the benefit of in operator to check if a value is element of a tuple of the set.

I've found solution like:

# S set of tuples, I'm checking if v is the second element of a tuple
any( y == v for (_, y) in S )

but this has a O(n) complexity.

The Python documentation says that the average complexity of the IN operator is O(1).

EDIT

My problem is: How to check if an element is the first/second/... element of at least one tuple of the set using the speed of in operator.


Solution

  • average complexity of the IN operator is O(1)

    That's correct for membership check in sets or any container that uses hash table for storing it's items like dictionary.

    And its completely a different operation than following in:

    for (_, y) in S
    

    That in is just a part of the for loop syntax.

    Also if you want to get the tuples that are contain a particular string you could use a list comprehension rather than the any:

    [item for item in S if my_str in item]
    

    If you want to take advantage of the set's membership checking you should have sets instead of tuples but since they're not hashable you'd not be allowed to use them within a set in that case you can use frozenset() instead.

    And in case you want to just check the existence of an item that meets a certain criteria you can go with following generator expression within any :

    any(my_str in item for item in S)
    

    After all since your set completely has the potential of being a dictionary you can create a dictionary instead of set then just check the membership with my_str in my_dict. Your dictionary would be something like: {'luca': 1, 'mario': 2 , 'franco': 3}