Search code examples
pythonlistsetfrozenset

How is the order of elements returned by list(frozenset()) determined?


I have a following exemplary setup in Python:

a = list(frozenset(['haha', 'lol']))
b = list(frozenset(['lol', 'haha']))

Does

a == b

always return True?

Is it possible that the list of frozenset of the same elements can return False with the above setup?


Solution

  • The equivalence semantics between (frozen)sets are that if they contain equivalent items they are equivalent. Sets do not have order.

    But, since lists have order, the cast to list can potentially cause them to not be equivalent (depends on the order of iteration over the (frozen)set - which is an implementation detail).

    Here is an example of a collision in the internal implementation that causes a different iteration order due to the instertion order (this is in CPython implementation):

    >>> a = list(frozenset([1, 9]))
    >>> b = list(frozenset([9, 1]))
    >>> a == b
    False
    

    Example with strings:

    First we need to find a collision (I will not go into details):

    >>> hash('1') % 8
    0
    >>> hash('2') % 8
    5
    >>> hash('3') % 8
    2
    >>> hash('4') % 8
    3
    >>> hash('5') % 8
    1
    >>> hash('6') % 8
    4
    >>> hash('7') % 8 
    5  # same as '2' ! Found!
    

    Now we need to add in different order to the set to cause a re-hash (again, not going into details):

    >>> s1, s2 = '2', '7'
    >>> a = list(frozenset([s1, s2]))
    >>> b = list(frozenset([s2, s1]))
    >>> a == b
    False