Search code examples
pythondictionaryfunctional-programmingsetfold

python reduce to find the union of sets


I am trying to find the union of set of sets. Specifically I want the union of the list of nodes for each key in the dictionary of networkx graphs called periodic_gs. I would like to use the reduce function as it seems reasonable to take the union of all periodic_gs[x].nodes() where x is a key of periodic_gs.

Here is my attempt:

reduce(lambda x,y: set(periodic_gs[x].nodes()).union(set(periodic_gs[y].nodes())), periodic_gs.keys(), {})

To me, this says take the union of the nodes across each graph in the dictionary. For some reason, python tells me: TypeError: unhashable type: 'dict' I do not see this TypeError, because periodic_gs.keys() is a list of the keys (they are strings but I do not see how this would matter), and when substituted in for the arguments to the lambda function will work.

What is causing the type error and how do I fix it?


Solution

  • You can use set.union like this:

    >>> lis = [{1, 2, 3, 4}, {3, 4, 5}, {7, 3, 6}]
    >>> set().union(*lis)
    set([1, 2, 3, 4, 5, 6, 7])
    

    It's possible to do this using reduce, but don't:

    >>> reduce(set.union, lis)
    set([1, 2, 3, 4, 5, 6, 7])
    

    because this reduce takes quadratic time due to all the intermediate sets it builds and discards:

    In [1]: from functools import reduce
    
    In [2]: sets = [{x} for x in range(1000)]
    
    In [3]: %timeit set().union(*sets)
    40.9 µs ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
    In [4]: %timeit reduce(set.union, sets)
    4.09 ms ± 587 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    That's a 100x slowdown on this test case, and it can easily be even worse.

    For your code, this should do it:

    set().union(*(x.nodes() for x in periodic_gs.values()))