Tags: python, generator, list-comprehension

Python list comprehension to create unequal length lists from a list using conditional


Using list comprehension, itertools or similar functions, is it possible to create two unequal lists from one list based on a conditional? Here is an example:

main_list = [6, 3, 4, 0, 9, 1]
part_list = [4, 5, 1, 2, 7]

in_main = []
out_main = []

for p in part_list:
    if p not in main_list:
        out_main.append(p)
    else:
        in_main.append(p)

>>> out_main
[5, 2, 7]

>>> in_main
[4, 1]

I'm trying to keep it simple, but as an example of usage, the main_list could be values from a dictionary with the part_list containing dictionary keys. I need to generate both lists at the same time.


Solution

  • A true itertools-based solution that works on an iterable:

    >>> import itertools
    >>> part_iter = iter(part_list)
    >>> part_in, part_out = itertools.tee(part_iter)
    >>> in_main = (p for p in part_in if p in main_list)
    >>> out_main = (p for p in part_out if p not in main_list)
    

    Making lists out of these defeats the point of using iterators, but here is the result:

    >>> list(in_main)
    [4, 1]
    >>> list(out_main)
    [5, 2, 7]
    

    This has the advantage of lazily generating in_main and out_main from another lazily generated sequence. The catch is that if you exhaust one iterator before touching the other, tee has to buffer every item until the second iterator consumes it. So this is really only useful if you advance both at roughly the same pace; otherwise you might as well use auxiliary storage yourself.
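The tee-based split can also be wrapped in a small helper. The sketch below is modeled on the `partition` recipe from the itertools documentation; the argument order and the use of `main_set.__contains__` as the predicate are choices made here for illustration, not part of the original answer.

```python
from itertools import filterfalse, tee


def partition(predicate, iterable):
    """Split iterable into two lazy iterators: (matching, non_matching)."""
    matching_src, other_src = tee(iterable)
    return filter(predicate, matching_src), filterfalse(predicate, other_src)


main_set = {6, 3, 4, 0, 9, 1}
part_list = [4, 5, 1, 2, 7]

# main_set.__contains__ is the predicate "is this item in main_set?"
in_main, out_main = partition(main_set.__contains__, iter(part_list))

# Draining in_main first makes tee buffer every item that out_main
# has not consumed yet, which is the caching cost described above.
in_result = list(in_main)
out_result = list(out_main)
print(in_result)   # [4, 1]
print(out_result)  # [5, 2, 7]
```

Because both halves are drained into lists here, the laziness buys nothing; the helper only pays off when the two iterators are consumed side by side.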

    There's also an interesting solution based on the conditional (ternary) expression. (You could squish this into a list comprehension, but building a list purely for its side effects would be poor style.) I changed main_list into a set so that each membership test is O(1) on average.

    >>> main_set = set(main_list)
    >>> in_main = []
    >>> out_main = []
    >>> for p in part_list:
    ...     (in_main if p in main_set else out_main).append(p)
    ... 
    >>> in_main
    [4, 1]
    >>> out_main
    [5, 2, 7]
    
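For the dictionary use case mentioned in the question, the same loop can be wrapped in a function. This is a minimal sketch: the name `split_by_membership` and the `links` dictionary are hypothetical, and it assumes the values are hashable so they can go into a set.

```python
def split_by_membership(items, container):
    """Return (present, absent): items found in container, and the rest."""
    lookup = set(container)  # assumes hashable members
    present, absent = [], []
    for item in items:
        # Pick the target list with a conditional expression, then append.
        (present if item in lookup else absent).append(item)
    return present, absent


# Hypothetical data: which keys also appear among the values?
links = {4: 6, 5: 3, 1: 4, 2: 0, 7: 1}
in_main, out_main = split_by_membership(links.keys(), links.values())
print(in_main)   # [4, 1]
print(out_main)  # [5, 2, 7]
```

Both lists are filled in a single pass over the keys, and the original order of the items is preserved in each.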

    There's also a fun collections.defaultdict approach, which groups each item under the boolean result of the membership test:

    >>> import collections
    >>> in_out = collections.defaultdict(list)
    >>> for p in part_list:
    ...     in_out[p in main_list].append(p)
    ... 
    >>> in_out
    defaultdict(<class 'list'>, {True: [4, 1], False: [5, 2, 7]})
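To get the two lists back out of the defaultdict, index it with True and False. The sketch below repeats the setup so it runs on its own; the `all_in` example is an added illustration of what happens when one group receives no items.

```python
import collections

main_set = {6, 3, 4, 0, 9, 1}
part_list = [4, 5, 1, 2, 7]

in_out = collections.defaultdict(list)
for p in part_list:
    in_out[p in main_set].append(p)  # key is True or False

in_main, out_main = in_out[True], in_out[False]
print(in_main)   # [4, 1]
print(out_main)  # [5, 2, 7]

# If no item lands in a group, looking it up still yields an empty list,
# because defaultdict creates the missing entry on first access.
all_in = collections.defaultdict(list)
for p in [4, 1]:
    all_in[p in main_set].append(p)
nothing_out = all_in[False]
print(nothing_out)  # []
```

With a plain dict the last lookup would raise KeyError, so the defaultdict saves a special case for an empty group.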