Search code examples
pythoniterable-unpacking

Iterate over a pair of iterables, sorted by an attribute


One way (the fastest way?) to iterate over a pair of iterables a and b in sorted order is to chain them and sort the chained iterable:

for i in sorted(chain(a, b)):
    print i

For instance, if the elements of each iterable are:

a: 4, 6, 1
b: 8, 3

then this construct would produce elements in the order

1, 3, 4, 6, 8

However, if the iterables iterate over objects, this sorts the objects by their memory address. Assuming each iterable iterates over the same type of object,

  1. What is the fastest way to iterate over a particular attribute of the objects, sorted by this attribute?

  2. What if the attribute to be chosen differs between iterables? If iterables a and b both iterate over objects of type foo, which has attributes foo.x and foo.y of the same type, how could one iterate over elements of a sorted by x and b sorted by y?

For an example of #2, if

a: (x=4,y=3), (x=6,y=2), (x=1,y=7)
b: (x=2,y=8), (x=2,y=3)

then the elements should be produced in the order

1, 3, 4, 6, 8

as before. Note that only the x attributes from a and the y attributes from b enter into the sort and the result.


Solution

  • Tim Pietzcker has already answered for the case where you're using the same attribute for each iterable. If you're using different attributes of the same type, you can do it like this (using complex numbers as a ready-made class that has two attributes of the same type):

    In Python 2:

    >>> a = [1+4j, 7+0j, 3+6j, 9+2j, 5+8j]
    >>> b = [2+5j, 8+1j, 4+7j, 0+3j, 6+9j]
    >>> keyed_a = ((n.real, n) for n in a)
    >>> keyed_b = ((n.imag, n) for n in b)
    >>> from itertools import chain
    >>> sorted_ab = zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0]))[1]
    >>> sorted_ab
    ((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))
    

    Since in Python 3 zip() returns an iterator, we need to coerce it to a list before attempting to subscript it:

    >>> # ... as before up to 'from itertools import chain'
    >>> sorted_ab = list(zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0])))[1]
    >>> sorted_ab
    ((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))