python: subset a nested list if matches in another list


Say I have represented an n-row x m-column matrix as a nested Python list with one inner list per column, e.g. 3 rows and 2 columns:

m = [ [1,2,3], ['a', 'b', 'c'] ]

What would be a generic and Pythonic way to generate another k x m matrix (k <= n), a subset of the original, keeping only the rows whose value in, say, the second column has a match in a sequence? For the sequence below there are matches for 'a' and 'c':

s = ['j', 'a', 'c', 'e']

The resulting matrix m2 should be

m2 = [ [1,3], ['a','c'] ]

What did not work:

My naive attempt was something along these lines (it raises an error, and would scale poorly to many columns anyway):

m2 = [ [x, y] for x, y in m if y in s ]

Solution

  • You can zip the inner lists of m to get one tuple per position, keep the tuples whose second item is present in s, and zip the kept tuples again to rebuild the inner lists:

    list(zip(*(c for c in zip(*m) if c[1] in s)))
    

    This returns:

    [(1, 3), ('a', 'c')]
    

    If you need the output to be a list of lists, you can map the tuples generated by zip to list:

    list(map(list, zip(*(c for c in zip(*m) if c[1] in s))))
    

    This returns:

    [[1, 3], ['a', 'c']]
    

    You can optionally make s a set first to improve lookup efficiency if there are many items in s:

    s = set(s)
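
Since the question asks for something generic, the steps above could be wrapped in a small helper. This is only a sketch: the name `filter_matrix` and the `key_index` parameter are made up for illustration, and it assumes every inner list has the same length and that the items of the sequence are hashable.

```python
def filter_matrix(matrix, allowed, key_index=1):
    """Keep the positions where matrix[key_index] has a value found in `allowed`.

    `filter_matrix` and `key_index` are hypothetical names, not from the
    original answer. Assumes equal-length inner lists and hashable items.
    """
    allowed = set(allowed)  # set lookup instead of scanning a list each time
    # zip(*matrix) yields one tuple per position: (1, 'a'), (2, 'b'), ...
    kept = [item for item in zip(*matrix) if item[key_index] in allowed]
    if not kept:
        # zip(*[]) yields nothing, so preserve the shape explicitly
        return [[] for _ in matrix]
    # Transpose back and convert the tuples to lists
    return [list(values) for values in zip(*kept)]


m = [[1, 2, 3], ['a', 'b', 'c']]
s = ['j', 'a', 'c', 'e']
m2 = filter_matrix(m, s)
print(m2)  # [[1, 3], ['a', 'c']]
```

One difference from the one-liner: when nothing matches, the one-liner returns an empty list, while this sketch returns one empty list per inner list of m. Which behavior is preferable depends on what the calling code expects.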