Say I have represented a matrix with n rows and m columns as a nested Python list, e.g. 2 rows and 3 columns:
m = [ [1,2,3], ['a', 'b', 'c'] ]
What would be a generic and Pythonic way to generate another matrix with the same rows but only k of the columns (k <= the original column count), keeping the columns whose value in, say, the second row has a match in a sequence (the matched values being a subset of those in m)?
Thus for the sequence below there are matches for 'a' and 'c':
s = ['j', 'a', 'c', 'e']
The resulting matrix m2 should be:
m2 = [ [1,3], ['a','c'] ]
What did not work:
My naive attempt was something like the following, which raises an error and would not scale to many columns anyway:
m2 = [ [x, y] for x, y in m if y in s ]
You can zip the rows of m to identify columns whose second item is present in s, and zip the columns again to output the rows:
list(zip(*(c for c in zip(*m) if c[1] in s)))
This returns:
[(1, 3), ('a', 'c')]
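To see why the double zip works, the one-liner can be unpacked into its intermediate steps. This is only an illustrative sketch; the names columns and kept are introduced here and are not part of the original answer:

```python
m = [[1, 2, 3], ['a', 'b', 'c']]
s = ['j', 'a', 'c', 'e']

# Transpose: each tuple is one column of m.
columns = list(zip(*m))   # [(1, 'a'), (2, 'b'), (3, 'c')]

# Keep only the columns whose second item (index 1) appears in s.
kept = [c for c in columns if c[1] in s]   # [(1, 'a'), (3, 'c')]

# Transpose back so the result is organized by rows again.
m2 = list(zip(*kept))   # [(1, 3), ('a', 'c')]
```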
If you need the output to be a list of lists, you can map the tuples generated by zip to list:
list(map(list, zip(*(c for c in zip(*m) if c[1] in s))))
This returns:
[[1, 3], ['a', 'c']]
You can optionally make s a set first to improve lookup efficiency if there are many items in s:
s = set(s)
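Putting the pieces together, a reusable helper might look like the sketch below. The function name filter_columns and its key_row parameter are hypothetical, chosen only for illustration, and the set conversion assumes the items in the sequence are hashable:

```python
def filter_columns(matrix, allowed, key_row=1):
    """Keep the columns of `matrix` whose entry in row `key_row` is in `allowed`."""
    allowed = set(allowed)  # assumption: items are hashable; gives O(1) lookups
    # Transpose to columns and keep only the matching ones.
    kept = [col for col in zip(*matrix) if col[key_row] in allowed]
    # Rebuild one list per original row, even when no column matches.
    return [[col[i] for col in kept] for i in range(len(matrix))]


m = [[1, 2, 3], ['a', 'b', 'c']]
s = ['j', 'a', 'c', 'e']
m2 = filter_columns(m, s)   # [[1, 3], ['a', 'c']]
```

Rebuilding the rows by index, rather than with a second zip, keeps the row count intact when nothing matches: the result is then a list of empty rows instead of an empty list.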