I have some lists of text and I need to pull them together into a dictionary, using one list to 'filter' the other. I can do this with nested for loops, but I cannot make it work with a dict comprehension.
a = ['Complete Yes', 'Title Mr', 'Forename John', 'Initial A', 'Surname Smith', 'Date of birth 01 01 1901']
b = ['Forename', 'Surname', 'Date of birth']
If I try to make a dict of the needed details with nested for loops, it works fine:
details = {}
for x in b:
    for l in a:
        if x in l:
            details[x] = l
details
I get
{'Forename': 'Forename John',
'Surname': 'Surname Smith',
'Date of birth': 'Date of birth 01 01 1901'}
which needs cleaning up but I can do that later.
When I try it with a dict comprehension
d_tails = {x:l for x,l in zip(b, [l for l in a if x in l]) }
I get
{'Forename': 'Date of birth 01 01 1901'}
I'm sure this is because of how I'm ordering the dict comprehension but I can't figure out how to order it so that it replaces the for loop.
For context, I'm trying to clean really messy data from terrible PDFs, which is where a comes from. Any help on this would be appreciated.
Let's consider simpler examples of two lists:
a = [1,2,3]
b = ['a', 'b', 'c']
for x in a:
    for y in b:
        print(x, y)
This produces 9 lines of output, one for every possible combination of a value from a and a value from b.
for x, y in zip(a, b):
    print(x, y)
This produces only 3 lines of output: one for every corresponding pair of values, taking one from a and one from b.
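To make the difference concrete, here is a small sketch that collects the pairs into lists instead of printing them. The names nested_pairs and zipped_pairs are made up for this illustration.

```python
a = [1, 2, 3]
b = ['a', 'b', 'c']

# Nested loops: every x is paired with every y (a Cartesian product).
nested_pairs = [(x, y) for x in a for y in b]

# zip: the i-th x is paired only with the i-th y.
zipped_pairs = [(x, y) for x, y in zip(a, b)]

print(len(nested_pairs))  # 9
print(zipped_pairs)       # [(1, 'a'), (2, 'b'), (3, 'c')]
```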
If you want to convert your nested loop into a single dict comprehension, you need two for clauses, not a single for clause iterating over a zip object:
details = {x: l for x in b for l in a if x in l}
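As a quick check, the sketch below runs both versions on the lists from the question and confirms they agree. It assumes 'Date of birth' is spelled the same way in both lists, as the output shown in the question suggests; details_comp is a name made up for the comparison. It also shows a likely reason the zip attempt returned a single entry: the inner list comprehension is evaluated once, before the dict comprehension starts looping, so the x it sees is whatever value an earlier for loop left behind.

```python
a = ['Complete Yes', 'Title Mr', 'Forename John', 'Initial A',
     'Surname Smith', 'Date of birth 01 01 1901']
b = ['Forename', 'Surname', 'Date of birth']

# Nested loops, as in the question.
details = {}
for x in b:
    for l in a:
        if x in l:
            details[x] = l

# Equivalent comprehension: two for clauses plus a filter.
details_comp = {x: l for x in b for l in a if x in l}
assert details_comp == details

# The zip attempt. The inner list is built using the stale x from the
# loop above, so it has one element and zip stops after one pair.
x = 'Date of birth'  # assumption: the value the earlier loop left in x
d_tails = {x: l for x, l in zip(b, [l for l in a if x in l])}
print(d_tails)  # {'Forename': 'Date of birth 01 01 1901'}
```

In a fresh session where x has never been assigned, the zip version would raise a NameError instead.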