I like to filter a data frame dynamically by a dict
that is holding a list
of dicts
with column
and value
information to filter on.
Expected result should look like this:
result = {'brand': ['Honda','Honda','Toyota'],
'model': ['a','b','d']
}
cars = {'brand': ['Honda','Honda','Honda','Toyota','Toyota','Audi'],
'model': ['a','b','c','d','a','b'],
}
Is holding information about relation between cars:
relation = {
'relation1':[
{'brand':'Honda','model':'a'},
{'brand':'Honda','model':'b'},
{'brand':'Toyota','model':'d'}
],
'relation2':[
{'brand':'...','model':'...'},
{'brand':'...','model':'...'}
]
}
' | '.join([f'{repr(k)}=={repr(v)}' for d in relation['relation1'] for k,v in d.items()])
Implement filter will give an error:
KeyError: "brand=='Honda' | model=='a' | brand=='Honda' | model=='b' | brand=='Toyota' | model=='d'"
Adjusting filter to combine the dict values inside with &
and outside with |
so filter looks like:
(brand=='Honda' & model=='a') | (brand=='Honda' & model=='b') | (brand=='Toyota' & model=='d')
Use nested list comprehension:
q = ' | '.join([f"({' & '.join(f'{k}=={repr(v)}' for k,v in d.items())})"
for d in relation['relation1']])
print (q)
(brand=='Honda' & model=='a') |
(brand=='Honda' & model=='b') |
(brand=='Toyota' & model=='b')
df = df.query(q)
print (df)
brand model
0 Honda a
1 Honda b
3 Toyota d