I have two tables:
import pandas as pd

df1 = pd.DataFrame([{'a': 1}, {'a': 2}, {'a': 8}])
df1['b'] = ""
df2 = pd.DataFrame([{'e': [1,2,3], 'f': 1},{'e': [4,5,6], 'f': 2},{'e': [7,8,9], 'f': 3}])
           e  f
0  [1, 2, 3]  1
1  [4, 5, 6]  2
2  [7, 8, 9]  3
I would like to insert into df1['b'] the corresponding value from df2['f'], based on the condition that the value in df1['a'] appears somewhere in the list in df2['e'].
Normally I would use apply/map:
df1['a'].apply(lambda x: function)
I can get the specific value I need with:
df2['f'].loc[df2['e'].apply(lambda y: value_to_look_for in y)].item()
but when I put it together
df1['a'].apply(lambda x: (df2['f'].loc[df2['e'].map(lambda y: x in y)].item()))
I'm getting an error "ValueError: can only convert an array of size 1 to a Python scalar"
Can anyone suggest a solution? What am I missing?
I would explode, create a mapping Series, and map:
df1['b'] = df1['a'].map(df2.explode('e').set_index('e')['f'])
Output:
   a  b
0  1  1
1  2  1
2  8  3
Intermediate mapping Series:
df2.explode('e').set_index('e')['f']
e
1    1
2    1
3    1
4    2
5    2
6    2
7    3
8    3
9    3
Name: f, dtype: int64
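As a side note: with the exact sample data in the question, the apply/.item() version should actually work, since every value of df1['a'] occurs in exactly one list. The ValueError from .item() typically appears when a value matches zero rows (or more than one), because .item() requires exactly one element. The map approach handles that case gracefully: unmatched values simply become NaN. A minimal sketch of that behaviour (the extra value 10 and the fillna('') are my own additions for illustration):

mapping = df2.explode('e').set_index('e')['f']

# 10 is in none of the lists, so the apply/.item() version would raise
# "can only convert an array of size 1 to a Python scalar" for it,
# while map just produces NaN:
pd.Series([1, 8, 10]).map(mapping)
# 0    1.0
# 1    3.0
# 2    NaN
# dtype: float64

# keep the empty-string placeholder from the question for missing values
df1['b'] = df1['a'].map(mapping).fillna('')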