I have two data frames like below:-
data ={'a':[1,2,3,4,5,6,7],'b':[2,5,3,6,1,7,4]}
df = pd.DataFrame(data)
df
a b
0 1 2
1 2 5
2 3 3
3 4 6
4 5 1
5 6 7
6 7 4
data ={'a':[4,2,1,7,3,6,5],'c':[12,15,13,60,100,700,400]}
df1 = pd.DataFrame(data)
df1
a c
0 4 12
1 2 15
2 1 13
3 7 60
4 3 100
5 6 700
6 5 400
Now I want a list of values using the above two data frames. Basically, I want to search the df
row values in df1
and get the corresponding value (column c
of df1
) and append them into a list by taking the average of both numbers.
However, I was able to do it but as I am iterating over rows it is taking time. Is there any better way to get the solution much faster?
code:
final=[]
for index, row in edges.iterrows():
for inde2x, row2 in df1.iterrows():
if np.isin(row['a'],row2['a']) == True:
r1 = row2['c']
if np.isin(row['b'],row2['a']) == True:
r2 = row2['c']
final.apped(r1+r2)
print(final)
Excepted output:
[28, 415, 200, 712, 413, 760, 72]
One way is stack the values of df
, map it using the a
columns from df1
:
out = df.stack().map(df1.set_index('a')['c']).groupby(level=0).sum()
Output:
0 28
1 415
2 200
3 712
4 413
5 760
6 72
dtype: int64
If you need a list, you can do out.tolist()