python pandas list dataframe rows

Alternative to iterating over rows for getting a list of average values


I have two data frames like the ones below:

import pandas as pd

data = {'a': [1, 2, 3, 4, 5, 6, 7], 'b': [2, 5, 3, 6, 1, 7, 4]}
df = pd.DataFrame(data)
df
    a   b
0   1   2
1   2   5
2   3   3
3   4   6
4   5   1
5   6   7
6   7   4
data = {'a': [4, 2, 1, 7, 3, 6, 5], 'c': [12, 15, 13, 60, 100, 700, 400]}
df1 = pd.DataFrame(data)
df1
    a   c
0   4   12
1   2   15
2   1   13
3   7   60
4   3   100
5   6   700
6   5   400

Now I want to build a list of values from these two data frames. For each row of df, I want to look up both of its values (columns a and b) in column a of df1, take the corresponding values from column c of df1, and append the combination of the two numbers (their sum, in the code below) to a list.

I was able to do this, but because I am iterating over rows it takes a long time. Is there a faster way to get the same result?

Code:

final = []
# For every row of df, scan all of df1 to find the matching 'a' keys.
for index, row in df.iterrows():
    for index2, row2 in df1.iterrows():
        if row['a'] == row2['a']:
            r1 = row2['c']  # value of c for df's column a
        if row['b'] == row2['a']:
            r2 = row2['c']  # value of c for df's column b
    final.append(r1 + r2)
print(final)

Expected output:

[28, 415, 200, 712, 413, 760, 72]

Solution

  • One way is to stack the values of df, map them to column c of df1 (using column a of df1 as the lookup key), and then sum the mapped values within each original row:

    out = df.stack().map(df1.set_index('a')['c']).groupby(level=0).sum()
    

    Output:

    0     28
    1    415
    2    200
    3    712
    4    413
    5    760
    6     72
    dtype: int64
    

    If you need a list, call out.tolist().
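
To make the one-liner easier to follow, here is a minimal, self-contained sketch that runs the same stack/map/groupby steps one at a time on the data from the question. The intermediate names (lookup, stacked, mapped, result, alt) are introduced here for illustration only, and the per-column variant at the end is an alternative sketch, not part of the original answer. It assumes the values in column a of df1 are unique, as they are in the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7], 'b': [2, 5, 3, 6, 1, 7, 4]})
df1 = pd.DataFrame({'a': [4, 2, 1, 7, 3, 6, 5], 'c': [12, 15, 13, 60, 100, 700, 400]})

# Lookup table: the index holds df1['a'], the values hold df1['c'].
# Assumption: df1['a'] has no duplicates, otherwise map() cannot resolve a key.
lookup = df1.set_index('a')['c']

# Step 1: stack() reshapes df into one long Series indexed by (row, column).
stacked = df.stack()

# Step 2: map() replaces each value with its matching 'c' value from the lookup.
mapped = stacked.map(lookup)

# Step 3: group by the original row label (index level 0) and add the two values.
out = mapped.groupby(level=0).sum()
result = out.tolist()

# Alternative sketch: map each column separately, then add the two Series.
alt = (df['a'].map(lookup) + df['b'].map(lookup)).tolist()

print(result)
```

If the actual mean of the two looked-up values is wanted rather than their sum, replacing .sum() with .mean() in step 3 should give it; the expected output in the question corresponds to the sum.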