Search code examples
pandaselementdivisioncalculated-columns

Pandas Dividing DF Column Unique Rows by Different DF Unique Mean


I've been trying to create a column of variability given the mean of the column data values for 'A' and 'B' below. I don't understand how to create the average for each row or element-wise in the panda column by the single data value with the long-term average(s). For example, imagine if have data that looks like this in pandas df1:

Year     Name   Data
1999     A      2
2000     A      4
1999     B      6
2000     B      8

And, i have a DF with the long-term mean called "LTmean", which in this case is = 3 and 7.

mean_df = 
Name    Data mean
0   A          3
1   B          7

So, the result would look like this for a new df: dfnew['var'] = (df1.['Data']/mean_df(???) -1:

Year   Name   Var
1999    A     -0.3
2000    A      0.3
1999    B     -0.14
2000    B      0.14

Thank you for any suggestions on this! Would a loop be the best idea to loop through each column by the "Name' in each DF somehow using the unique mean_df value for A, B?


Solution

  • Use groupby().transform to propagate the results to the original index:

    df['Var'] = df['Data']/df.groupby(['Name'])['Data'].transform('mean') - 1
    

    Output:

       Year Name  Data       Var
    0  1999    A     2 -0.333333
    1  2000    A     4  0.333333
    2  1999    B     6 -0.142857
    3  2000    B     8  0.142857