I've been trying to create a column of variability given the mean of the column data values for 'A' and 'B' below. I don't understand how to create the average for each row or element-wise in the panda column by the single data value with the long-term average(s). For example, imagine if have data that looks like this in pandas df1:
Year Name Data
1999 A 2
2000 A 4
1999 B 6
2000 B 8
And, i have a DF with the long-term mean called "LTmean", which in this case is = 3 and 7.
mean_df =
Name Data mean
0 A 3
1 B 7
So, the result would look like this for a new df: dfnew['var'] = (df1.['Data']/mean_df(???) -1:
Year Name Var
1999 A -0.3
2000 A 0.3
1999 B -0.14
2000 B 0.14
Thank you for any suggestions on this! Would a loop be the best idea to loop through each column by the "Name' in each DF somehow using the unique mean_df value for A, B?
Use groupby().transform
to propagate the results to the original index:
df['Var'] = df['Data']/df.groupby(['Name'])['Data'].transform('mean') - 1
Output:
Year Name Data Var
0 1999 A 2 -0.333333
1 2000 A 4 0.333333
2 1999 B 6 -0.142857
3 2000 B 8 0.142857