I have a big data frame with the following information, with column day shows the date in dd/mm/yy
format. For example, as following,
UserID group day sp PU new
0 213 test 6/10/14 4 $628.14 test-Red
1 314 control 6/10/14 15 $29.98 control
2 354 test 13/10/14 4 $554.58 test-NonRed
3 2131 test1 13/10/14 2 $60.41 test-Red
4 314 control1 13/10/14 1 $8.71 control
5 354 test1 20/10/14 1 $165.63 test-NonRed
I need to calculate the incremental spend in 6 weeks for the total test group (test-Red and test-NonRed) versus the control group. I need to show the result in absolute $ or %. What I have tried,
df.groupby(by=['PU','day']).sum().groupby(level=[0]).cumsum()
This gives me absolute sum for groups for each group, but the excepted answer is something like this, with % also for each group
Control $#
Test - NonRed $#
Test - Red $#
Total Result $#
Any help would be great
You need few steps to get there
df.PU=df.PU.str[1:].astype(float)
df.day=pd.to_datetime(df.day)
new1=pd.pivot_table(df,index='new',columns='day',values='PU',aggfunc=sum,fill_value=0,margins=True)
new1=new1.drop('All',1)
new1.cumsum(1)
Out[263]:
day 2014-06-10 00:00:00 2014-10-13 00:00:00 2014-10-20 00:00:00
new
control 29.98 38.69 38.69
test-NonRed 0.00 554.58 720.21
test-Red 628.14 688.55 688.55
All 658.12 1281.82 1447.45