I have a requirement like below.
The initial information is a list of gross adds.
201910 | 201911 | 201912 | 202001 | 202002 |
---|---|---|---|---|
20000 | 30000 | 32000 | 40000 | 36000 |
I have a pivot table as below.
201910 | 201911 | 201912 | 202001 | 202002 |
---|---|---|---|---|
1000 | 2000 | 2400 | 3200 | 1800 |
500 | 400 | 300 | 200 | nan |
200 | 150 | 100 | nan | nan |
200 | 100 | nan | nan | nan |
160 | nan | nan | nan | nan |
Need to generate the report like below.
Cohort01:
5% |
---|
3% |
3% |
1% |
1% |
1% |
From Cohort02 onwards it will take the average of last value of cohort01. Similarly for Cohort03 for both nan values it will take the average of corresponding values of cohort01 and cohort2. Again while calculating for cohort04 it will take the average of previous two cohorts(cohort02 and cohort03 values) to add all three nan value.
Is there anyone who can provide me a solution on this in Python.
The report should be generated as below.
All cohorts should be created separately.
You could try it like this:
res = df.apply(lambda x: round(100/(df_gross.iloc[0]/x),1),axis=1)
print(res)
201910 201911 201912 202001 202002
0 5.0 6.7 7.5 8.0 5.0
1 2.5 1.3 0.9 0.5 NaN
2 1.0 0.5 0.3 NaN NaN
3 1.0 0.3 NaN NaN NaN
4 0.8 NaN NaN NaN NaN
for idx,col in enumerate(res.columns[1:],1):
res[col] = res[col].fillna((res.iloc[:,max(idx-2,0)]+res.iloc[:,idx-1])/2)
print(res)
201910 201911 201912 202001 202002
0 5.0 6.7 7.50 8.000 5.0000
1 2.5 1.3 0.90 0.500 0.7000
2 1.0 0.5 0.30 0.400 0.3500
3 1.0 0.3 0.65 0.475 0.5625
4 0.8 0.8 0.80 0.800 0.8000