Creating multiple cohort from the pivot table

I have a requirement like below.

The initial information is a list of gross adds.

201910	201911	201912	202001	202002
20000	30000	32000	40000	36000

I have a pivot table as below.

201910	201911	201912	202001	202002
1000	2000	2400	3200	1800
500	400	300	200	nan
200	150	100	nan	nan
200	100	nan	nan	nan
160	nan	nan	nan	nan

Need to generate the report like below.

Cohort01:

5%
3%
3%
1%
1%
1%

From Cohort02 onwards it will take the average of last value of cohort01. Similarly for Cohort03 for both nan values it will take the average of corresponding values of cohort01 and cohort2. Again while calculating for cohort04 it will take the average of previous two cohorts(cohort02 and cohort03 values) to add all three nan value.

Is there anyone who can provide me a solution on this in Python.

The report should be generated as below.

All cohorts should be created separately.

Solution

You could try it like this:

res = df.apply(lambda x: round(100/(df_gross.iloc[0]/x),1),axis=1)
print(res)

   201910  201911  201912  202001  202002
0     5.0     6.7     7.5     8.0     5.0
1     2.5     1.3     0.9     0.5     NaN
2     1.0     0.5     0.3     NaN     NaN
3     1.0     0.3     NaN     NaN     NaN
4     0.8     NaN     NaN     NaN     NaN

for idx,col in enumerate(res.columns[1:],1):
    res[col] = res[col].fillna((res.iloc[:,max(idx-2,0)]+res.iloc[:,idx-1])/2)
print(res)

   201910  201911  201912  202001  202002
0     5.0     6.7    7.50   8.000  5.0000
1     2.5     1.3    0.90   0.500  0.7000
2     1.0     0.5    0.30   0.400  0.3500
3     1.0     0.3    0.65   0.475  0.5625
4     0.8     0.8    0.80   0.800  0.8000