I would like to calculate the cumulative rate of return using a DataFrame sorted by date and fund name, with a nav (net asset value) column. I tried to solve it with a lambda function and np.cumprod, but I ran into two warning messages:

1. An overflow problem (RuntimeWarning). I understand why it happens and tried to fix it by setting dtypes, but that did not work in pandas.
2. FutureWarning: Not prepending group keys to the result index of transform-like apply. I think this one is minor and I can solve it myself.
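For what it's worth, the overflow can be reproduced without the fund data: the lambda multiplies nav-sized *levels* rather than daily returns, and a running product of values around 2.6e8 exceeds the float64 range after a few dozen terms. A minimal sketch with made-up values:

```python
import numpy as np

# ~2.6e8 mimics the nav levels; (2.6e8)**37 already exceeds the
# float64 maximum (~1.8e308), so the running product becomes inf.
nav = np.full(60, 2.6e8)
with np.errstate(over='ignore'):  # suppress the RuntimeWarning for the demo
    out = np.cumprod(1 + nav)
print(np.isinf(out[-1]))  # True
```

No dtype can avoid this; the fix is to compound daily returns (pct_change), not nav levels.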
import io

import numpy as np
import pandas as pd
temp = u"""
date,Fund ID,nav
2023-01-05,FUND_A,262877379.16
2023-01-06,FUND_A,263507035.14
2023-01-07,FUND_A,264752531.29
2023-01-08,FUND_A,264293118.56
2023-01-09,FUND_A,265590797.93
2023-01-10,FUND_A,265590775.93
2023-01-11,FUND_A,265590898.93
2023-01-12,FUND_A,265621139.41
2023-01-13,FUND_A,264868632.86
2023-01-14,FUND_A,266123265.88
2023-01-15,FUND_A,267106475.96
2023-01-16,FUND_A,267582033.3
2023-01-17,FUND_A,267581903.3
2023-01-18,FUND_A,267581909.3
2023-01-19,FUND_A,267597565.49
2023-01-20,FUND_A,268828419.07
2023-01-21,FUND_A,269879026.75
2023-01-22,FUND_A,269209075.28
2023-01-23,FUND_A,268023948.37
2023-01-24,FUND_A,268023589.37
2023-01-25,FUND_A,268023331.37
2023-01-26,FUND_A,267874945.91
2023-01-27,FUND_A,268421081.55
2023-01-28,FUND_A,268729376.71
2023-01-29,FUND_A,268293499.95
2023-01-30,FUND_A,268070195.12
2023-01-31,FUND_A,268070570.12
2023-02-01,FUND_A,268069964.12
2023-02-02,FUND_A,267188684.75
2023-02-03,FUND_A,267496712.59
2023-02-04,FUND_A,267526762.57
2023-02-05,FUND_A,271378160.78
2023-02-06,FUND_A,270154112.72
2023-02-07,FUND_A,270153984.72
2023-02-08,FUND_A,270154208.72
2023-02-09,FUND_A,269202371.67
2023-02-10,FUND_A,269034459.46
2023-02-11,FUND_A,268260972.59
2023-02-12,FUND_A,268793639.85
2023-02-13,FUND_A,267819922.47
2023-02-14,FUND_A,267819544.47
2023-02-15,FUND_A,267820160.47
2023-02-16,FUND_A,267501280.07
2023-02-17,FUND_A,266718816.91
2023-02-18,FUND_A,266318793.92
2023-02-19,FUND_A,266100701.19
2023-02-20,FUND_A,266357789.28
2023-02-21,FUND_A,266358292.28
2023-02-22,FUND_A,266358374.28
2023-02-23,FUND_A,266297038.11
2023-02-24,FUND_A,265152593.68
2023-02-25,FUND_A,265280938.19
2023-02-26,FUND_A,265753112.99
2023-02-27,FUND_A,264823479.12
2023-02-28,FUND_A,264823962.12
2023-01-05,FUND_B,354997525.29
2023-01-06,FUND_B,355716206.9
2023-01-07,FUND_B,357328327.64
2023-01-08,FUND_B,356755368.55
2023-01-09,FUND_B,358525763.27
2023-01-10,FUND_B,358525715.27
2023-01-11,FUND_B,358525829.27
2023-01-12,FUND_B,358560032.03
2023-01-13,FUND_B,357544775.85
2023-01-14,FUND_B,359237768.65
2023-01-15,FUND_B,360513139.37
2023-01-16,FUND_B,361224708.97
2023-01-17,FUND_B,361224452.97
2023-01-18,FUND_B,361224357.97
2023-01-19,FUND_B,361307618.45
2023-01-20,FUND_B,362992630.7
2023-01-21,FUND_B,364502776.04
2023-01-22,FUND_B,363667782.2
2023-01-23,FUND_B,362172298.58
2023-01-24,FUND_B,362172196.58
2023-01-25,FUND_B,362172061.58
2023-01-26,FUND_B,362053109.42
2023-01-27,FUND_B,362853781.39
2023-01-28,FUND_B,363344529.15
2023-01-29,FUND_B,362846268.28
2023-01-30,FUND_B,362601837.45
2023-01-31,FUND_B,362601459.45
2023-02-01,FUND_B,362601711.45
2023-02-02,FUND_B,361458780.23
2023-02-03,FUND_B,361943215.36
2023-02-04,FUND_B,362075761.92
2023-02-05,FUND_B,367337579.83
2023-02-06,FUND_B,365773518.56
2023-02-07,FUND_B,365773899.56
2023-02-08,FUND_B,365774045.56
2023-02-09,FUND_B,364551357.85
2023-02-10,FUND_B,364338309.96
2023-02-11,FUND_B,363294040.03
2023-02-12,FUND_B,364033223.52
2023-02-13,FUND_B,362717028.06
2023-02-14,FUND_B,362716228.06
2023-02-15,FUND_B,362716588.06
2023-02-16,FUND_B,362315772.87
2023-02-17,FUND_B,361310193.74
2023-02-18,FUND_B,360777411.01
2023-02-19,FUND_B,360524129.16
2023-02-20,FUND_B,360859219.16
2023-02-21,FUND_B,360859566.16
2023-02-22,FUND_B,360860075.16
2023-02-23,FUND_B,360794922.2
2023-02-24,FUND_B,359270093.58
2023-02-25,FUND_B,359443774.88
2023-02-26,FUND_B,360108483.71
2023-02-27,FUND_B,358814690.62
2023-02-28,FUND_B,358814683.62
"""
test = pd.read_csv(io.StringIO(temp))
test['date'] = pd.to_datetime(test['date'], format="%Y-%m-%d")
test['cum_return'] = test.groupby('Fund ID')['nav'].apply(lambda x: np.cumprod((1 + x)) - 1)
Warning messages:
C:\Users\SAMSUNG\anaconda3\Lib\site-packages\numpy\core\fromnumeric.py:57: RuntimeWarning: overflow encountered in accumulate
return bound(*args, **kwds)
C:\Users\SAMSUNG\AppData\Local\Temp\ipykernel_15364\496371556.py:1: FutureWarning: Not prepending group keys to the result index of transform-like apply. In the future, the group keys will be included in the index, regardless of whether the applied function returns a like-indexed object.
To preserve the previous behavior, use
.groupby(..., group_keys=False)
To adopt the future behavior and silence this warning, use
.groupby(..., group_keys=True)
test['cum_return'] = test.groupby('Fund ID')['nav'].apply(lambda x: np.cumprod((1 + x)) - 1)
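(As an aside, the FutureWarning only concerns whether the group labels are prepended to the result index of apply. A toy sketch with hypothetical data, not the question's frame, showing why `group_keys=False` is what you want when assigning the result back to a column:)

```python
import pandas as pd

df = pd.DataFrame({'Fund ID': ['A', 'A', 'B'], 'nav': [1.0, 2.0, 3.0]})

# group_keys=False keeps the original row index, so the result
# aligns with df for column assignment.
out = df.groupby('Fund ID', group_keys=False)['nav'].apply(lambda s: s * 2)
print(out.index.tolist())  # [0, 1, 2] -- no 'Fund ID' level prepended
```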
Use GroupBy.cumprod instead of groupby.apply:
test['daily_return'] = test.groupby('Fund ID')['nav'].pct_change()
test['cum_return'] = test['daily_return'].add(1).groupby(test['Fund ID']).cumprod().sub(1)
Alternative: assign the daily_return column with 1 added, then take the cumulative product:
# uses the daily_return column computed above
test['cum_return'] = (test.assign(daily_return=test['daily_return'] + 1)
                          .groupby('Fund ID')['daily_return'].cumprod().sub(1))
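One way to sanity-check either version: compounding daily returns within a fund must reproduce nav_t / nav_0 - 1, so the result can be verified independently. A self-contained sketch with toy values (not the question's data):

```python
import io
import numpy as np
import pandas as pd

# Toy data in the same shape as the question's frame.
temp = """date,Fund ID,nav
2023-01-05,FUND_A,100.0
2023-01-06,FUND_A,110.0
2023-01-07,FUND_A,121.0
2023-01-05,FUND_B,200.0
2023-01-06,FUND_B,190.0
"""
test = pd.read_csv(io.StringIO(temp))

# Chain (1 + daily return) within each fund, then subtract 1.
test['daily_return'] = test.groupby('Fund ID')['nav'].pct_change()
test['cum_return'] = (test['daily_return'].add(1)
                      .groupby(test['Fund ID']).cumprod().sub(1))

# Independent check: cumulative return is just nav relative to the fund's
# first nav (the first row per fund is NaN from pct_change, hence fillna).
expected = test.groupby('Fund ID')['nav'].transform(lambda s: s / s.iloc[0] - 1)
assert np.allclose(test['cum_return'].fillna(0), expected)
```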