Get the last row of pct_change from pandas groupby apply?


I am using financial data from yfinance.

df = yf.download("AAPL MSFT", period="60D", interval="5m")

The data looks like this:

                                 AAPL        MSFT
Datetime                                         
2021-09-17 09:30:00-04:00  147.880005  303.049103
2021-09-17 09:35:00-04:00  147.669998  302.510010
2021-09-17 09:40:00-04:00  147.470001  303.000000
2021-09-17 09:45:00-04:00  147.320007  302.739990
2021-09-17 09:50:00-04:00  147.179993  301.869995
...
2021-12-10 15:35:00-05:00  178.179993  341.500000
2021-12-10 15:40:00-05:00  178.339996  341.437012
2021-12-10 15:45:00-05:00  178.460007  341.049988
2021-12-10 15:50:00-05:00  179.059998  342.220001
2021-12-10 15:55:00-05:00  179.500000  342.230011

I want to calculate the daily (business-day) percentage change.

df.groupby(by=pd.Grouper(freq="B")).apply(lambda x: x.pct_change(periods=len(x)-1).iloc[-1,:])

It works fine without selecting the last row with iloc. When I use iloc to get the last row of the pct_change result (the only one relevant to me), I get this error:

IndexError: single positional indexer is out-of-bounds

even though the object is a DataFrame that contains rows.


Solution

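  • Try dropna instead of iloc. With periods=len(x)-1, every row of a group except the last is NaN, so dropping the NaN rows keeps only the last row of each group and avoids positional indexing into groups that may be empty.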

    pct_change = lambda x: x.pct_change(periods=len(x)-1)
    out = df.groupby(by=pd.Grouper(freq="B")).apply(pct_change).dropna()
    print(out)
    
    # Output:
                                   AAPL      MSFT
    Datetime                                     
    2021-09-17 15:55:00-04:00 -0.012510 -0.010655
    2021-09-20 15:55:00-04:00 -0.009423 -0.007445
    2021-09-21 15:55:00-04:00  0.000320  0.000577
    ...
    2021-12-08 15:55:00-05:00  0.021300  0.003535
    2021-12-09 15:55:00-05:00 -0.010876 -0.007657
    2021-12-10 15:55:00-05:00  0.019597  0.013745
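
The IndexError in the question is probably caused by empty groups: pd.Grouper(freq="B") creates a bin for every business day in the range, including market holidays that have no rows, and .iloc[-1] fails on an empty frame. Below is a minimal sketch of an alternative that avoids apply altogether. It assumes synthetic prices (the index and values are made up, not yfinance output, and the index is timezone-naive) and that no column has missing values within a day. It computes last / first - 1 per day, which is the same quantity as the last row of pct_change(periods=len(x)-1).

```python
import numpy as np
import pandas as pd

# Synthetic 5-minute bars for two sessions; 2021-11-25 (a business day, but a
# US market holiday) is deliberately absent. Values are made up for illustration.
sessions = [
    pd.date_range(f"{day} 09:30", f"{day} 15:55", freq="5min")
    for day in ("2021-11-24", "2021-11-26")
]
idx = sessions[0].append(sessions[1])
df = pd.DataFrame(
    {
        "AAPL": np.linspace(100.0, 110.0, len(idx)),
        "MSFT": np.linspace(300.0, 290.0, len(idx)),
    },
    index=idx,
)

grouped = df.groupby(pd.Grouper(freq="B"))

# The holiday still gets a zero-row group, which is what breaks .iloc[-1].
sizes = grouped.size()

# first()/last() yield NaN for empty groups, so no positional indexing is needed.
# Note: they pick the first/last non-null value per column.
daily_change = (grouped.last() / grouped.first() - 1).dropna(how="all")

print(sizes)
print(daily_change)
```

Unlike the dropna approach above, this result is indexed by the business-day label rather than by the timestamp of the last bar of each day.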