Search code examples
pythonpandasdatefinance

Sub setting the Last Business Day of Month for data frame in Python


I have a below Data Frame and would like to subset the dataframe which has the last business day of the month to separate data frame.

df = Date Open High Low Close Adj Close Volume 0 2007-06-18 0.33979 0.33979 0.33979 0.33979 0.33979 1591888 1 2007-06-29 0.33074 0.33074 0.33074 0.33074 0.33074 88440 2 2007-06-20 0.33526 0.33526 0.33526 0.33526 0.33526 3538 3 2007-06-21 0.32113 0.32113 0.32113 0.32113 0.32113 3550 4 2007-06-22 0.34713 0.34713 0.34713 0.34713 0.34713 670

df_output =

Date Open High Low Close Adj Close Volume 1 2007-06-29 0.33074 0.33074 0.33074 0.33074 0.33074 88440

I have tried df = pd.to_datetime(df['Date'], format='%Y%m%d') + pd.offsets.BMonthEnd(1) but is not working and I have looked to here but did not find the right approach. Can anyone help me in solving this.


Solution

  • You could do it like this:

    # in case Date is not in datetime format:
    df['Date'] = pd.to_datetime(df['Date'])
    
    df_output = df.loc[df.Date.isin(df.Date + pd.offsets.BMonthEnd(1))]
    

    This returns:

    >>> df_output
            Date     Open     High      Low    Close  Adj Close  Volume
    1 2007-06-29  0.33074  0.33074  0.33074  0.33074    0.33074   88440
    

    Explanation:

    df.Date + pd.offsets.BMonthEnd(1) returns a series of the last business day in each month for which you have data:

    >>> df.Date + pd.offsets.BMonthEnd(1)
    0   2007-06-29
    1   2007-07-31
    2   2007-06-29
    3   2007-06-29
    4   2007-06-29
    

    Then, you use loc to locate any rows in your dataframe in which the actual Date is in that series (using .isin)