I have a below Data Frame and would like to subset the dataframe which has the last business day of the month to separate data frame.
df =
Date Open High Low Close Adj Close Volume
0 2007-06-18 0.33979 0.33979 0.33979 0.33979 0.33979 1591888
1 2007-06-29 0.33074 0.33074 0.33074 0.33074 0.33074 88440
2 2007-06-20 0.33526 0.33526 0.33526 0.33526 0.33526 3538
3 2007-06-21 0.32113 0.32113 0.32113 0.32113 0.32113 3550
4 2007-06-22 0.34713 0.34713 0.34713 0.34713 0.34713 670
df_output =
Date Open High Low Close Adj Close Volume
1 2007-06-29 0.33074 0.33074 0.33074 0.33074 0.33074 88440
I have tried df = pd.to_datetime(df['Date'], format='%Y%m%d') + pd.offsets.BMonthEnd(1)
but is not working and I have looked to here but did not find the right approach. Can anyone help me in solving this.
You could do it like this:
# in case Date is not in datetime format:
df['Date'] = pd.to_datetime(df['Date'])
df_output = df.loc[df.Date.isin(df.Date + pd.offsets.BMonthEnd(1))]
This returns:
>>> df_output
Date Open High Low Close Adj Close Volume
1 2007-06-29 0.33074 0.33074 0.33074 0.33074 0.33074 88440
Explanation:
df.Date + pd.offsets.BMonthEnd(1)
returns a series of the last business day in each month for which you have data:
>>> df.Date + pd.offsets.BMonthEnd(1)
0 2007-06-29
1 2007-07-31
2 2007-06-29
3 2007-06-29
4 2007-06-29
Then, you use loc
to locate any rows in your dataframe in which the actual Date
is in that series (using .isin
)