I would like to generate a date range based on any two dates. For example:
import pandas as pd
start_date = "2021-01-15"
end_date = "2021-04-15"
pd.date_range(start_date, end_date, freq="M")
This results in DatetimeIndex(['2021-01-31', '2021-02-28', '2021-03-31'], dtype='datetime64[ns]', freq='M').
I would like to change this so that the result covers every month, including the months of start_date and end_date. In other words, I would like to get something like DatetimeIndex(['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30'], dtype='datetime64[ns]', freq='M') (note '2021-04-30', since April is present in end_date).
I know there are other frequency options (e.g. here). I can try freq="MS" (month start), which will include the last month (April), but then the first one (January) will be missing.
I understand that pd.date_range always cares about days, and that makes sense. But days are not important in my case: I only care about months, so it does not matter whether the output looks like 2021-01-01, 2021-01-31, or even 2021-01-11. Is there a simple solution that would work?
Use pd.period_range if you are working only with months:
per = pd.period_range(start_date, end_date, freq="M")
print(per)
PeriodIndex(['2021-01', '2021-02', '2021-03', '2021-04'], dtype='period[M]', freq='M')
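If only the months matter, the PeriodIndex can often be used as is, without converting to datetimes. The following is a minimal sketch under that assumption; the variable names labels and month_numbers are only illustrative and do not come from the original post.

```python
import pandas as pd

start_date = "2021-01-15"
end_date = "2021-04-15"

# One monthly period per calendar month touched by the two dates.
per = pd.period_range(start_date, end_date, freq="M")

# Year-month labels as strings (illustrative name).
labels = per.strftime("%Y-%m")

# Month numbers within the year (illustrative name).
month_numbers = per.month

print(list(labels))
print(list(month_numbers))
```

Both results keep one entry per month, so the day of the month never enters the picture.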
To convert the periods to datetimes, you can use PeriodIndex.to_timestamp together with DatetimeIndex.normalize:
print(per.to_timestamp(how='end').normalize())
DatetimeIndex(['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30'],
dtype='datetime64[ns]', freq='M')
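The approach can be wrapped in a small helper. This is only a sketch: the function name months_between and its how parameter are hypothetical, chosen for illustration, and it assumes both inputs are date strings (or anything else pd.period_range accepts).

```python
import pandas as pd


def months_between(start_date, end_date, how="end"):
    """Return one timestamp per month from start_date to end_date, inclusive.

    how="end" gives the last day of each month, how="start" the first day.
    (Hypothetical helper, not part of pandas.)
    """
    per = pd.period_range(start_date, end_date, freq="M")
    # normalize() resets the time to midnight; with how="end" the raw
    # timestamp would otherwise be the last nanosecond of the month.
    return per.to_timestamp(how=how).normalize()


month_ends = months_between("2021-01-15", "2021-04-15")
month_starts = months_between("2021-01-15", "2021-04-15", how="start")

print(month_ends)
print(month_starts)
```

Since the day does not matter in the question, how="start" works equally well and yields the first day of each month instead.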