I am trying to create a function that returns only the rows for a particular month and year:
df
order_date Type
2015-01-01 A
2017-09-01 A
2016-12-19 C
2019-11-23 D
2018-10-29 B
2017-12-31 B
2015-11-30 A
2015-08-30 B
2015-09-24 D
2015-01-27 E
Defining the function:
def return_data_month_year(month, year):
    month = pd.to_datetime(month).month()
    year = pd.to_datetime(year).year()
    df = df[((df['order_date']).dt.strftime('%B') == month)&((df['order_date']).dt.strftime('%Y') == year)]
    return df
Calling the function:
return_data_month_year('Jan','2015')
Expected output:
order_date Type
2015-01-01 A
2015-01-27 E
I am getting this error (output):
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00
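You don't have to call month = pd.to_datetime(month).month() and year = pd.to_datetime(year).year(). The arguments are already the strings you want to compare against, and .month and .year are attributes, not methods. Also, '%B' returns the full month name, e.g. January. To get only the abbreviation (Jan, Feb, ...), use '%b':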
def return_data_month_year(df, month, year):
    return df[(df['order_date'].dt.strftime('%b') == month) & (df['order_date'].dt.strftime('%Y') == year)]

# convert column 'order_date' to datetime:
df['order_date'] = pd.to_datetime(df['order_date'])
print(return_data_month_year(df, 'Jan', '2015'))
Prints:
order_date Type
0 2015-01-01 A
9 2015-01-27 E
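As a possible alternative to comparing formatted strings, here is a minimal sketch that filters on the numeric .dt.month and .dt.year accessors instead. Month names produced by strftime depend on the locale, so integer comparison may be more robust. The helper name rows_for_month_year and its integer arguments are assumptions for illustration, not part of the original question or answer; the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "order_date": ["2015-01-01", "2017-09-01", "2016-12-19", "2019-11-23", "2018-10-29",
                   "2017-12-31", "2015-11-30", "2015-08-30", "2015-09-24", "2015-01-27"],
    "Type": ["A", "A", "C", "D", "B", "B", "A", "B", "D", "E"],
})
# The .dt accessor only works on datetime columns, so convert first.
df["order_date"] = pd.to_datetime(df["order_date"])


def rows_for_month_year(frame, month, year):
    """Hypothetical helper: keep rows whose order_date is in the given month (1-12) and year."""
    dates = frame["order_date"]
    return frame[(dates.dt.month == month) & (dates.dt.year == year)]


result = rows_for_month_year(df, 1, 2015)
print(result)
```

With this variant the caller passes integers, e.g. rows_for_month_year(df, 1, 2015), rather than 'Jan' and '2015'.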