python, time-series

Python datetime with months and year only


I'm trying to convert the 'DATE' variable in my dataset to a %Y.%m format while keeping its datatype as 'datetime'.

So far, I've tried this:

testdata['DATE'] = pd.to_datetime(testdata['DATE'], format='%Y-%m')

But it returns values in the '%Y-%m-%d' format. Can someone explain why?

(The reason I'm trying to do this is that I'm working with a time series algorithm for machine learning, and I THINK I need my dates to be datetimes, but I only have month and year data.)


Solution

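  • The result is shown as '%Y-%m-%d' because a pandas datetime value always represents a complete point in time. The format argument only tells pd.to_datetime how to read the input strings; it does not control how the result is stored or displayed. pandas fills in the missing parts with the smallest possible values: '01' for the day and '00:00:00' for the time.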

    However, if you don't need the day and time information in your analysis, you can format the parsed column as year and month with the strftime method. Note that strftime returns strings, so the new column is text rather than a datetime:

    testdata['YEAR_MONTH'] = testdata['DATE'].dt.strftime('%Y.%m')
    

    This creates a new column called 'YEAR_MONTH' holding text in the '%Y.%m' format, with only the year and month. The original 'DATE' column keeps its datetime dtype, so it can still be used wherever a datetime is required.
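
As a rough illustration of the points above, the sketch below parses month-only strings, shows that the day is filled in as 1, and compares the strftime result with a monthly period column. The sample values are made up, since the question does not show the real data. The use of to_period('M') is an alternative not mentioned in the original answer; whether a given time series library accepts a period dtype is something to check for that library.

```python
import pandas as pd

# Hypothetical sample data; the real 'DATE' column is not shown in the question.
testdata = pd.DataFrame({"DATE": ["2021-01", "2021-02", "2021-03"]})

# Parsing gives full timestamps: the missing day defaults to 1, the time to midnight.
parsed = pd.to_datetime(testdata["DATE"], format="%Y-%m")
print(parsed.dt.day.tolist())   # [1, 1, 1]

# strftime produces the desired text, but the values are plain strings.
year_month_str = parsed.dt.strftime("%Y.%m")
print(year_month_str.tolist())  # ['2021.01', '2021.02', '2021.03']

# to_period("M") keeps a date-aware dtype at month resolution.
year_month_period = parsed.dt.to_period("M")
print(year_month_period.dtype)  # period[M]
```

If the algorithm needs real timestamps, keeping the parsed column as it is (with every date on the first of the month) is usually enough, and the string column can be used for labels only.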