pandas, series, python-datetime

Formatting error on converting Pandas series to datetime format


I am trying to convert this series of dates to datetime format, but I keep getting an error saying that the format does not match:

ValueError: time data '12-Feb-10' does not match format '%d-%b-%Y' (match)

import pandas as pd

holiday_list_0 = pd.Series(['12-Feb-10', '11-Feb-11', '10-Feb-12', '8-Feb-13', '10-Sep-10', '9-Sep-11', '7-Sep-12',
                            '6-Sep-13', '26-Nov-10', '25-Nov-11', '23-Nov-12', '29-Nov-13', '31-Dec-10', '30-Dec-11',
                            '28-Dec-12', '27-Dec-13'])

pd.to_datetime(holiday_list_0, format='%d-%b-%Y')

I can't work out why it doesn't match.


Solution

  • Use %y to match a two-digit year (YY); %Y expects a four-digit year such as 2010:

    out = pd.to_datetime(holiday_list_0, format='%d-%b-%y')
    

    This also works, letting pandas infer the format:

    out = pd.to_datetime(holiday_list_0)
    

    Specifying the format is a bit faster on a large Series:

    # 16 values * 10000 = 160k rows
    holiday_list_0 = pd.Series(['12-Feb-10', '11-Feb-11', '10-Feb-12', '8-Feb-13', '10-Sep-10', '9-Sep-11', '7-Sep-12',
                                '6-Sep-13', '26-Nov-10', '25-Nov-11', '23-Nov-12', '29-Nov-13', '31-Dec-10', '30-Dec-11',
                                '28-Dec-12', '27-Dec-13'] * 10000)
    
    
    In [37]: %timeit pd.to_datetime(holiday_list_0)
    28.2 ms ± 2.15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    In [38]: %timeit pd.to_datetime(holiday_list_0, format='%d-%b-%y')
    21.1 ms ± 552 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
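
The difference between %y and %Y can be checked on a single value with the standard library, since the format argument of pd.to_datetime takes the same strftime-style directives. This is only a minimal sketch: the helper name try_parse is made up for illustration, and it assumes an English (or C) locale so that %b recognises "Feb".

```python
from datetime import datetime

sample = "12-Feb-10"  # first value from the question's series


def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


# %Y requires a four-digit year, so the two-digit "10" is rejected.
with_four_digit_year = try_parse(sample, "%d-%b-%Y")

# %y reads a two-digit year; Python maps 00-68 to 2000-2068.
with_two_digit_year = try_parse(sample, "%d-%b-%y")

print(with_four_digit_year)  # None
print(with_two_digit_year)   # 2010-02-12 00:00:00
```

The first call fails for the same reason as the ValueError in the question; switching the last directive to %y is the only change needed.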