python pandas date date-formatting

How to convert two different date formats in a pandas DataFrame column into the same format?


I have two different date formats in a pandas column, DD-MM-YYYY and MM/DD/YYYY, and I want to convert them into the same format.

I tried the following code:

data['SALE DATE'] = pd.to_datetime(data['SALE DATE']).dt.strftime('%m/%d/%Y')

but the resulting data['SALE DATE'] column still ends up with some dates as DD/MM/YYYY and others as MM/DD/YYYY.


I want a Python solution to this problem. Any leads would be very helpful.


Solution

  • The most intuitive solution is to write a custom conversion function, something like:

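    import pandas as pd

    def myDateConv(tt):
        # The character at index 2 is the separator, which identifies the format.
        # Slicing (rather than indexing) avoids an IndexError on empty or short values.
        sep = tt[2:3]
        if sep == '-':
            return pd.to_datetime(tt, format='%d-%m-%Y')  # DD-MM-YYYY
        elif sep == '/':
            return pd.to_datetime(tt, format='%m/%d/%Y')  # MM/DD/YYYY
        else:
            return tt  # unrecognized values are returned unchanged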
    

    and then pass it as a converter for the column in question:

    df = pd.read_csv('Input.csv', converters={'Date': myDateConv})
    

    I prepared a CSV file for testing. Read with read_csv without any custom converter, it gave the original content, with both columns of object type:

             Date Input format
    0  03-05-2020   DD-MM-YYYY
    1  05/07/2020   MM/DD/YYYY
    

    But reading the same file with the above converter gave:

            Date Input format
    0 2020-05-03   DD-MM-YYYY
    1 2020-05-07   MM/DD/YYYY
    

    with the Date column of datetime64[ns] type and both dates in May, just as intended.

    Or, if you already have this DataFrame from another source and want to convert the column, run:

    df.Date = df.Date.apply(myDateConv)
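
As a possible alternative to the row-by-row converter above, the column can be parsed once per format and the two results combined. This is only a sketch, assuming the column contains nothing but these two formats. The sample DataFrame and the Date_str column name are made up for illustration. The final step produces the single MM/DD/YYYY string format that the question asks for.

```python
import pandas as pd

# Hypothetical sample data mirroring the two formats from the question.
df = pd.DataFrame({'Date': ['03-05-2020', '05/07/2020', '25-12-2019']})

# Parse the whole column once per format; rows that do not match become NaT.
dashed = pd.to_datetime(df['Date'], format='%d-%m-%Y', errors='coerce')
slashed = pd.to_datetime(df['Date'], format='%m/%d/%Y', errors='coerce')

# Take the DD-MM-YYYY parse where it succeeded, otherwise the MM/DD/YYYY parse.
df['Date'] = dashed.fillna(slashed)

# Optional: render every date as an MM/DD/YYYY string (assumed target format).
df['Date_str'] = df['Date'].dt.strftime('%m/%d/%Y')

print(df)
```

Values that match neither format stay NaT in this sketch, unlike myDateConv, which returns them unchanged.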