Search code examples
pythonpandaspython-datetime

Working with more than one datetime format in python


Below is the sample data

Datetime

11/19/2020 9:48:50 AM
12/17/2020 2:41:02 PM
2020-02-11 14:44:58
2020-28-12 10:41:02
2020-05-12 06:31:39

11/19/2020 is in mm/dd/yyyy whereas 2020-28-12 is yyyy-dd-mm.

After applying pd.to_datetime below is the output that I am getting.

Date
2020-11-19 09:48:50
2020-12-17 22:41:02
2020-02-11 14:44:58
2020-28-12 10:41:02
2020-05-12 06:31:39

If the input data is coming with slash (/) i.e 11/19/2020 then format is mm/dd/yyyy in input itself and when data is coming with dash (-) i.e 2020-02-11 then the format is yyyy-dd-mm. But after applying pd.to_datetime datetime is getting interchanged.

The first two output is correct. The bottom three needs to be corrected as

2020-11-02 14:44:58
2020-12-28 10:41:02
2020-12-05 06:31:39

Please suggest to have common format i.e yyyy-mm-dd format.


Solution

  • Use to_datetime with specify both formats and errors='coerce' for missing values if no match and then replace them by another Series in Series.fillna:

    d1 = pd.to_datetime(df['datetime'], format='%Y-%d-%m %H:%M:%S', errors='coerce')
    d2 = pd.to_datetime(df['datetime'], format='%m/%d/%Y %I:%M:%S %p', errors='coerce')
    
    df['datetime'] = d1.fillna(d2)
    print (df)
                 datetime
    0 2020-11-19 09:48:50
    1 2020-12-17 14:41:02
    2 2020-11-02 14:44:58
    3 2020-12-28 10:41:02
    4 2020-12-05 06:31:39