Search code examples
pythoncsvdatedatetime-parsing

Dealing with different date formats in python


So I have a issue around dates that are coming from a excel sheet which I'm transforming into a CSV and then loading into a data frame. Basically the data I'm dealing with each day can come in two different formats. These two date columns are called Appointment Date and Attended Date

I'm dealing with (DD/MM/YYYY HH:MM) and (YYYY/MM/DD HH:MM) and its coming from a third party so I cant set the date format structure. What i need to do is parse the data and remove the HH:MM and output the data only has DD/MM/YYYY.

My current code is currently the following:

df['Appointment Date'] = df['Appointment Date'].str.replace(' ', '/', regex=True)
df['Attended Date'] = df['Attended Date'].str.replace(' ', '/', regex=True)
df['Appointment Date'] = pd.to_datetime(df['Appointment Date'], format="%d/%m/%Y/%H:%M").dt.strftime("%d/%m/%Y")
df['Attended Date'] = pd.to_datetime(df['Attended Date'], format="%d/%m/%Y/%H:%M").dt.strftime("%d/%m/%Y")

But I'm not able to parse the data when it comes through as YYYY/MM/DD HH:MM Exception error:

time data '2021-10-08/00:00:00' does not match format '%d/%m/%Y/%H:%M' (match)

Any ideas on how i can get around this?


Solution

  • Try it one way, and if it doesn't work, try it the other way.

    try:
        df['Appointment Date'] = pd.to_datetime(df['Appointment Date'], format="%d/%m/%Y/%H:%M:%S").dt.strftime("%d/%m/%Y")
    except WhateverDateParseException:
        df['Appointment Date'] = pd.to_datetime(df['Appointment Date'], format="%Y/%m/%d/%H:%M:%S").dt.strftime("%d/%m/%Y")
    

    Of course, instead of WhateverDateParseException use the actual exception that is raised in your code.

    Edit: fixed missing "%S"