python pandas dataframe datetime strftime

How can I convert a Pandas dataframe column to datetime when the format for fractional seconds is inconsistent?


I have a Pandas dataframe which looks like:

import pandas as pd
df = pd.DataFrame({'datetime':[
    'Jan 20, 2000 12:00:00', 
    'Jan 20, 2000 12:00:00.5', 
    'Jan 20, 2000 12:00:01'
]})
df
                  datetime
0    Jan 20, 2000 12:00:00
1  Jan 20, 2000 12:00:00.5
2    Jan 20, 2000 12:00:01

I want to convert the column to datetime format. I tried:

pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S')

...but the .5 fractional second in row 1 causes the conversion to fail. I next tried:

pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S.%f')

...but this causes rows 0 and 2 to fail due to absence of a fractional second.

How can I convert a Pandas dataframe column to datetime format when some entries contain fractional seconds and some do not?


Solution

  • You can use format='mixed' (available in pandas 2.0 and later) to infer the format of each element individually.

    df['datetime'] = pd.to_datetime(df['datetime'], format='mixed')
    

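    Note: 'mixed' can be risky because the format is inferred separately for each element, so ambiguous strings may be parsed in ways you do not expect. A safer approach may be to normalize your dataset's datetime strings to a single format before converting, for example by appending '.0' to every entry that lacks a fractional second: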

    df.loc[~df['datetime'].str.match(r'.*\.\d+$'), 'datetime'] += '.0'
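
Putting the normalization approach together, a minimal end-to-end sketch might look like the following. It assumes every string follows the 'Jan 20, 2000 12:00:00' layout from the question, with or without a trailing fractional second; the has_fraction name is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'datetime': [
    'Jan 20, 2000 12:00:00',
    'Jan 20, 2000 12:00:00.5',
    'Jan 20, 2000 12:00:01',
]})

# Flag entries that already end in a fractional second (a dot followed by digits).
has_fraction = df['datetime'].str.contains(r'\.\d+$')

# Append '.0' to the remaining entries so every string has the same shape.
df.loc[~has_fraction, 'datetime'] += '.0'

# All strings now match one explicit format, so no per-element inference is needed.
df['datetime'] = pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S.%f')
```

Because the format is explicit, a string that does not fit the expected layout should raise an error rather than being parsed in some unintended way, which is the main advantage over format='mixed'.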