python pandas datetime utc iso8601

Get UTC 'T' and 'Z' to show in DataFrame Column


I'm importing a DataFrame with dates in one datetime format and feeding it into an API service that requires dates in this UTC format (notice the T and Z):

2023-07-26T11:04:23.893Z

Note that this will be converted to JSON, so the final result can be a string. It would be a much cleaner solution, though, if native pandas time handling could do it.

On individual dates, not in a DataFrame, I've done it in this manner:

from datetime import datetime
from dateutil.relativedelta import relativedelta

due_date_end = datetime.now() + relativedelta(months=+3)
due_date_end = due_date_end.isoformat('T') + 'Z'

When I try using the .isoformat() method on a df column I get an exception.

I've also tried the following:

Parsing dates when reading the file:

df = pd.read_csv('my_test_file.csv',parse_dates=['job_due_date'])

Converting using related answers I've seen on SO:

df['due_date'] = pd.to_datetime(df['job_due_date']).dt.tz_localize('UTC')

And another variant based off of SO answers:

end_user_df['due_date'] = pd.to_datetime(end_user_df['job_due_date']).dt.tz_localize('UTC') 
end_user_df['due_date'] = end_user_df['due_date'].to_string().strftime("%Y-%m-%dT%H:%M:%S%Z")

What should I try next?


Solution

  • In principle, convert datetimes represented as strings to the datetime data type before working with the data. To generate string output in a specific format, use the strftime method of the Series' dt accessor. Note that in the strftime format you have to write a literal 'Z' to indicate UTC; the %Z directive emits the time zone name (here 'UTC') instead.

    Complementing an answer to a similar question, here's an example of how to obtain different precisions of fractional seconds in the ISO 8601 output.

    import pandas as pd
    
    df = pd.DataFrame({"date": ["2023-07-26T11:04:23.893001Z"]})
    
    # for working with the data, convert to datetime data type
    df["date_dt"] = pd.to_datetime(df["date"])
    
    print(df.dtypes)
    # date                    object
    # date_dt    datetime64[ns, UTC]
    # dtype: object
    
    # microsecond precision fractional seconds: just use %f
    print(df["date_dt"].dt.strftime("%Y-%m-%dT%H:%M:%S.%fZ"))
    # 0    2023-07-26T11:04:23.893001Z
    # Name: date_dt, dtype: object
    
    # millisecond precision fractional seconds: truncate output and add 'Z'
    print(df["date_dt"].dt.strftime("%Y-%m-%dT%H:%M:%S.%f").str[:-3] + "Z")
    # 0    2023-07-26T11:04:23.893Z
    # Name: date_dt, dtype: object
    
    # no fractional seconds at all: no %f
    print(df["date_dt"].dt.strftime("%Y-%m-%dT%H:%M:%SZ"))
    # 0    2023-07-26T11:04:23Z
    # Name: date_dt, dtype: object
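
Applied to the setup in the question, the same approach might look like the sketch below. It assumes the job_due_date values are naive timestamps that already represent UTC, which the question does not confirm; the sample rows are made up for illustration and stand in for the CSV file.

```python
import json

import pandas as pd

# Hypothetical stand-in for the CSV column from the question: naive
# timestamps that are ASSUMED to already be UTC wall-clock time.
end_user_df = pd.DataFrame(
    {"job_due_date": ["2023-07-26 11:04:23.893", "2023-10-26 08:00:00.000"]}
)

# Parse to datetime, then label the naive values as UTC.
# tz_localize only attaches the zone; it does not shift the clock time.
due = pd.to_datetime(end_user_df["job_due_date"]).dt.tz_localize("UTC")

# %f yields six digits, so drop the last three for millisecond precision,
# then append a literal "Z" (%Z would emit the zone name "UTC" instead).
end_user_df["due_date"] = due.dt.strftime("%Y-%m-%dT%H:%M:%S.%f").str[:-3] + "Z"

# The column now holds plain strings, so it serializes to JSON directly.
payload = json.dumps(end_user_df["due_date"].tolist())
print(payload)
# ["2023-07-26T11:04:23.893Z", "2023-10-26T08:00:00.000Z"]
```

The last attempt in the question fails because Series.to_string() returns one Python str for the whole column, and str has no strftime method. Going through the dt accessor formats each element instead. If the source times are local rather than UTC, they would need tz_localize with the local zone followed by tz_convert("UTC") before formatting.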