Search code examples
pythonpandasdatetimetime-seriestimezone

How to convert multiple time zone column to UTC in Python


I have a dataset where the date_time column contains a mixture of BST and GMT date and times, in the following format 'Sun 27 Mar 2022 12:59:03 AM GMT'. I would like to convert this whole column into the following format '2022-03-27 00:59:03', and into one standard time zone, such as UTC. Where am I going wrong??

I tried using strptime() to convert the a tester date, but that only works with GMT times.

date_time_str = 'Sun 27 Mar 2022 12:59:03 AM GMT'
date_time_obj = datetime.datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
print('Date-time:', date_time_obj)

Date-time: 2022-03-27 00:59:03

Then tried with BST

date_time_str_bst = 'Sun 27 Mar 2022 02:00:02 AM BST'
date_time_obj_bst = datetime.datetime.strptime(date_time_str_bst, '%a %d %b %Y %I:%M:%S %p %Z')

ValueError: time data 'Sun 27 Mar 2022 02:00:02 AM BST' does not match format '%a %d %b %Y %I:%M:%S %p %Z'

enter image description here


Solution

  • Here's an option how you can handle this. As commented, abbreviated tz names such as "BST" are ambiguous. You will have to define to which tz the abbreviations map; e.g. "Europe/London" for BST/GMT.

    define timezone mapping {abbreviation --> IANA tz name}, apply parser, then convert to UTC

    given

    df
                         Date and time  temp
    0  Sun 27 Mar 2022 12:57:03 AM GMT  33.9
    1  Sun 27 Mar 2022 12:58:02 AM GMT  33.6
    2  Sun 27 Mar 2022 12:59:03 AM GMT  33.6
    3  Sun 27 Mar 2022 02:00:02 AM BST  33.9
    4  Sun 27 Mar 2022 02:01:03 AM BST  33.6
    5  Sun 27 Mar 2022 02:02:02 AM BST  33.6
    

    running

    import pandas as pd
    import dateutil
    
    tzmapping = {"GMT": dateutil.tz.gettz("Europe/London"),
                 "BST": dateutil.tz.gettz("Europe/London")}
    
    df["dt_UTC"] = df["Date and time"].apply(dateutil.parser.parse, tzinfos=tzmapping).dt.tz_convert("UTC")
    

    gives

    df
                         Date and time  temp                    dt_UTC
    0  Sun 27 Mar 2022 12:57:03 AM GMT  33.9 2022-03-27 00:57:03+00:00
    1  Sun 27 Mar 2022 12:58:02 AM GMT  33.6 2022-03-27 00:58:02+00:00
    2  Sun 27 Mar 2022 12:59:03 AM GMT  33.6 2022-03-27 00:59:03+00:00
    3  Sun 27 Mar 2022 02:00:02 AM BST  33.9 2022-03-27 01:00:02+00:00
    4  Sun 27 Mar 2022 02:01:03 AM BST  33.6 2022-03-27 01:01:03+00:00
    5  Sun 27 Mar 2022 02:02:02 AM BST  33.6 2022-03-27 01:02:02+00:00