Search code examples
sqlgoogle-bigquery

How to separate a string and convert it to date in Google Big Query


I have a column full of timestamps titled "DateTime" that are actually a string data type. While most of the entries are in this first format: 01012023050000AM (meaning January 1st 2023 at 5 AM EST), some of the entries are in this second format: 2023-01-01 05:00:00+00 (also January 1st 2023 at 5 AM EST).

I am trying to convert this string data into date data so that I can find the difference in values from another column between the first and last day of each month throughout 2023. I tried using PARSE_TIMESTAMP("%c", DateTime) but got an error for some of the dates such as Failed to parse input string "09042023050000AM

The timestamp is taken at 5 AM each day so the hours-minutes-second portion is not needed, and most of the timestamps under the second format date further back then needed. I am wondering how do I get rid of the "050000AM" and how do I exclude any dates of the second format "2023-01-01 05:00:00+00" before converting to a date?

I apologize I am brand new to SQL and Google Big Query so any help is appreciated!


Solution

  • If there's only two formats, you could use coalesce() + safe.parse_timestamp(), as in the example below:

    select
      DateTime,
      coalesce(
        safe.parse_timestamp('%Y-%m-%d %H:%M:%S%z', DateTime),
        safe.parse_timestamp('%m%d%Y%H%M%S%p', DateTime)
      ) as my_new_datetime
    

    which will return:

    | DateTime               | my_new_datetime         |
    |------------------------|-------------------------|
    | 01012023050000AM       | 2023-01-01 05:00:00 UTC |
    | 2023-01-01 05:00:00+00 | 2023-01-01 05:00:00 UTC |
    

    Using this combination, you will make sure that:

    • if a new format is added, it will return null instead of an error
    • when a new format is added, if at all, you will just need to add a new line in the coalesce()