Search code examples
sqlgoogle-bigquery

BigQuery - Sorting a Datetime string


I have a column of datatype String. Eg: 2025-01-20T23:38:31.8223598Z

If I apply a ORDER BY on this column inside a window function as below:

ROW_NUMBER() OVER (PARTITION BY id ORDER BY modifiedOn DESC) AS rank

Will the sorting be actually based on the DateTime or as a String. Do I need to explicitly convert this to a DateTime/Timestamp before ordering.

Any insights on this please.


Solution

  • If your column has all the values in the format like your example (YYYY-MM-DDTHH:MM:SS...), it will be chronological sorted right. It's because each part of the datetime is already ordered by itself, a string with year 2020 come first than a string with year 2022. So is based on the largest to smallest unit even if it's string.

    I do recommend you to do it explicity and convert, because of data quality, consistency, optimization, etc. You could use PARSE_TIMESTAMP:

    ROW_NUMBER() OVER (
      PARTITION BY id 
      ORDER BY PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%E*SZ', modifiedOn) DESC
    ) AS rank
    

    Let me know if you need more details or have any doubt!

    Resources: https://www.ionos.com/digitalguide/websites/web-development/iso-8601/, https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#parse_timestamp