I have a column of datatype String. Eg: 2025-01-20T23:38:31.8223598Z
If I apply a ORDER BY
on this column inside a window function as below:
ROW_NUMBER() OVER (PARTITION BY id ORDER BY modifiedOn DESC) AS rank
Will the sorting be actually based on the DateTime or as a String. Do I need to explicitly convert this to a DateTime/Timestamp before ordering.
Any insights on this please.
If your column has all the values in the format like your example (YYYY-MM-DDTHH:MM:SS...), it will be chronological sorted right. It's because each part of the datetime is already ordered by itself, a string with year 2020 come first than a string with year 2022. So is based on the largest to smallest unit even if it's string.
I do recommend you to do it explicity and convert, because of data quality, consistency, optimization, etc. You could use PARSE_TIMESTAMP:
ROW_NUMBER() OVER (
PARTITION BY id
ORDER BY PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%E*SZ', modifiedOn) DESC
) AS rank
Let me know if you need more details or have any doubt!
Resources: https://www.ionos.com/digitalguide/websites/web-development/iso-8601/, https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#parse_timestamp