Search code examples
google-bigqueryranking-functions

Rank by DATETIME in BigQuery


I have the following table schema:

Field name      Type        Mode        Policy tags     Description
order_id        STRING      NULLABLE        
customer_id     INTEGER     NULLABLE        
order_datetime  TIMESTAMP   NULLABLE                    %Y-%m-%d %H:%M:%E*S
item_id         STRING      NULLABLE        
order_quantity  INTEGER     NULLABLE    

It is possible for customers to place multiple orders on a single date. I am trying to write a query that ranks order_id for each customer for each date they placed an order:

SELECT customer_id,
order_datetime as order_date,
order_id,
RANK() OVER(PARTITION BY customer_id, CAST(order_datetime AS DATETIME) ORDER BY 2 DESC) as rank
FROM `SQL_sets.orders`

which returns the following:

customer_id     order_date      order_id    rank
21456       2019-01-12 9:28:35      A-005   1
21456       2019-01-12 12:28:35     A-005   1
21456       2019-01-12 19:28:35     A-005   1
31874       2020-11-15 0:00:00      A-009   1
32483       2020-11-21 0:00:00      A-001   1
32483       2020-11-22 0:00:00      A-001   1
42491       2019-01-16 2:52:07      A-006   1
42491       2019-01-20 2:52:07      A-0101  1
42491       2019-12-01 9:52:07      A-007   1
42491       2019-12-01 15:12:07     A-008   1
42491       2020-11-19 0:00:00      A-006   1
55400       2019-01-11 12:52:07     A-0088  1

But I expect/wanted to get:

customer_id     order_date      order_id    rank
21456       2019-01-12 9:28:35      A-005   1
21456       2019-01-12 12:28:35     A-005   2
21456       2019-01-12 19:28:35     A-005   3
31874       2020-11-15 0:00:00      A-009   1
32483       2020-11-21 0:00:00      A-001   1
32483       2020-11-22 0:00:00      A-001   1
42491       2019-01-16 2:52:07      A-006   1
42491       2019-01-20 2:52:07      A-0101  1
42491       2019-12-01 9:52:07      A-007   1
42491       2019-12-01 15:12:07     A-008   2
42491       2020-11-19 0:00:00      A-006   1
55400       2019-01-11 12:52:07     A-0088  1

What am I doing wrong?


Solution

  • You did not say but I am assuming that you are using SQL server. This may be more as to what you want. In your example your partitions were by customer and datetime, so unless the customer ordered two things at exactly the same time the ranking would always be 1. If you are only interested in orders that happen on the same date, you need to cast it to that type...

    SELECT customer_id,
    order_datetime as order_date,
    order_id,
    RANK() OVER(PARTITION BY customer_id ORDER BY CAST(order_datetime AS DATETIME) DESC) rank
    FROM `SQL_sets.orders`