I've joined 2 tables and extract the top 2 vendors by country (code at the bottom of this post and link) with the help of Tim Biegeleisen on this link , now I would like to go 1 step further and group the top 2 vendors by year by country.
The original ORDERS table looks like this :
country_name | date_local | vendor_id | gmv_local | is_successful_order |
---|---|---|---|---|
Taiwan | 2012-10-02 | 2870 | 559.6 | true |
Taiwan | 2012-10-02 | 3812 | 573.5 | true |
Singapore | 2012-10-02 | 941 | 778.6 | true |
Singapore | 2014-10-02 | 13 | 120.6 | true |
Bangkok | 2014-10-02 | 227 | 563.6 | true |
This table is merged with the vendor table
id | vendor_name | country_name |
---|---|---|
2870 | C House | Taiwan |
941 | A House | Singapore |
227 | 9 House | Bangkok |
I would like to extract the year from the "date_local" column into timestamp format, where the items in the Year column will look like "2012-01-01T00:00:00" from the original date format of "2012-10-02"
Then I would like to list out the top 2 vendors of by year of each country in total revenue
The resulting table should look like this:
year | country_name | vendor_name | total_gmv |
---|---|---|---|
2012-01-01T00:00:00 | Singapore | A House | 1119.76 |
2012-01-01T00:00:00 | Singapore | B House | 819.63 |
2012-01-01T00:00:00 | Taiwan | C House | 119.6 |
2012-01-01T00:00:00 | Taiwan | D House | 119.6 |
2012-01-01T00:00:00 | Bangkok | 9 House | 119.6 |
2014-01-01T00:00:00 | Singapore | A House | 2119.76 |
2014-01-01T00:00:00 | Singapore | B House | 1819.63 |
2014-01-01T00:00:00 | Taiwan | C House | 1019.6 |
2014-01-01T00:00:00 | Taiwan | D House | 919.6 |
2014-01-01T00:00:00 | Bangkok | 9 House | 189.6 |
I know I need to add one more partition of the Extract( year FROM date_local) as year
and set it to the correct GETDATE()
timestamp format to add the extra layer needed, but I don't know how to add it to the code
WITH cte AS (
SELECT Ord.country_name, vn.vendor_name, ROUND(SUM(Ord.gmv_local), 2) AS total_gmv,
ROW_NUMBER() OVER (PARTITION BY Ord.country_name
ORDER BY SUM(Ord.gmv_local) DESC) rn
FROM `Orders` AS Ord
LEFT JOIN `Vendors` AS vn
ON Ord.vendor_id = vn.id
GROUP BY Ord.country_name, vn.vendor_name
)
SELECT country_name, vendor_name, total_gmv
FROM cte
WHERE rn <= 1
ORDER BY country_name, total_gmv DESC;
If I understand correctly, you only need to also aggregate by year, and then add year to the partition of the call to ROW_NUMBER
:
SELECT
Ord.country_name,
vn.vendor_name,
EXTRACT(year FROM date_local) AS year,
ROUND(SUM(Ord.gmv_local), 2) AS total_gmv
FROM ORDERS AS Ord
LEFT JOIN `primeval-falcon-306603.foodpanda_BI_Exercise.Vendors` AS vn
ON Ord.vendor_id = vn.id
GROUP BY
Ord.country_name,
vn.vendor_name,
EXTRACT(year FROM date_local)
QUALIFY ROW_NUMBER() OVER (PARTITION BY country_name, EXTRACT(year FROM date_local)
ORDER BY total_gmv DESC) <= 2
ORDER BY
Ord.country_name DESC,
total_gmv DESC;