(Using BigQuery SQL) I am trying to find the 10 regions that had the highest prices, and what date the price occurred on. The table has columns for Date, Region, AveragePrice, and other information not relevant to my query. The dataset has an entry for every Region and every day of the year.
My query is:
SELECT
DISTINCT Region,
max(AveragePrice) as highest_price,
date
FROM avocado_data.avocado_prices as avocado_prices
GROUP BY
avocado_prices.Region, date
ORDER BY highest_price DESC
LIMIT 10
The results I get are:
I would like to see 10 distinct Regions but SF and Raleigh are listed multiple times.
I have tried nesting this query inside the FROM statement of another query to just pull from this table, but I still get the same result.
I can also get the distinct Regions I want if I remove all references to Date in the query, but then I obviously can't see what date the highest price in each Region occurred:
SELECT
DISTINCT Region,
max(AveragePrice) as highest_price,
FROM avocado_data.avocado_prices as avocado_prices
GROUP BY
avocado_prices.Region
ORDER BY highest_price DESC
LIMIT 10
The table above is exactly what I'm looking for, along with the date of that price.
You can use Navigation function FIRST_VALUE for that
Region | AveragePrice | date |
---|---|---|
SanFrancisco | 3.25 | 2022-05-12 |
SanFrancisco | 4.25 | 2022-05-13 |
SanFrancisco | 2.25 | 2022-05-14 |
Tampa | 4.75 | 2022-05-12 |
Tampa | 3.75 | 2022-05-13 |
Tampa | 2.25 | 2022-05-14 |
SELECT
DISTINCT
Region,
MAX(AveragePrice) OVER (PARTITION BY Region) AS highest_price,
FIRST_VALUE(date) OVER (PARTITION BY Region ORDER BY AveragePrice DESC) AS occured_on
FROM table1
Region | highest_price | occured_on |
---|---|---|
SanFrancisco | 4.25 | 2022-05-13 |
Tampa | 4.75 | 2022-05-12 |