Search code examples
sqlselectgoogle-bigquerymaxwhere-clause

(SQL) How do you select a max float value along with other datatypes values within a query?


I'm working with the Iowa Liquor Sales dataset which in this case is called "bigquery-public-data.iowa_liquor_sales.sales". Relevant columns and their datatypes are date(DATE), sale_dollars(FLOAT), item_description(STRING), store_name(STRING).

I am trying to write a query that will return the top sale for each year, of the past three years (2021,2020,2019) along with the date, item_description, and store_name.

The below code works, but only covers one year. I know I could copy+paste and change the date every time but that seems tedious. Is there a better way?

SELECT date, sale_dollars, item_description, store_name
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2021-01-01' and '2021-12-31'
ORDER BY sale_dollars DESC
LIMIT 1
date sale_dollars item_description store_name
2021-04-19 250932.0 Titos Handmade Vodka Hy-Vee #3

When trying different ways to write it so the max sale of 2019,2020, and 2021 return along with their date, item_description, and store_name, I ran into errors. The below is the closest I got (missing date, item_description, and store_name).

SELECT

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2021-01-01' and '2021-12-31') as sale_2021,

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2020-01-01' and '2020-12-31') as sale_2020,

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2019-01-01' and '2019-12-31') as sale_2019

How can I write a query that returns the max sale of the past three years along with it's date, item, and store name?


Solution

  • Consider below query

    SELECT EXTRACT(YEAR FROM date) year, 
           ARRAY_AGG(
             STRUCT(date, sale_dollars, item_description, store_name) 
             ORDER BY sale_dollars DESC LIMIT 1
           )[OFFSET(0)].*
      FROM `bigquery-public-data.iowa_liquor_sales.sales`
     WHERE date BETWEEN '2019-01-01' AND '2021-12-31'
     GROUP BY 1;
    
    Query results
    +------+------------+--------------+----------------------+-------------------------------+
    | year |    date    | sale_dollars |   item_description   |          store_name           |
    +------+------------+--------------+----------------------+-------------------------------+
    | 2020 | 2020-10-08 |     250932.0 | Titos Handmade Vodka | Hy-Vee #3 / BDI / Des Moines  |
    | 2019 | 2019-10-08 |      78435.0 | Makers Mark          | Hy-Vee Food Store / Urbandale |
    | 2021 | 2021-07-05 |     250932.0 | Titos Handmade Vodka | Hy-Vee #3 / BDI / Des Moines  |
    +------+------------+--------------+----------------------+-------------------------------+
    

    or, you can get same result with a window function

     SELECT date, sale_dollars, item_description, store_name
       FROM `bigquery-public-data.iowa_liquor_sales.sales`
      WHERE date BETWEEN '2019-01-01' AND '2021-12-31'
    QUALIFY ROW_NUMBER() OVER (
              PARTITION BY EXTRACT(YEAR FROM date) ORDER BY sale_dollars DESC
            ) = 1;