I am migrating some PostgreSQL queries to Snowflake and I am struggling to replicate the following query:
WITH test_table(col1, col2, col3, col4) AS (
VALUES
(1, 1, 7, 5),
(1, 1, 6, 4),
(1, 2, 1, 4),
(1, 2, 2, 5),
(1, 2, 3, 6)
)
SELECT
col1,
col2,
(max(ARRAY[col3, col4]))[2]
FROM test_table
GROUP BY 1, 2
This query manages to get the value of col4
that corresponds to the maximum value of col3 grouped by col1, col2
. However, I do not find any elegant solution to replicate this on Snowflake.
You can aggregate an array by any order. So you can group col3
ordered by col4
:
with test_table(col1, col2, col3, col4)
as (
select * from (values
(1, 1, 20, 1), -- added
(1, 1, 7, 5),
(1, 1, 6, 4),
(1, 2, 1, 4),
(1, 2, 2, 5),
(1, 2, 3, 6))
)
SELECT
col1,
col2,
array_agg(col4) within group(order by col3 desc) c4_by_max_c3
FROM test_table
GROUP BY 1, 2
Instead of getting this array, you can just pick the top number with [0]
:
SELECT
col1,
col2,
array_agg(col4) within group(order by col3 desc)[0] c4_by_max_c3
FROM test_table
GROUP BY 1, 2
Hopefully Snowflake will also support max_by
in the future, but it's not yet implemented. Meanwhile this will work.
https://docs.snowflake.com/en/sql-reference/functions/min_by.html