Search code examples
hivehiveql

Obtain corresponding value to max value of another column


I need to find the corresponding value to the max value of another column.

My data is as below:

group subgroup subgroup_2 value_a value_b date
A 101 1 200 101 20220301
A 102 1 105 90 20220301
A 103 2 90 202 20220301
A 211 2 75 107 20220301
B 212 1 91 65 20220301
B 213 1 175 101 20220301

I would need to format the data like this:

group subgroup_2 max_value_a value_b date
A 1 200 101 20220301
A 2 90 202 20220301
B 1 175 101 20220301

I can achieve the format fairly easily via a group by, however I have to aggregate value_b to do this which doesn't give me the result I need.

I know I can use rank() over partition by but it doesn't seem to provide the format I require.

This is the query I used below, however it only provides the max of one subgroup_2 rather than the max of each:

select group, subgroup_2, max_value_a, value_b, date
from
(
select a.group, a.subgroup_2, a.max_value_a, a.value_b, a.date,
       rank() over(partition by a.group, subgroup_2, a.date order by a.max_value_a desc) as rnk
  from table_1 a
)s
where rnk=1

Solution

  • You want to use ROW_NUMBER here:

    SELECT group, subgroup_2, value_a AS max_value_a, value_b, date
    FROM
    (
        SELECT group, subgroup_2, value_a, value_b, date,
               ROW_NUMBER() OVER (PARTITION BY group, subgroup_2 ORDER BY value_a DESC) rn
        FROM table_1
    ) t
    WHERE rn = 1;