Search code examples
sqlsql-servert-sqlgreatest-n-per-grouprow-number

How to select the minimum value within a group that maximizes another value?


I am working with a table similar to the following:

week Time Person Direction Distance
1 12:00 A 27 4
1 12:00 A 27 6
1 12:00 A 27 8
1 12:00 B 20 2
1 12:00 B 20 5
1 12:00 B 20 7
1 12:00 C 17 3
1 12:00 C 17 4
1 12:00 C 17 6
1 1:00 A 3 9
1 1:00 A 3 7
1 1:00 A 3 5
1 1:00 B 6 3
1 1:00 B 6 4
1 1:00 B 6 8
1 1:00 C 12 10
1 1:00 C 12 9
1 1:00 C 12 14

As you can see, within each value of Time there are multiple observations for each Person. Every person has the same value of Direction in all their observations at a given time, but different values of Distance. For each value of Time I want to select only observation with the smallest Distance for the person who maximizes Direction.

I also want to create columns for the person with second highest Direction value and their minimum Distance.

So for each value of time I will only return one observation. Here's what I want to return:

week Time max_direction_person max_person_min_distance second_max_direction_person second_max_person_min_distance
1 12:00 A 4 B 2
1 1:00 C 9 B 3

How is this possible? I have tried group by subqueries, but haven't had any success.


Solution

  • You can use window functions for this. DENSE_RANK will identify the ordering of each set of Person rows, and then ROW_NUMBER can number within those sets.

    Then you filter to the rows which have a dense-rank of 1 or 2, and a row-number of 1, and simply pivot it using MIN or MAX to get a grouped result per Time.

    WITH Maxed AS (
        SELECT *,
          dr = DENSE_RANK() OVER (PARTITION BY Week, Time ORDER BY Direction DESC),
          rn = ROW_NUMBER() OVER (PARTITION BY Week, Time, Direction, Person ORDER BY Distance ASC)
        FROM Observation o
    )
    SELECT
      Week,
      Time,
      max_direction_person           = MIN(CASE WHEN dr = 1 THEN Person END),
      max_person_min_distance        = MIN(CASE WHEN dr = 1 AND rn = 1 THEN Distance END),
      second_max_direction_person    = MIN(CASE WHEN dr = 2 THEN Person END),
      second_max_person_min_distance = MIN(CASE WHEN dr = 2 AND rn = 1 THEN Distance END)
    FROM Maxed
    WHERE dr >= 1 AND dr <= 2
      AND rn = 1
    GROUP BY
      Week,
      Time;
    

    db<>fiddle

    Note that the addition of Direction in the ROW_NUMBER partitioning is technically unnecessary, however it will be more efficient due to the existing sort for the DENSE_RANK calculation.