I am working with a table similar to the following:
week | Time | Person | Direction | Distance |
---|---|---|---|---|
1 | 12:00 | A | 27 | 4 |
1 | 12:00 | A | 27 | 6 |
1 | 12:00 | A | 27 | 8 |
1 | 12:00 | B | 20 | 2 |
1 | 12:00 | B | 20 | 5 |
1 | 12:00 | B | 20 | 7 |
1 | 12:00 | C | 17 | 3 |
1 | 12:00 | C | 17 | 4 |
1 | 12:00 | C | 17 | 6 |
1 | 1:00 | A | 3 | 9 |
1 | 1:00 | A | 3 | 7 |
1 | 1:00 | A | 3 | 5 |
1 | 1:00 | B | 6 | 3 |
1 | 1:00 | B | 6 | 4 |
1 | 1:00 | B | 6 | 8 |
1 | 1:00 | C | 12 | 10 |
1 | 1:00 | C | 12 | 9 |
1 | 1:00 | C | 12 | 14 |
As you can see, within each value of Time
there are multiple observations for each Person
. Every person has the same value of Direction
in all their observations at a given time, but different values of Distance
. For each value of Time
I want to select only observation with the smallest Distance
for the person who maximizes Direction
.
I also want to create columns for the person with second highest Direction
value and their minimum Distance.
So for each value of time I will only return one observation. Here's what I want to return:
week | Time | max_direction_person | max_person_min_distance | second_max_direction_person | second_max_person_min_distance |
---|---|---|---|---|---|
1 | 12:00 | A | 4 | B | 2 |
1 | 1:00 | C | 9 | B | 3 |
How is this possible? I have tried group by subqueries, but haven't had any success.
You can use window functions for this. DENSE_RANK
will identify the ordering of each set of Person
rows, and then ROW_NUMBER
can number within those sets.
Then you filter to the rows which have a dense-rank of 1 or 2, and a row-number of 1, and simply pivot it using MIN
or MAX
to get a grouped result per Time
.
WITH Maxed AS (
SELECT *,
dr = DENSE_RANK() OVER (PARTITION BY Week, Time ORDER BY Direction DESC),
rn = ROW_NUMBER() OVER (PARTITION BY Week, Time, Direction, Person ORDER BY Distance ASC)
FROM Observation o
)
SELECT
Week,
Time,
max_direction_person = MIN(CASE WHEN dr = 1 THEN Person END),
max_person_min_distance = MIN(CASE WHEN dr = 1 AND rn = 1 THEN Distance END),
second_max_direction_person = MIN(CASE WHEN dr = 2 THEN Person END),
second_max_person_min_distance = MIN(CASE WHEN dr = 2 AND rn = 1 THEN Distance END)
FROM Maxed
WHERE dr >= 1 AND dr <= 2
AND rn = 1
GROUP BY
Week,
Time;
Note that the addition of Direction
in the ROW_NUMBER
partitioning is technically unnecessary, however it will be more efficient due to the existing sort for the DENSE_RANK
calculation.