Search code examples
sqlpostgresqlwindow-functions

Postgresql - optimize the sql to pick top 3 subjects by performance for every student


I have a table containing students and their marks across subjects(multiple evaluations for same subject are there). I want to write a sql to output the result as per below requirement.

  1. Top 3 subjects should be picked for every student and pick 2 rows for every subject.
  2. Pick 3 subjects with highest marks, 2nd row has to be from the same subjects.
  3. There can be same marks in different subjects and same subject as well. Pick any if marks are same.
  4. Subject for a student may not be together and might be scattered across table, i have shown them together for the ease of visualization.

Table:

student_id  |   subject     | marks
------------|---------------|--------------
1           | sub-1         | 10
1           | sub-1         | 50
1           | sub-1         | 25
1           | sub-1         | 50

1           | sub-10        | 2
1           | sub-10        | 85
1           | sub-10        | 40

1           | sub-3         | 10
1           | sub-3         | 5
1           | sub-3         | 55
1           | sub-3         | 65
1           | sub-3         | 70

1           | sub-4         | 90
1           | sub-4         | 50
1           | sub-4         | 25

1           | sub-6         | 20
1           | sub-6         | 70
1           | sub-6         | 35
...

Required result:

student_id  |   subject     | marks
------------|---------------|--------------
1           | sub-4         | 90
1           | sub-4         | 50
1           | sub-10        | 85
1           | sub-10        | 40
1           | sub-6         | 70
1           | sub-6         | 35

I am able to solve it by using the sql mentioned below:

with cte as
(
select * from (
select 
dense_rank() over(partition by s.id order by s.marks desc) dense_rank_number,
row_number() over (partition by s.id, s.subject order by marks desc)  row_num,
 s.*
from
(
    select d.id, d.subject, count(*) 
    from student d
    group by d.id, d.subject
    having count(*) >= 2
) t join student s on t.id = s.id and t.subject = s.subject 
order by 1, 2
) t5
where t5.row_num <= 2
),
cte1 as 
(select e.dense_rank_number, e.row_num, 
e.id, min(e.subject) as subject, e.marks from cte e
where e.row_num = 1 and e.dense_rank_number <= 3
group by e.id, e.row_num, e.marks, e.dense_rank_number

),
cte2 as 
( 
    select cte.* 
    from cte, cte1 
    where 
    cte.id = cte1.id 
    and cte.subject = cte1.subject 
    and cte.row_num != cte1.row_num 
)
select * from cte1
union
select * from cte2
;

Is there a better way to write this sql ? Demo can be found here: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=4d9192b995884d5742977d13e4bbe68d


Solution

  • If I understand correctly, you want the subjects with the three highest marks per student. And then you want the two highest marks in that subject. If so, I would suggest:

    select s.*
    from (select s.*,
                 dense_rank() over (partition by id order by max_marks desc, subject) as seqnum_s
          from (select s.*,
                       row_number() over (partition by id, subject order by marks desc) as seqnum,
                       max(marks) over (partition by id, subject) as max_marks
                from student s
               ) s
          where seqnum <= 2
         ) s
    where seqnum_s <= 3
    order by s.id, max_marks, subject, marks desc;