Search code examples
sqlpostgresqlaverageaggregate-functionsgreatest-n-per-group

Find Nth value and average value corresponding to each group in SQL


I have a table with name and grades for some users

CREATE TABLE grades (name varchar(100), grade integer);
insert into grades
values
('Bob', 12),
('Bob', 13),
('Bob', 23),
('Bob', 17),
('James', 15),
('James', 27),
('Nick ', 18),
('Nick ', 16),
('Nick ', 22),
('Nick ', 32),
('Nick ', 19);

I want an output table grouped by the name, along with the average grade and the nth lowest value for each name.

I tried to use window function nth_value() but I get an error when I try to execute the query (n = 2 here)

select name, avg(grade), nth_value(grade, 2) over(partition by name 
                                      order by grade
                                      Range BETWEEN
                                      UNBOUNDED PRECEDING AND
                                      UNBOUNDED FOLLOWING)
                           
from grades group by name;

Error(s), warning(s):

42803: column "grades.grade" must appear in the GROUP BY clause or be used in an aggregate function

What is the correct way of writing such a query?


Solution

  • Starting from your current attempt, a simple option uses a window average and distinct:

    select distinct
        name, 
        avg(grade) over(partition by name) avg_grade, 
        nth_value(grade, 2) over(
            partition by name 
            order by grade
            range between unbounded preceding and unbounded following
        ) grade_n2
    from grades;
    

    Alternatively, you can rank grades in a subquery, and use conditional aggregation in the outer query:

    select name, avg(grade) avg_grade, max(grade) filter(where rn = 2) grade_n2
    from (
        select g.*, row_number() over(partition by name order by grade) rn
        from grades g
    ) g
    group by name;