Search code examples
sqlpostgresqlcountsql-order-bygreatest-n-per-group

Column must appear in group by or aggregate function in nested query


I have the following table.

Fights (fight_year, fight_round, winner, fid, city, league)

I am trying to query the following:

For each year that appears in the Fights table, find the city that held the most fights. For example, if in year 1992, Jersey held more fights than any other city did, you should print out (1992, Jersey)

Here's what I have so far but I keep getting the following error. I am not sure how I should construct my group by functions.

ERROR: column, 'ans.fight_round' must appear in the GROUP BY clause or be used in an aggregate function. Line 3 from (select *

select fight_year, city, max(*)
from (select *
    from (select *
        from fights as ans
        group by (fight_year)) as l2
     group by (ans.city)) as l1;

Solution

  • In Postgres, I would recommend aggregation and distinct on:

    select distinct on (flight_year) flight_year, city, count(*) cnt
    from flights
    group by flight_year, city
    order by flight_year, count(*) desc
    

    This counts how many fights each city had each year, and retains the city with most fight per year.

    If you want to allow ties, then use window functions:

    select flight_year, city, cnt
    from (
        select flight_year, city, count(*) cnt,
            rank() over(partition by flight_year order by count(*) desc) rn
        from flights
        group by flight_year, city
    ) f
    where rn = 1