Search code examples
sqloracle-databasecountsql-order-byinner-join

Selecting rows with the most repeated values at specific column


Problem in general words: I need to select value from one table referenced to the most repeated values in another table.

Tables have this structure: screenshot screenshot2

The question is to find country which has the most results from sportsmen related to it.

First, INNER JOIN tables to have relation between result and country

SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id);

Then, I count how much time each country appear

SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
    INNER JOIN sportsman USING (sportsman_id))
GROUP BY country
;

And got this screenshot3

Now it feels like I'm one step away from solution )) I guess it's possible with one more SELECT FROM (SELECT ...) and MAX() but I can't wrap it up?

ps: I did it with doubling the query like this but I feel like it's so inefficient if there are millions of rows.

SELECT country 
    FROM (SELECT country, COUNT(country) AS highest_participation
        FROM (SELECT competition_id, country FROM result 
            INNER JOIN sportsman USING (sportsman_id) 
            ) GROUP BY country 
        ) 
WHERE highest_participation = (SELECT MAX(highest_participation)  
    FROM (SELECT country, COUNT(country) AS highest_participation
        FROM (SELECT competition_id, country FROM result 
            INNER JOIN sportsman USING (sportsman_id) 
            ) GROUP BY country 
        ))

Also I did it with a view

CREATE VIEW temp AS 
    SELECT country as country_with_most_participations, COUNT(country) as country_participate_in_#_comp 
    FROM( 
        SELECT country, competition_id FROM result 
        INNER JOIN sportsman USING(sportsman_id)
        ) 
    GROUP BY country;
SELECT country_with_most_participations FROM temp 
WHERE country_participate_in_#_comp = (SELECT MAX(country_participate_in_#_comp) FROM temp);

But not sure if it's easiest way.


Solution

  • If I understand this correctly you want to rank the countries per competition count and show the highest ranking country (or countries) with their count. I suggest you use RANK for the ranking.

    select country, competition_count
    from
    (
      select 
        s.country,
        count(*) as competition_count,
        rank() over (order by count(*) desc) as rn
      from sportsman s
      inner join result r using (sportsman_id)
      group by s.country
    ) ranked_by_count
    where rn = 1
    order by country;
    

    If the order of the result rows doesn't matter, you can shorten this to:

    select s.country, count(*) as competition_count      
    from sportsman s
    inner join result r using (sportsman_id)
    group by s.country
    order by count(*) desc
    fetch first rows with ties;