Search code examples
sqlpostgresqlsubquerycorrelated-subquery

Subqueries using min/max


Some countries have populations more than three times that of any of their neighbours (in the same continent). Give the countries and continents.

enter image description here

From https://sqlzoo.net/wiki/SELECT_within_SELECT_Tutorial, question 10

I tried this:

SELECT w.name, w.continent 
FROM world w
WHERE w.population/3 > (SELECT min(w2.population) 
                      FROM world w2
                      WHERE w2.continent = w.continent);

This solution seems to work:

SELECT w.name, w.continent 
FROM world w
WHERE w.population > (SELECT 3 * MAX(w2.population) 
                      FROM world w2
                      WHERE w2.continent = w.continent AND
                            w2.name <> w.name
                     );

I'm trying to understand the logical flaw in my solution.

Why is max() and not min() used since we would want to compare if the population is 3 times the least populated country in the continent?

Do we need w2.name <> w.name in min/max scenarios. Why?


Solution

  • more than three times that of any of their neighbours

    Meaning, the second highest population on the same continent must be less than a third, and only one country per continent can qualify. (The minimum population on the continent is not relevant.)

    The second query takes all countries on the same continent except the one in the outer query (w2.name <> w.name), finds the most populous and multiplies by 3. If that's still lower than the population of the country in the outer query, it passes the condition.

    Aside, while being elegant SQL, the query is expensive because the correlated subquery has to be evaluated once for every row. This should be cheaper:

    SELECT *
    FROM  (
       SELECT DISTINCT ON (continent)
              name, continent, population
            , lead(population) OVER (PARTITION BY continent ORDER BY population DESC) AS next_population
       FROM   world
       ORDER  BY continent, population DESC
       ) sub
    WHERE  population > 3 * next_population;
    

    db<>fiddle here