Search code examples
mysqlsqlgreatest-n-per-group

SQLZOO- using GROUPBY to find the largest country in a continent; is this possible?


I'm working on a practice problem from SQLZOO, and am not sure why the solution I'm trying doesn't work as it makes sense to me.

This is the format of the table::

 -------------------------------------------------------------
|     name      continent    area    population       gdp     |
|-------------------------------------------------------------|
| Afghanistan     Asia      652230    25500100    20343000000 |
| .                                                           |
| .                                                           |
| .                                                           |
|                                                             |
 -------------------------------------------------------------

The question is the following:

Find the largest country (by area) in each continent, show the continent, the name and the area.

Here is the way I was thinking to solve it:

SELECT continent, name, area 
  FROM world
 WHERE name IN (SELECT continent, name, MAX(area) 
                  FROM world 
                 GROUP BY continent);

I know this doesn't work, but why not? It seems like the nested SELECT statement is finding the country with the MAX area per continent, is it not?

The actual solution for this is something like follows:

SELECT continent, name, area 
  FROM world x
 WHERE area >= ALL
    (SELECT area 
       FROM world y
      WHERE y.continent=x.continent
        AND area>0)

But this seems like a complicated way of coming up with it;; is this way makes the most sense? Any ideas are appreciated

Thank you in advance!!


Solution

  • While at a quick glimpse this query seems works

    SELECT continent, name, area 
      FROM world
     WHERE area IN (SELECT MAX(area) 
                      FROM world 
                     GROUP BY continent);
    

    Demo 1

    considering the current data, some issues would raise while some other new records added such as in the demo below. Rather than the above prefer this one :

    SELECT w1.continent, name, w1.area 
      FROM world AS w1
      JOIN (SELECT continent, MAX(area) AS area
              FROM world 
             GROUP BY continent) AS w2
        ON w1.continent = w2.continent
       AND w1.area = w2.area
    

    Demo 2