Search code examples
sqljoingroup-bysubquery

Group By Count from Multiple Tables with conditions


I have 3 different tables, country, city, and customer. Tables are shown below:

country table:

id   country_name
1    UK
2    US
3   Brazil
:    
n   Canada

city table

id  city_name  postal_code  country_id
1    London        30090         1
2    Dallas        20909         2
3    Rio           29090         3
4    Atlanta       30318         2
:
n    Vancouver     32230         n

customer table

id    customer_name     city_id
1        John             1
2        Pete             3
3        Dave             2
4        May              2
5        Chuck            4
6        Sam              3
7        Henry            3

***country.id is references city.country_id, and city.id is references customer.city_id

I want to write a query that can extract the country name, city name and the count of the customer of the associate city. But with one condition, the query will return all cities with more customers than the average number of customers of all cities

It will look something like below, this is the correct output

UK London 2
Brazil Rio 3

but I kept getting this output, which isn't correct

UK London 2
US Dallas 2
US Atlanta 1
Brazil Rio 3

I wrote a SQL query but it doesn't give me the result that I want

SELECT country.country_name, city.city_name, COUNT(customer.city_id) from country 
JOIN city on country.id = city.country_id 
JOIN customer on city.id = customer.city_id
Group by city_name,country.country_name;

I am wondering how can I do this and fix my code?


Solution

  • You need to nest your query into a subquery so that you can take the average value of the counts and compare the current count with it. If you're using an SQL that supports CTEs, you can use one e.g.

    WITH cnts AS (
      SELECT country.country_name, city.city_name, COUNT(customer.city_id) AS cnt 
      FROM country 
      JOIN city on country.id = city.country_id 
      JOIN customer on city.id = customer.city_id
      GROUP BY city_name,country.country_name
    )
    SELECT *
    FROM cnts
    WHERE cnt > (SELECT AVG(cnt) FROM cnts)
    

    Otherwise the query becomes more complex with the main query required as a subquery in the WHERE clause as well:

    SELECT country.country_name, city.city_name, COUNT(customer.city_id) AS cnt 
    FROM country 
    JOIN city on country.id = city.country_id 
    JOIN customer on city.id = customer.city_id
    GROUP BY city_name,country.country_name
    HAVING COUNT(customer.city_id) > (SELECT AVG(cnt) FROM (
      SELECT country.country_name, city.city_name, COUNT(customer.city_id) AS cnt 
      FROM country 
      JOIN city on country.id = city.country_id 
      JOIN customer on city.id = customer.city_id
      GROUP BY city_name,country.country_name
    ) cnts2)
    

    In both cases the output for your sample data is:

    country_name    city_name   cnt
    Brazil          Rio         3
    US              Dallas      2
    

    Demo on dbfiddle