Sum of a column excluding duplicate join records

Let's say I have a table containing cities with their populations and a table containing buildings of different types. I want to know the percentage of people living in cities containing schools.

Cities:

name	pop
A	10
B	100
C	1000

Buildings:

is_school	city
false	A
true	B
true	B
true	C

If I do somthing like:

SELECT
  SUM(CASE WHEN building.is_school = true THEN city.pop ELSE 0 END) school,
  SUM(city.pop) total
FROM city
LEFT JOIN building ON building.city = city.name;

I get city B population summed twice. I would like to have:

school	total
1100	1110

but I get:

school	total
1200	1210

fiddle

I can do a subquery:

SELECT
  SUM(CASE WHEN city.name in (
    SELECT city.name
    FROM city
    LEFT JOIN building ON building.city = city.name
    WHERE building.is_school = true
  ) THEN city.pop ELSE 0 END),
  SUM(city.pop)
FROM city;

but it feels unnecessary convoluted considering what I'm trying to achieve, is there really no other way ?

Solution

You can do:

SELECT
  1.0 * sum(case when exists (
  select 1 from building b where b.city = c.name and is_school) then 1 else 0 end)
  / count(*)
FROM city c;

Result:

?column?
--------
0.66666666666666666667

Or, you can use a lateral join:

select 1.0 * sum(coalesce(n, 0)) / count(*) as p
from city c
left join lateral (
  select 1 from building b where b.city = c.name and b.is_school limit 1
) x (n) on true

See running examples at db<>fiddle.