postgresql postgis geospatial spatial spatial-query

Using the ST_Disjoint() Function gives unexpected result

I am fiddeling around with this dataset http://s3.cleverelephant.ca/postgis-workshop-2020.zip. It is used in this workshop http://postgis.net/workshops/postgis-intro/spatial_relationships.html.

I want to identify all the features, that do not have a subway station. I thought this spatial join is rather straight forward

SELECT
  census.boroname,
  COUNT(census.boroname)
FROM nyc_census_blocks AS census
    JOIN nyc_subway_stations AS subway
        ON ST_Disjoint(census.geom, subway.geom)
GROUP BY census.boroname;

However, the result set is waaaaay to large.

"Brooklyn"  4753693
"Manhattan" 1893156
"Queens"    7244123
"Staten Island" 2473146
"The Bronx" 2683246

When I run a test

SELECT COUNT(id) FROM nyc_census_blocks;

I get 38794 as a result. So there are way less features in nyc_census_blocks than I have in the result-set from the spatial join.

Why is that? Where is the mistake I am making?

Solution

The problem is that with ST_Disjoint you're getting for every record of nyc_census_block the total number of stations that are disjoint with nyc_subway_stations, which means in case of no intersection all records of nyc_subway_stations (491). That's why you're getting such a high count.

Alternatively you can count how many subways and census blocks do intersect, e.g. in a CTE or subquery, and in another query count how many of them return 0:

WITH j AS (
 SELECT 
   gid,census.boroname,
  (SELECT count(*) 
   FROM nyc_subway_stations subway
   WHERE ST_Intersects(subway.geom,census.geom)) AS qt
 FROM nyc_census_blocks AS census
)
SELECT boroname,count(*)
FROM j WHERE qt = 0
GROUP BY boroname;

   boroname    | count 
---------------+-------
 Brooklyn      |  9517
 Manhattan     |  3724
 Queens        | 14667
 Staten Island |  5016
 The Bronx     |  5396
(5 rows)