Search code examples
postgresqlpostgisgeospatialspatialspatial-query

Using the ST_Disjoint() Function gives unexpected result


I am fiddeling around with this dataset http://s3.cleverelephant.ca/postgis-workshop-2020.zip. It is used in this workshop http://postgis.net/workshops/postgis-intro/spatial_relationships.html.

I want to identify all the features, that do not have a subway station. I thought this spatial join is rather straight forward

SELECT
  census.boroname,
  COUNT(census.boroname)
FROM nyc_census_blocks AS census
    JOIN nyc_subway_stations AS subway
        ON ST_Disjoint(census.geom, subway.geom)
GROUP BY census.boroname;

However, the result set is waaaaay to large.

"Brooklyn"  4753693
"Manhattan" 1893156
"Queens"    7244123
"Staten Island" 2473146
"The Bronx" 2683246

When I run a test

SELECT COUNT(id) FROM nyc_census_blocks;

I get 38794 as a result. So there are way less features in nyc_census_blocks than I have in the result-set from the spatial join.

Why is that? Where is the mistake I am making?


Solution

  • The problem is that with ST_Disjoint you're getting for every record of nyc_census_block the total number of stations that are disjoint with nyc_subway_stations, which means in case of no intersection all records of nyc_subway_stations (491). That's why you're getting such a high count.

    Alternatively you can count how many subways and census blocks do intersect, e.g. in a CTE or subquery, and in another query count how many of them return 0:

    WITH j AS (
     SELECT 
       gid,census.boroname,
      (SELECT count(*) 
       FROM nyc_subway_stations subway
       WHERE ST_Intersects(subway.geom,census.geom)) AS qt
     FROM nyc_census_blocks AS census
    )
    SELECT boroname,count(*)
    FROM j WHERE qt = 0
    GROUP BY boroname;
    
       boroname    | count 
    ---------------+-------
     Brooklyn      |  9517
     Manhattan     |  3724
     Queens        | 14667
     Staten Island |  5016
     The Bronx     |  5396
    (5 rows)