Search code examples
postgresqlleft-joinpostgisr-sf

What is the fastest way to join two large postgresql tables on geometry using st_contains in postgresql?


I'm trying to join two relatively large tables on geometry. Geometry A consists of large polygons. Geometry B consists of smaller polygons. I want to left join the rows where geometry A contains geometry B. The geometry B polygons may be on the edge of geometry A but not outside of geometry A. Both geometries need to be on srid = 28992, which they currently are not. I find that the below code is extremely slow even with both geometries indexed. What would be a usefull way to speed up the process?

    SELECT ST_GeomFromText(
             ST_AsText(
               CAST(rek.geometry AS TEXT)
             ), 28992
           ), *
    FROM rek
    left join pvc
    on st_contains(rek.geometry, st_transform(pvc.geometry, 28992))

DDL

for table A

-- Drop table
-- DROP TABLE rek;

CREATE TABLE rek (
    geometry geometry NULL,
    perceel varchar NULL,
    naam varchar NULL
);
CREATE INDEX campagne_klanten_geom_idx_new ON dl_eigenaarsdata.valid_geom_final_kadaster USING gist (geometry);
CREATE INDEX geom_idx_new ON dl_eigenaarsdata.valid_geom_final_kadaster USING gist (geometry);

for table B

-- Drop table
-- DROP TABLE pvc;

CREATE TABLE pvc (
    id varchar(16) NULL,
    geometry geometry(POLYGON, 28992) NULL,
);
CREATE INDEX pand_view_clean_geometry_idx ON dl_deeb.pand_view_clean USING gist (geometry);
CREATE INDEX pand_view_clean_gm_wk_bu_naam_idx ON dl_deeb.pand_view_clean USING btree (gm_naam, wk_naam, bu_naam);
CREATE INDEX pand_view_clean_id_idx ON dl_deeb.pand_view_clean USING btree (id);
CREATE INDEX pand_view_clean_pandtype_idx ON dl_deeb.pand_view_clean USING btree (pand_type);
CREATE INDEX pand_view_clean_postcode_idx ON dl_deeb.pand_view_clean USING btree (postcode);

Solution

  • st_contains will not use the geometry index because the geometry field is not used directly.

    You would need to create an index on the transformed geometry:

    CREATE INDEX transformed_geometry_idx ON pvc USING gist (st_transform(geometry, 28992));
    

    Also the main select is unnecessarily too complicated. It could be replaced by select st_setSrid(rek.geometry, 28992)