Search code examples
sqlpostgresqlpostgispolygonintersect

Update a polygon that have intersections in two or more polygons


I have 2 tables, table #1 contains a square polygon/grid, table #2 contains the polygon of an administrative area. I updated table #1 with data from table #2 with the ST_Intersects() function. Here's an example of my query:

update table_grid 
set data1 = table_administrative.data1 
from table_administrative 
where ST_Intersects(table_grid.geom, table_administrative.geom)

How will the query run? and what about grids that have intersections in two or more administrative regions? Will it be chosen randomly, or the largest region, or based on the centroid?

An example of the case is as shown in the picture

example grid that intersect two areas


Solution

  • What about grids that have intersections in two or more administrative regions? Will it be chosen randomly, or the largest region, or based on the centroid?

    Randomly, due to how update..from works:

    When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.

    Even if you see some repeatability in which ones it ends up using, it's unreliable. That being said, you can correct your query to explicitly select the specific one you want, based on your own criteria, like the proximity of centroid, overlapping area, or just picking the one with lower/earlier id, alphabetically.

    Add a subquery with distinct on, order by..limit 1 or row_number()over(). The <<->> bbox centroid distance operator might be useful: demo

    with cte as (select distinct on (tg.ctid) tg.ctid, data1
                 from table_grid tg 
                 join table_administrative ta
                   on ST_Intersects(tg.geom,ta.geom)
                 order by tg.ctid, tg.geom <<->> st_centroid)
    update table_grid as this
    set data1 = cte.data1
    from cte
    where this.ctid=cte.ctid
    

    You might also be interested in ST_Subdivide(). If you subdivide the administrative areas with it, all these joins (and your update that joins implicitly) will be way faster. It'll also be a bit safer to downgrade ST_Intersects() to plain bounding box intersection && for added performance in case you plan to stick with the "direct" update, not minding the randomly resolved ties:

    update __test.table_grid 
    set data1 = table_administrative.data1 
    from __test.table_administrative
    where table_grid.geom && table_administrative.geom;