performance postgresql postgis database-performance

Querying directly is much, much slower than subquery with join

I have 2 tables. Their structure is roughly as follows; I've changed the names, though.

CREATE TABLE overlay_polygon
(
  overlay_polygon_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id)
  dollar_value NUMERIC,
  geom GEOMETRY(Polygon,26915)
)

CREATE TABLE point
(
  point_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id)
  -- A bunch of other fields that this query won't touch
  geom GEOMETRY(Point,26915)
)

point has a spatial index on its geom column, named spix_point, and an index on its some_other_polygon_id column as well.

There are approximately 500,000 rows in point, and almost all of the rows in point intersect with some row in overlay_polygon. Originally, my overlay_polygon table contained several rows that had very small areas (less than 1 square meter, mostly) and did not spatially intersect any rows from point. After deleting the small rows that do not intersect any rows in point, there are 38 rows.

As the name implies, overlay_polygon is a table of polygons generated as the result of the overlay of polygons from 3 other tables (including some_other_polygon). In particular, I need to make some computations using dollar_value and some columns on point. When I set out to delete the rows that intersected no points to speed up processing in the future, I ended up querying on the COUNT of rows. The most obvious query seemed to be the following.

SELECT op.*, COUNT(point_id) AS num_points
FROM overlay_polygon op
LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
GROUP BY op.overlay_polygon_id
ORDER BY op.overlay_polygon_id
;

Here is its EXPLAIN (ANALYZE, BUFFERS).

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=284962.944..540959.914 rows=38 loops=1)
  Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
  I/O Timings: read=39171.525
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 897016kB
        Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
        I/O Timings: read=39171.525
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=0.110..46755.284 rows=415224 loops=1)
              Buffers: shared hit=58694 read=17119
              I/O Timings: read=39171.525
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.043..153.255 rows=38 loops=1)
                    Buffers: shared hit=1 read=10
                    I/O Timings: read=152.866
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=50.229..1139.868 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58693 read=17109
                    I/O Timings: read=39018.660
Total runtime: 542172.156 ms

However, I found that this query ran much, much more quickly:

SELECT *
FROM overlay_polygon
JOIN (SELECT op.overlay_polygon_id, COUNT(point_id) AS num_points
      FROM overlay_polygon op
      LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
      GROUP BY op.overlay_polygon_id
     ) x ON x.overlay_polygon_id = overlay_polygon.overlay_polygon_id
ORDER BY overlay_polygon.overlay_polygon_id
;

Its EXPLAIN (ANALYZE, BUFFERS) is below.

Sort  (cost=557.78..557.88 rows=38 width=8057) (actual time=18904.661..18904.748 rows=38 loops=1)
  Sort Key: overlay_polygon.overlay_polygon_id
  Sort Method: quicksort  Memory: 126kB
  Buffers: shared hit=58690 read=17134
  I/O Timings: read=9924.328
  ->  Hash Join  (cost=544.88..556.78 rows=38 width=8057) (actual time=18903.697..18904.210 rows=38 loops=1)
        Hash Cond: (overlay_polygon.overlay_polygon_id = op.overlay_polygon_id)
        Buffers: shared hit=58690 read=17134
        I/O Timings: read=9924.328
        ->  Seq Scan on overlay_polygon  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.127..0.411 rows=38 loops=1)
              Buffers: shared hit=2 read=9
              I/O Timings: read=0.173
        ->  Hash  (cost=544.41..544.41 rows=38 width=12) (actual time=18903.500..18903.500 rows=38 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 2kB
              Buffers: shared hit=58688 read=17125
              I/O Timings: read=9924.154
              ->  HashAggregate  (cost=543.65..544.03 rows=38 width=8) (actual time=18903.276..18903.379 rows=38 loops=1)
                    Buffers: shared hit=58688 read=17125
                    I/O Timings: read=9924.154
                    ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8) (actual time=0.052..17169.606 rows=415224 loops=1)
                          Buffers: shared hit=58688 read=17125
                          I/O Timings: read=9924.154
                          ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8038) (actual time=0.004..0.537 rows=38 loops=1)
                                Buffers: shared hit=1 read=10
                                I/O Timings: read=0.279
                          ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=4.422..381.991 rows=10927 loops=38)
                                Index Cond: (op.gopm && gopm)
                                Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                                Rows Removed by Filter: 13353
                                Buffers: shared hit=58687 read=17115
                                I/O Timings: read=9923.875
Total runtime: 18905.293 ms

As you can see, they have comparable cost estimates, although I'm not sure how accurate those cost estimates are. I am dubious about cost estimates that involve PostGIS functions. Both tables have had VACUUM ANALYZE FULL run on them since they were last modified and before running the queries.

Perhaps I am simply unable to read my EXPLAIN ANALYZEs, but I cannot see why these queries have such drastically different run times. Can anyone identify anything? The only possibility I can think of is related to the number of columns involved in the LEFT JOIN.

EDIT 1

Per the suggestion of @ChrisTravers, I increases work_mem and reran the first query. I don't believe this represents a significant improvement.

Executed

SET work_mem='4MB';

(It was 1 MB.)

Then executing the first query gave these results.

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=339910.046..495775.478 rows=38 loops=1)
  Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=325391.923..491329.208 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 896904kB
        Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=14.698..234266.573 rows=415224 loops=1)
              Buffers: shared hit=58552 read=17261
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=14.612..15.384 rows=38 loops=1)
                    Buffers: shared read=11
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=95.262..5451.636 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58552 read=17250
Total runtime: 496936.775 ms

EDIT 2

Well, here's a nice, big smell that I didn't notice before (largely because I have trouble reading the ANALYZE output). Sorry I didn't notice it sooner.

Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)

Estimated rows: 38. Actual rows: over 400K. Ideas, anyone?

Solution

My immediate thought is that this may have to do with work_mem limits. The difference between the plans is that in the first, you join and then aggregate, and in the second you aggregate and the join. This means your aggregation set narrower, which means less memory is used over that operation.

It would be interesting to see what would change if you tried doubling work_mem and trying again.

Edit: Now that we know increasing the work_mem leads to only a modest improvement, the next issue is the sort row estimate. I suspect it is in fact exceeding work_mem here and it is expecting this will be easy because it expects only 38 rows out, but instead gets a lot of rows out. It isn't clear to me where the planner is getting this information because it's pretty clear that the planner estimates (correctly) that 38 rows is the number of rows we expect out of the aggregate. This part is starting to look like a planner bug to me, but I am having a hard time putting my finger on it. It might be worth writing up and bringing up on the pgsql-general email list. It almost looks to me like the planner is getting confused between memory required for the sort and memory required for the aggregate.