Two small postgis queries alone work fine:
SELECT a.name, a.population, a.place
FROM SH_POINT a
WHERE a.place IN ('hamlet', 'city', 'town', 'village');
Returns 3700 rows in only 0.26s.
SELECT b.name,b.population
FROM SH_POLY b
WHERE b.boundary='administrative' AND b.admin_level='8';
Return 1161 rows in only 0.35s.
Now I want to combine these results simply based on the name. With such small datasets it should be no problem but
SELECT a.name, a.population, a.place, b.population
FROM SH_POINT a, SH_POLY b
WHERE a.place in ('hamlet', 'city', 'town', 'village')
AND b.boundary='administrative' AND b.admin_level='8'
AND a.name = b.name;
takes over 16 minutes for only 1273 rows!
Other variants like
WITH
a AS (SELECT name, population, place
FROM SH_POINT
WHERE place IN ('hamlet', 'city', 'town', 'village') ),
b AS (SELECT name, population
FROM SH_POLY
WHERE boundary='administrative' AND admin_level='8' )
SELECT a.name, a.population, a.place, b.population
FROM a, b
WHERE a.name = b.name;
fail, too: Time: 990295,242 ms (16:30,295)
SH_POINT
and SH_POLY
are views to a normal planet_osm_* postgis table to reduce the interesting data to a specific area. (As you can see in the first two single queries this works fine.)
CREATE VIEW SH_POINT AS SELECT * FROM planet_osm_point WHERE
way @ (SELECT ST_Collect(way) FROM planet_osm_polygon
WHERE name='Schleswig-Holstein' AND admin_level='4')
AND ST_Within(way, (SELECT ST_COLLECT(way) FROM planet_osm_polygon WHERE
name='Schleswig-Holstein' AND admin_level='4') );
A query analysis for another variant:
EXPLAIN (ANALYZE, BUFFERS) SELECT a.name, a.population, a.place, b.population from SH_POINT a FULL OUTER JOIN SH_POLY b ON a.name = b.name WHERE a.place IN ('hamlet', 'city', 'town', 'village') AND b.boundary='administrative' AND b.admin_level='8';
Time: 1008286,628 ms (16:48,287)
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-------
Nested Loop (cost=242.04..304.10 rows=1 width=34) (actual time=524.970..1008283.783 rows=1273 loops=1)
Buffers: shared hit=811729 read=193326311
InitPlan 1 (returns $0)
-> Aggregate (cost=57.88..57.89 rows=1 width=32) (actual time=0.289..0.291 rows=1 loops=1)
Buffers: shared hit=3 read=22
-> Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_1 (cost=52.49..56.51 rows=1 width=189) (actual time=0.084..0.089 rows=2 loops=1)
Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
Heap Blocks: exact=2
Buffers: shared hit=3 read=6
-> BitmapAnd (cost=52.49..52.49 rows=1 width=0) (actual time=0.073..0.075 rows=0 loops=1)
Buffers: shared hit=2 read=5
-> Bitmap Index Scan on planet_osm_polygon_name_idx (cost=0.00..5.07 rows=67 width=0) (actual time=0.047..0.047 rows=2 loops=1)
Index Cond: (name = 'Schleswig-Holstein'::text)
Buffers: shared hit=2 read=2
-> Bitmap Index Scan on planet_osm_polygon_adminlevel_idx (cost=0.00..47.17 rows=4098 width=0) (actual time=0.023..0.023 rows=106 lo
ops=1)
Index Cond: (admin_level = '4'::text)
Buffers: shared read=3
InitPlan 2 (returns $1)
-> Aggregate (cost=57.88..57.89 rows=1 width=32) (actual time=0.119..0.121 rows=1 loops=1)
Buffers: shared hit=25
-> Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_2 (cost=52.49..56.51 rows=1 width=189) (actual time=0.030..0.033 rows=2 loops=1)
Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
Heap Blocks: exact=2
Buffers: shared hit=9
-> BitmapAnd (cost=52.49..52.49 rows=1 width=0) (actual time=0.027..0.028 rows=0 loops=1)
Buffers: shared hit=7
-> Bitmap Index Scan on planet_osm_polygon_name_idx (cost=0.00..5.07 rows=67 width=0) (actual time=0.015..0.015 rows=2 loops=1)
Index Cond: (name = 'Schleswig-Holstein'::text)
Buffers: shared hit=4
-> Bitmap Index Scan on planet_osm_polygon_adminlevel_idx (cost=0.00..47.17 rows=4098 width=0) (actual time=0.010..0.011 rows=106 lo
ops=1)
Index Cond: (admin_level = '4'::text)
Buffers: shared hit=3
InitPlan 3 (returns $2)
-> Aggregate (cost=57.88..57.89 rows=1 width=32) (actual time=0.122..0.124 rows=1 loops=1)
Buffers: shared hit=25
-> Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_3 (cost=52.49..56.51 rows=1 width=189) (actual time=0.032..0.035 rows=2 loops=1)
Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
Heap Blocks: exact=2
Buffers: shared hit=9
-> BitmapAnd (cost=52.49..52.49 rows=1 width=0) (actual time=0.029..0.030 rows=0 loops=1)
Buffers: shared hit=7
-> Bitmap Index Scan on planet_osm_polygon_name_idx (cost=0.00..5.07 rows=67 width=0) (actual time=0.017..0.017 rows=2 loops=1)
Index Cond: (name = 'Schleswig-Holstein'::text)
Buffers: shared hit=4
-> Bitmap Index Scan on planet_osm_polygon_adminlevel_idx (cost=0.00..47.17 rows=4098 width=0) (actual time=0.010..0.010 rows=106 loops=1)
Index Cond: (admin_level = '4'::text)
Buffers: shared hit=3
InitPlan 4 (returns $3)
-> Aggregate (cost=57.88..57.89 rows=1 width=32) (actual time=0.192..0.193 rows=1 loops=1)
Buffers: shared hit=25
-> Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_4 (cost=52.49..56.51 rows=1 width=189) (actual time=0.026..0.028 rows=2 loops=1)
Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
Heap Blocks: exact=2
Buffers: shared hit=9
-> BitmapAnd (cost=52.49..52.49 rows=1 width=0) (actual time=0.023..0.024 rows=0 loops=1)
Buffers: shared hit=7
-> Bitmap Index Scan on planet_osm_polygon_name_idx (cost=0.00..5.07 rows=67 width=0) (actual time=0.013..0.013 rows=2 loops=1)
Index Cond: (name = 'Schleswig-Holstein'::text)
Buffers: shared hit=4
-> Bitmap Index Scan on planet_osm_polygon_adminlevel_idx (cost=0.00..47.17 rows=4098 width=0) (actual time=0.009..0.009 rows=106 loops=1)
Index Cond: (admin_level = '4'::text)
Buffers: shared hit=3
-> Index Scan using planet_osm_point_place on planet_osm_point (cost=0.28..33.31 rows=1 width=29) (actual time=3.592..85.382 rows=3699 loops=1)
Index Cond: ((way @ $0) AND (way @ $1))
Filter: ((place = ANY ('{hamlet,city,town,village}'::text[])) AND st_within(way, $1))
Rows Removed by Filter: 4951
Buffers: shared hit=49 read=4234
-> Bitmap Heap Scan on planet_osm_polygon (cost=10.19..39.21 rows=1 width=24) (actual time=272.552..272.555 rows=0 loops=3699)
Recheck Cond: ((way @ $2) AND (name = planet_osm_point.name))
Rows Removed by Index Recheck: 5
Filter: ((boundary = 'administrative'::text) AND (admin_level = '8'::text) AND st_within(way, $3))
Rows Removed by Filter: 0
Heap Blocks: exact=1369 lossy=912
Buffers: shared hit=811680 read=193322077
-> BitmapAnd (cost=10.19..10.19 rows=1 width=0) (actual time=272.290..272.290 rows=0 loops=3699)
Buffers: shared hit=811480 read=193319438
-> Bitmap Index Scan on planet_osm_polygon_way_idx (cost=0.00..4.87 rows=45 width=0) (actual time=271.210..271.210 rows=2789969 loops=3699)
Index Cond: ((way @ $2) AND (way @ $3))
Buffers: shared hit=811430 read=193304692
-> Bitmap Index Scan on planet_osm_polygon_name_idx (cost=0.00..5.07 rows=67 width=0) (actual time=0.018..0.018 rows=2 loops=3699)
Index Cond: (name = planet_osm_point.name)
Buffers: shared hit=50 read=14746
Planning:
Buffers: shared hit=10
Planning Time: 1.283 ms
Execution Time: 1008284.465 ms
The only workaround I have found so far is to store both queries in new tables:
CREATE TABLE ppoint AS SELECT a.name, a.population, a.place
FROM SH_POINT a
WHERE a.place IN ('hamlet', 'city', 'town', 'village') ;
CREATE TABLE ppoly AS SELECT b.name, b.population
FROM SH_POLY b
WHERE b.boundary='administrative' AND b.admin_level='8';
SELECT a.name, a.population, a.place, b.population
FROM ppoint a, ppoly b
WHERE a.name=b.name;
This works in less than 3ms but is not a desired option here.
Thanks for your help!
The planner has rewritten the query embedding the view code in an inefficient way.
Using the CTE, you can force it to first compute (materialize) the selection from the views:
WITH
a AS MATERIALIZED (SELECT name, population, place
FROM SH_POINT
WHERE place IN ('hamlet', 'city', 'town', 'village') ),
b AS MATERIALIZED (SELECT name, population
FROM SH_POLY
WHERE boundary='administrative' AND admin_level='8' )
SELECT a.name, a.population, a.place, b.population
FROM a, b
WHERE a.name = b.name;