I am curious to understand (and perhaps improve) an issue I am having with PostgreSQL 9.6. Names simplified but everything else is taken from a psql
session.
I start with a materialized view, mv
.
First, I create two simple functions:
CREATE FUNCTION count_mv() RETURNS BIGINT AS $$
SELECT COUNT(*) FROM mv;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;
and
CREATE FUNCTION mv_pks() RETURNS TABLE (table_pk INTEGER) AS $$
SELECT table_pk FROM mv;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;
Let's time some queries.
db=>\timing on
I can count results from the materialized view very quickly.
db=> SELECT COUNT(*) FROM mv;
count
---------
2567883
(1 row)
Time: 79.803 ms
Let's see how it's doing this.
db=> EXPLAIN ANALYZE SELECT COUNT(*) FROM mv;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=41331.24..41331.25 rows=1 width=8) (actual time=765.681..765.681 rows=1 loops=1)
-> Gather (cost=41330.62..41331.23 rows=6 width=8) (actual time=765.557..765.670 rows=7 loops=1)
Workers Planned: 6
Workers Launched: 6
-> Partial Aggregate (cost=40330.62..40330.63 rows=1 width=8) (actual time=760.175..760.175 rows=1 loops=7)
-> Parallel Seq Scan on mv (cost=0.00..39261.09 rows=427809 width=0) (actual time=0.014..397.952 rows=366840 loops=7)
Planning time: 0.326 ms
Execution time: 769.934 ms
(8 rows)
Good. So it's taking advantage of multiple workers. But why is the query so much slower when using EXPLAIN ANALYZE
?
Now I use the count_mv()
function, which has the same underlying SQL and is declared STABLE
.
db=> select count_mv();
count_mv
------------
2567883
(1 row)
Time: 406.058 ms
Whoa! Why is this slower than the same SQL on the materialized view? And a lot slower! Is it not taking advantage of parallel workers, and if not, why not?
BEGIN EDIT
As proposed in an answer below, I loaded the auto_explain
module and checked the log output for EXPLAIN
on the function call.
Query Text:
SELECT COUNT(*) FROM mv;
Finalize Aggregate (cost=41331.60..41331.61 rows=1 width=8) (actual time=1345.446..1345.446 rows=1 loops=1)
-> Gather (cost=41330.97..41331.58 rows=6 width=8) (actual time=1345.438..1345.440 rows=1 loops=1)
Workers Planned: 6
Workers Launched: 0
-> Partial Aggregate (cost=40330.97..40330.99 rows=1 width=8) (actual time=1345.435..1345.435 rows=1 loops=1)
-> Parallel Seq Scan on mv (cost=0.00..39261.38 rows=427838 width=0) (actual time=0.020..791.022 rows=2567883 loops=1)
The new question is why 6 workers are planned but none are launched. The server is otherwise idle, the configuration is the same, and the query is the same.
END EDIT
All right. So what if I do this:
db=> SELECT COUNT(*) FROM mv_pks();
count
---------
2567883
(1 row)
Time: 72.687 ms
The same performance as counting rows on the materialized view directly without using EXPLAIN ANALYZE
, but you'll have to trust me here: the performance of this function depends on the state of the materialized view when the function is created. The fast timing here is the result of creating the function when the table is empty. If I recreate the function when the table is full, the function requires over 1000 ms to run!
To summarize my questions:
STABLE
SQL function taking no parameters so much slower than a query outside that function.EXPLAIN ANALYZE
?Thanks in advance!
For 1), you can find out yourself using auto_explain
, which can show plans for queries inside functions. Does it use a parallel plan?
For 2) that is the overhead of measuring, which depends on the platform, but can be high.
For 3) compare the SQL plans in both cases. Queries in SQL functions are not cached, so I don't have an explanation why it should behave like this. Did you repeat the test multiple time to rule out that you see the effect of reading from disk versus reading from cache?