Tags: postgresql, count, sql-function, explain, postgresql-9.6

Understanding COUNT behavior in queries vs. EXPLAIN vs. functions


I am curious to understand (and perhaps resolve) a performance issue I am having with PostgreSQL 9.6. Names are simplified, but everything else is taken directly from a psql session.

I start with a materialized view, mv.

First, I create two simple functions:

CREATE FUNCTION count_mv() RETURNS BIGINT AS $$
SELECT COUNT(*) FROM mv;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;

and

CREATE FUNCTION mv_pks() RETURNS TABLE (table_pk INTEGER) AS $$
SELECT table_pk FROM mv;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;

Let's time some queries.

db=> \timing on

I can count results from the materialized view very quickly.

db=> SELECT COUNT(*) FROM mv;
  count
---------
 2567883
(1 row)

Time: 79.803 ms

Let's see how it's doing this.

db=> EXPLAIN ANALYZE SELECT COUNT(*) FROM mv;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=41331.24..41331.25 rows=1 width=8) (actual time=765.681..765.681 rows=1 loops=1)
   ->  Gather  (cost=41330.62..41331.23 rows=6 width=8) (actual time=765.557..765.670 rows=7 loops=1)
         Workers Planned: 6
         Workers Launched: 6
         ->  Partial Aggregate  (cost=40330.62..40330.63 rows=1 width=8) (actual time=760.175..760.175 rows=1 loops=7)
               ->  Parallel Seq Scan on mv  (cost=0.00..39261.09 rows=427809 width=0) (actual time=0.014..397.952 rows=366840 loops=7)
 Planning time: 0.326 ms
 Execution time: 769.934 ms
(8 rows)

Good. So it's taking advantage of multiple workers. But why is the query so much slower when using EXPLAIN ANALYZE?
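
One way to see how much of that gap is just measurement overhead (a quick check, not part of the session above) is to run the same plan with per-node timing disabled:

-- Keeps the actual row counts but skips the per-node clock calls. If this
-- runs close to the plain 80 ms query, the extra time under EXPLAIN ANALYZE
-- is instrumentation overhead rather than a different plan.
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM mv;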

Now I use the count_mv() function, which has the same underlying SQL and is declared STABLE.

db=> select count_mv();
  count_mv
------------
    2567883
(1 row)

Time: 406.058 ms

Whoa! Why is this slower than the same SQL on the materialized view? And a lot slower! Is it not taking advantage of parallel workers, and if not, why not?
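
For reference, the planner settings that govern parallel query in 9.6 can be checked from the same session. A quick sketch (the GUC names are standard; the values on this particular server are what matter):

-- Any of these can prevent or discourage a parallel plan.
SHOW max_parallel_workers_per_gather;  -- 0 disables parallel query entirely
SHOW parallel_setup_cost;              -- estimated cost of launching workers
SHOW parallel_tuple_cost;              -- estimated cost per tuple passed back from a worker
SHOW min_parallel_relation_size;       -- relations smaller than this are not scanned in parallel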

BEGIN EDIT

As proposed in an answer below, I loaded the auto_explain module and checked the log output for EXPLAIN on the function call.
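
A minimal session-level setup for this (a sketch; it requires superuser, and the exact values I used may have differed slightly) is:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log a plan for every completed statement
SET auto_explain.log_analyze = on;            -- include actual times and row counts
SET auto_explain.log_nested_statements = on;  -- also log statements run inside functions

Here is the relevant log entry for the query inside count_mv():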

    Query Text:
    SELECT COUNT(*) FROM mv;

     Finalize Aggregate  (cost=41331.60..41331.61 rows=1 width=8) (actual time=1345.446..1345.446 rows=1 loops=1)
       ->  Gather  (cost=41330.97..41331.58 rows=6 width=8) (actual time=1345.438..1345.440 rows=1 loops=1)
             Workers Planned: 6
             Workers Launched: 0
             ->  Partial Aggregate  (cost=40330.97..40330.99 rows=1 width=8) (actual time=1345.435..1345.435 rows=1 loops=1)
                   ->  Parallel Seq Scan on mv  (cost=0.00..39261.38 rows=427838 width=0) (actual time=0.020..791.022 rows=2567883 loops=1)

The new question is why 6 workers are planned but none are launched. The server is otherwise idle, the configuration is the same, and the query is the same.

END EDIT

All right. So what if I do this:

db=> SELECT COUNT(*) FROM mv_pks();
  count
---------
 2567883
(1 row)

Time: 72.687 ms

The same performance as counting rows on the materialized view directly (without EXPLAIN ANALYZE). But you'll have to trust me on this part: the performance of this function depends on the state of the materialized view at the time the function is created. The fast timing above comes from having created the function while the view was empty. If I recreate the function while the view is fully populated, the same count takes over 1000 ms to run!
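
For concreteness, the recreation step I am describing is roughly this (a sketch, not a verbatim transcript):

DROP FUNCTION mv_pks();
CREATE FUNCTION mv_pks() RETURNS TABLE (table_pk INTEGER) AS $$
SELECT table_pk FROM mv;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;

-- Created while mv is fully populated, the same count now takes over
-- 1000 ms instead of ~70 ms.
SELECT COUNT(*) FROM mv_pks();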

To summarize my questions:

  1. Why is the SQL query inside a STABLE SQL function that takes no parameters so much slower than the same query run outside the function?
  2. Why is the query so much slower when run under EXPLAIN ANALYZE?
  3. Why does counting rows through a function sometimes match the speed of counting directly on the materialized view and sometimes come out slower than every other method, depending on when the function was created?

Thanks in advance!


Solution

  • For 1), you can find out yourself with auto_explain, which can log the plans of queries run inside functions. Does the function's query get a parallel plan?

  • For 2), that is the overhead of the measurement itself; how large it is depends on the platform, but it can be considerable.

  • For 3), compare the plans in both cases. Plans for queries in SQL functions are not cached, so I don't have an explanation for why it behaves like this. Did you repeat the test multiple times to rule out the effect of reading from disk versus reading from cache? (See the sketch after this list.)
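
As a rough way to separate disk reads from cache hits while repeating the test, you can run the counts with buffer statistics and compare "shared hit" against "read" across consecutive runs. A sketch:

-- "shared hit" pages came from shared_buffers; "read" pages came from the
-- kernel or disk. If "read" drops to near zero on repeated runs and the
-- timings converge, the earlier slow run was a cold cache, not a new plan.
EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM mv;
EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM mv_pks();

Note that the second statement will not show the plan of the query inside the function; for that, auto_explain from point 1) is still the tool.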