Search code examples
sqlpostgresqlquery-optimizationplpgsql

Avoid multiple calls on same function when expanding composite result


I have an SQL function retuning a composite result.

CREATE TYPE result_t AS (a int, b int, c int, d numeric);

CREATE OR REPLACE FUNCTION slow_function(int)
RETURNS result_t
AS $$
    -- just some placeholder code to make it slow
    SELECT 0, 0, 0, (
        SELECT sum(ln(i::numeric))
        FROM generate_series(1, $1) i
    )
$$ LANGUAGE sql IMMUTABLE;

When calling the function, I would like to have the parts of the composite type expanded into several columns. That works fine when I call:

SELECT (slow_function(i)).*
FROM generate_series(0, 200) i

a     b     c     d                    
----  ----  ----  -------------------- 
0     0     0     (null)               
0     0     0     0                    
0     0     0     0.6931471805599453   
0     0     0     1.791759469228055
...
Total runtime: 6196.754 ms

Unfortunately, this causes the function to be called once per result column, which is unnecessarily slow. This can be tested by comparing the run time with a query, which directly returns the composite result and runs four times as fast:

SELECT slow_function(i)
FROM generate_series(0, 200) i
...
Total runtime: 1561.476 ms

The example code is also at http://sqlfiddle.com/#!15/703ba/7

How can I get the result with multiple columns without wasting CPU power?


Solution

  • A CTE is not even necessary. A plain subquery does the job.

    SELECT i, (f).*                     -- decompose here
    FROM  (
       SELECT i, (slow_func(i)) AS f    -- do not decompose here
       FROM   generate_series(1, 3) i
       ) sub;
    

    Be sure not to decompose the record (composite result) of the function in the subquery. Do that in the outer query.
    Requires a registered row type. Not possible with anonymous records.

    Or, what @Richard wrote, a LATERAL JOIN works, too. The syntax can be simpler:

    SELECT * FROM generate_series(1, 3) i, slow_func(i) f
    

    LATERAL is applied implicitly in Postgres 9.3 or later.

    A function can stand on its own in the FROM clause, doesn't have to be wrapped in an additional sub-select. Just like a table in its place.

    fiddle with EXPLAIN VERBOSE output for all variants. You can see multiple evaluation of the function if it happens.
    Old sqlfiddle

    COST setting

    Generally (should not matter for this particular query), make sure to apply a high cost setting to your function, so the planner knows to avoid evaluating more often then necessary. Like:

    CREATE OR REPLACE FUNCTION slow_function(int)
      RETURNS result_t
      LANGUAGE sql IMMUTABLE COST 100000 AS
    $func$
        -- expensive body
    $func$;

    The manual:

    Larger values cause the planner to try to avoid evaluating the function more often than necessary.