Search code examples
postgresqlplpgsql

Can I count on a simple case/switch plpgsql function to use a sane query plan?


I am trying to implement a postgres function with optional parameters. The way that I have gone about this is to implement stable sql functions with required parameters, and then I delegate to those depending on a simple IF/ELSEIF case chain.

Here is a simplified example:

Setup:

CREATE TABLE IF NOT EXISTS test_tab1(
    field1 int PRIMARY KEY,
    field2 int
);
CREATE TABLE IF NOT EXISTS test_tab2(
    field1 int PRIMARY KEY,
    field2 int
);
INSERT INTO test_tab1 SELECT FLOOR(RANDOM()*10000), FLOOR(RANDOM()*10000) FROM GENERATE_SERIES(1,10);
INSERT INTO test_tab2 SELECT * FROM test_tab1;
    
CREATE OR REPLACE FUNCTION test_tab1_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab1;
$$;

CREATE OR REPLACE FUNCTION test_tab2_func()
    RETURNS SETOF test_tab1
    LANGUAGE sql STABLE 
    AS $$
        SELECT * FROM test_tab2;
$$;

The function that is the subject of my question:

CREATE OR REPLACE FUNCTION test_func(foo int DEFAULT null)
    RETURNS SETOF test_tab1
    LANGUAGE plpgsql STABLE 
    AS $$
        BEGIN
            IF (foo IS null) THEN
                RETURN QUERY
                SELECT * FROM test_tab1_func();
            ELSE
                RETURN QUERY
                SELECT * FROM test_tab2_func();
            END IF;
        END;
$$;

EXPLAIN SELECT * FROM test_func();

This EXPLAIN returns:

Function Scan on test_func  (cost=0.25..10.25 rows=1000 width=8)

My experiments with more complex queries show that this Function Scan estimate is not very "smart", and will return basically this same result no matter the complexity of the function body. Since the EXPLAIN output is so uninformative, I have come here for more info on the execution plan.

My question is this: Since the DB presumably can plan for test_tab1_func, and it can plan for test_tab2_func, can I count on the DB to use the plan that it would ordinarily devise for test_tab1_func for test_func when foo is null, and the plan that it would ordinarily devise for test_tab2_func for test_func otherwise? If not, is there a better way to implement this sort of "case/switch" function logic that postgres can more readily plan around?


Solution

  • Can I count on a simple case/switch plpgsql function to use a sane query plan?

    Yes, you can always count on "sane" query plans - in plain SQL or nested in any function. It's always the same query planner, just called from a different context. An "insane" query plan would be cause for a bug report to the PostgreSQL Bug List <pgsql-bugs@lists.postgresql.org> - after making sure the insanity wasn't all yours.

    No, you cannot count on a sane query plan, since every nested SQL statement in a function gets its own plan - unless the function is inlined as a whole, which is only possible for SQL functions (LANGUAGE sql) that fulfill a couple of conditions. So not applicable to PL/pgSQL functions.

    Your call results in one plan for the calling SELECT * FROM test_func(); and another, separate (nested) plan for SELECT * FROM test_tab1_func();.
    SELECT * FROM test_tab2_func(); in the other leg of the IF construct is not planned at all until control actually reaches it.

    To inspect every detail of what's happening behind the curtains, use auto_explain. It's simple, but you must be superuser. See:

    Related:

    Your example

    Splitting into distinct queries along user input can make sense if that results in a more efficient query plan for at least one branch. But that's not the case for your simple example, where the added overhead far outweighs the possible gain.

    Your function test_func() and call can be replaced with plain SQL:

    SELECT * FROM test_tab1_func() WHERE $foo IS NULL
    UNION ALL
    SELECT * FROM test_tab2_func() WHERE $foo IS NOT NULL;