Search code examples
sqlpostgresqljoinplpgsqlset-returning-functions

Record returned from function has columns concatenated


I have a table which stores account changes over time. I need to join that up with two other tables to create some records for a particular day, if those records don't already exist.

To make things easier (I hope), I've encapsulated the query that returns the correct historical data into a function that takes in an account id, and the day.

If I execute "Select * account_servicetier_for_day(20424, '2014-08-12')", I get the expected result (all the data returned from the function in separate columns). If I use the function within another query, I get all the columns joined into one:

("2014-08-12 14:20:37",hollenbeck,691,12129,20424,69.95,"2Mb/1Mb 20GB Limit",2048,1024,20.000)

I'm using "PostgreSQL 9.2.4 on x86_64-slackware-linux-gnu, compiled by gcc (GCC) 4.7.1, 64-bit".

Query:

Select
    '2014-08-12' As day, 0 As inbytes, 0 As outbytes, acct.username, acct.accountid, acct.userid,
    account_servicetier_for_day(acct.accountid, '2014-08-12')
From account_tab acct
Where acct.isdsl = 1
    And acct.dslservicetypeid Is Not Null
    And acct.accountid Not In (Select accountid From dailyaccounting_tab Where Day = '2014-08-12')
Order By acct.username

Function:

CREATE OR REPLACE FUNCTION account_servicetier_for_day(_accountid integer, _day timestamp without time zone) RETURNS setof account_dsl_history_info AS
$BODY$
DECLARE _accountingrow record;
BEGIN
  Return Query
  Select * From account_dsl_history_info
  Where accountid = _accountid And timestamp <= _day + interval '1 day - 1 millisecond'
  Order By timestamp Desc 
  Limit 1;
END;
$BODY$ LANGUAGE plpgsql;

Solution

  • Generally, to decompose rows returned from a function and get individual columns:

    SELECT * FROM account_servicetier_for_day(20424, '2014-08-12');


    As for the query:

    Postgres 9.3 or newer

    Cleaner with JOIN LATERAL:

    SELECT '2014-08-12' AS day, 0 AS inbytes, 0 AS outbytes
         , a.username, a.accountid, a.userid
         , f.*   -- but avoid duplicate column names!
    FROM   account_tab a
         , account_servicetier_for_day(a.accountid, '2014-08-12') f  -- <-- HERE
    WHERE  a.isdsl = 1
    AND    a.dslservicetypeid IS NOT NULL
    AND    NOT EXISTS (
       SELECT FROM dailyaccounting_tab
       WHERE  day = '2014-08-12'
       AND    accountid = a.accountid
       )
    ORDER  BY a.username;
    

    The LATERAL keyword is implicit here, functions can always refer earlier FROM items. The manual:

    LATERAL can also precede a function-call FROM item, but in this case it is a noise word, because the function expression can refer to earlier FROM items in any case.

    Related:

    Short notation with a comma in the FROM list is (mostly) equivalent to a CROSS JOIN LATERAL (same as [INNER] JOIN LATERAL ... ON TRUE) and thus removes rows from the result where the function call returns no row. To retain such rows, use LEFT JOIN LATERAL ... ON TRUE:

    ...
    FROM  account_tab a
    LEFT  JOIN LATERAL account_servicetier_for_day(a.accountid, '2014-08-12') f ON TRUE
    ...
    

    Also, don't use NOT IN (subquery) when you can avoid it. It's the slowest and most tricky of several ways to do that:

    I suggest NOT EXISTS instead.

    Postgres 9.2 or older

    You can call a set-returning function in the SELECT list (which is a Postgres extension of standard SQL). For performance reasons, this is best done in a subquery. Decompose the (well-known!) row type in the outer query to avoid repeated evaluation of the function:

    SELECT '2014-08-12' AS day, 0 AS inbytes, 0 AS outbytes
         , a.username, a.accountid, a.userid
         , (a.rec).*   -- but be wary of duplicate column names!
    FROM  (
       SELECT *, account_servicetier_for_day(a.accountid, '2014-08-12') AS rec
       FROM   account_tab a
       WHERE  a.isdsl = 1
       AND    a.dslservicetypeid Is Not Null
       AND    NOT EXISTS (
           SELECT FROM dailyaccounting_tab
           WHERE  day = '2014-08-12'
           AND    accountid = a.accountid
          )
       ) a
    ORDER  BY a.username;
    

    Related answer by Craig Ringer with an explanation, why it's better not to decompose on the same query level:

    Postgres 10 removed some oddities in the behavior of set-returning functions in the SELECT: