sql function postgresql set-returning-functions lateral

Create table from multiple calls to a function returning record

I have a function that does some basic statistics on a range of data based on a start and stop timestamp:

CREATE OR REPLACE FUNCTION cal(TIMESTAMP, TIMESTAMP, OUT Date_Time timestamp with time zone, OUT avg numeric, OUT stddev numeric, OUT Rstedv_per numeric) 
AS $$
SELECT
    max(datetime) as Date_Time,
    avg(SO2) AS Mean,
    stddev_samp(so2) as STD_DEV,
    stddev_samp(so2)/avg(SO2)*100 as Rstedv_Per
FROM Table43
WHERE datetime > $1 AND datetime < $2;
$$
 LANGUAGE SQL;

This works great with a simple single select such as this:

select * FROM
 cal('2014-08-02 05:29:00', '2014-08-02 05:32:00')

But now I am having trouble creating another function, or even a select statement that can combine multiple calls of the 'cal' function. For example I want to return a table that contains three time periods. So the return would be 4 columns by 3 rows:

'2014-08-02 05:29:00', '2014-08-02 05:32:00'
'2014-08-02 05:35:00', '2014-08-02 05:39:00'
'2014-08-02 05:45:00', '2014-08-02 05:49:00'

Solution

Use a VALUES expression to provide multiple rows of input dates. Then ...

For all versions of Postgres

SELECT cal(a, b)
FROM  (
   VALUES 
      ('2014-08-02 05:29'::timestamp, '2014-08-02 05:32'::timestamp)
    , ('2014-08-02 05:35', '2014-08-02 05:39')
    , ('2014-08-02 05:45', '2014-08-02 05:39')
   ) v(a, b);

You can replace the VALUES expression with an actual table.

This returns whole rows as a single column (instead of individual columns). You could decompose in place with (cal(a, b)).*. While this works, it is inefficient. Due to a weakness in the Postgres parser, this would result in multiple evaluation of the function. Detailed explanation:

How to avoid multiple function evals with the (func()).* syntax in an SQL query?

Instead, use a subquery for better performance:

SELECT (rec).*
FROM  (
    SELECT cal(a, b)
    FROM  (
       VALUES 
          ('2014-08-02 05:29'::timestamp, '2014-08-02 05:32'::timestamp)
        , ('2014-08-02 05:35', '2014-08-02 05:39')
        , ('2014-08-02 05:45', '2014-08-02 05:39')
       ) v(a, b)
   ) sub;

SQL Fiddle (for pg 8.3 to demonstrate it works in old versions).

Since set-returning functions in the SELECT list are frowned upon by some, and are non-standard SQL.

Postgres 9.3+

That's the main reason why Postgres 9.3 introduced (SQL-standard compliant) LATERAL JOIN:

SELECT f.*
FROM  (
   VALUES 
      ('2014-08-02 05:29'::timestamp, '2014-08-02 05:32'::timestamp)
    , ('2014-08-02 05:35', '2014-08-02 05:39')
    , ('2014-08-02 05:45', '2014-08-02 05:39')
   ) v(a,b)
  , cal(v.a, v.b) f;

The LATERAL JOIN is implicit here, since the second item in the FROM list references the previous table explicitly. Details:

Find most common elements in array with a group by

SQL Fiddle for current Postgres 9.3.