Search code examples
sqlpostgresqlcommon-table-expressiongreenplum

Materialize Common Table Expression in Greenplum


Is there a way to force Greenplum PostgreSQL to materialize a subquery in a WITH clause like what MATERIALIZE and INLINE optimizer hints do as below in Oracle?

WITH dept_count AS (
  SELECT /*+ MATERIALIZE */ deptno, COUNT(*) AS dept_count
  FROM   emp
  GROUP BY deptno)
SELECT ...

I've been searching this for a while, only to find this functionality in Oracle.

I know I can use CREATE TABLE AS, but I have several similar queries, forcing me to drop the temporary table after each query, which is very inconvenient and maybe inefficient.

Update: I tested the following table:

CREATE TABLE test (id: INT);

EXPLAIN WITH test2 AS (SELECT id FROM test)
SELECT COUNT(*) FROM test2;

                                 QUERY PLAN                                     
------------------------------------------------------------------------------------
Aggregate  (cost=0.36..0.37 rows=1 width=8)
   ->  Gather Motion 32:1  (slice1; segments: 32)  (cost=0.01..0.35 rows=1 width=8)
         ->  Aggregate  (cost=0.01..0.01 rows=1 width=8)
               ->  Subquery Scan test2  (cost=0.00..0.00 rows=1 width=0)
                     ->  Seq Scan on test  (cost=0.00..0.00 rows=1 width=4)

I'm using Greenplum Postgresql 8.2


Solution

  • Temporary table

    If you are looking for a temporary table that persists for the duration of your session, use an actual TEMPORARY TABLE.

    CREATE TEMPORARY TABLE t AS
    SELECT deptno, COUNT(*) AS dept_count
    FROM   emp
    GROUP  BY deptno;
    
    SELECT ...
    FROM t ...
    

    CREATE [TEMPORARY] TABLE AS in the manual.

    However, there is no "global" temporary table in PostgreSQL. A temporary tables is only visible to the user who created it and only for the duration of the session it was created in.

    CTEs are only visible in the query they are part of. Never beyond that.

    Temp table exclusive to a single query

    To restrict the visibility of your temp tables to a single query, put them into a transaction and add ON COMMIT DROP, which drops the temp table automatically at the end of the transaction:

    BEGIN;
    CREATE TEMP TABLE t ON COMMIT DROP AS
    SELECT ...
    

    The only use case I can think of, where this would make sense: if you want to create indexes on a huge temporary table:

    CREATE INDEX ON t(col1);
    
    SELECT ..
    FROM t ...;
    
    ROLLBACK;
    

    Or (doesn't make a difference here):

    COMMIT;
    

    If you use ROLLBACK, you can also just use a temp table without ON COMMIT DROP since everything is rolled back anyways.