Search code examples
postgresqlinner-joincommon-table-expression

Common Table Expression [CTE] and INNER JOIN


My question is very simple. Here is my code:

 WITH P1 as  
(
SELECT (count(*)) as Premier_trimestre
FROM import.dvf1,
    (SELECT commune FROM import.dvf1) as commune
WHERE   date_mutation BETWEEN '2020-01-01' AND '2020-03-31' 
 GROUP BY dvf1.commune
)
, P2 as  
(
SELECT (count(*)) as Deuxieme_trimestre
FROM import.dvf1, (select commune FROM import.dvf1) as commune
WHERE   date_mutation BETWEEN '2020-04-01' AND '2020-06-30' 
 GROUP BY dvf1.commune
)


    SELECT * 
FROM
     P1 t1
     INNER JOIN 
     P2 t2
     ON t1.commune = t2.commune

When I run it, I get:

ERROR:  column t1.commune does not exist
LINE 23:      ON t1.commune = t2.commune
                ^
SQL state: 42703
Character: 471

I don't understand why.I tried to rename commune, to move it, etc., but I cannot find a fix.


Solution

  • The CTEs only yield one column named Premier_trimestre or Deuxieme_trimestre.

    If you want to include the commune column from import.dvf1 then you need to include it in the SELECT list of the CTEs.

    You are also creating a cross join of the table with itself with this part FROM import.dvf1, (SELECT commune FROM import.dvf1) which is most probably the reason why the query takes forever.

    I think what you intended is:

    WITH P1 as  
    (
      SELECT dvf1.commune, -- this "exposes" the commune column
             count(*) as Premier_trimestre
      FROM import.dvf1
      WHERE dvf1.date_mutation BETWEEN '2020-01-01' AND '2020-03-31' 
      GROUP BY dvf1.commune
    )
    , P2 as 
    (
      SELECT dvf1.commune,  
             count(*) as Deuxieme_trimestre
      FROM import.dvf1
      WHERE dvf1.date_mutation BETWEEN '2020-04-01' AND '2020-06-30' 
      GROUP BY dvf1.commune
    )
    SELECT * 
    FROM P1     
      JOIN P2 ON p1.commune = p2.commune
    

    However, you can simplify this substantially by using filtered aggregation:

    SELECT commune, 
           count(*) filter (where extract(month from dvf1.date_mutation) in (1,2,3)) as Premier_trimestre, 
           count(*) filter (where extract(month from dvf1.date_mutation) in (4,5,6)) as Deuxieme_trimestre
    FROM import.dvf1 
    WHERE dvf1.date_mutation BETWEEN '2020-01-01' AND '2020-06-30'  -- this covers both ranges
    GROUP dvf1.commune