Search code examples
sqlitejoingroup-bycommon-table-expressionfull-outer-join

FULL OUTER JOIN emulation and aggregate function (e.g. SUM)


I have two SQLite tables:

CREATE TABLE T_A (year_A INT, amount_A DOUBLE);

  year_A   | amount_A
----------------------
    2020   |    100.0
    2020   |    200.0
    2021   |    300.0
    2021   |    400.0

CREATE TABLE T_B (year_B INT, amount_B DOUBLE);

  year_B   | amount_B
----------------------
    2021   |   1000.0
    2021   |   2000.0
    2022   |   3000.0
    2022   |   4000.0

I would like a VIEW with the SUM of amount_A and amount_B on every year present either in T_A or T_B:

View Sums

  year   | SUM(amount_A) | SUM(amount_B) 
------------------------------------------
  2020   |      300.0    |        0.0    
  2021   |      700.0    |     3000.0    
  2022   |        0.0    |     7000.0    

If I use an INNER JOIN in my query, all I get is a result for year 2021. A FULL OUTER JOIN is what I need. As it does not exist in SQLite, I tried to apply this workaround. But I cannot get it work properly as a SUM is involved too.

SQLite supports FULL OUTER JOIN since version 3.39.0. I use SQLite through Python 3.10. It's a pain to upgrade to a newer version.


Solution

  • SQLite supports FULL OUTER JOIN since version 3.39.0.

    First you must aggregate inside each of the tables and then do a FULL join on the aggregated results:

    WITH 
      cte_A AS (SELECT year_A, SUM(amount_A) AS sum_A FROM T_A GROUP BY year_A),
      cte_B AS (SELECT year_B, SUM(amount_B) AS sum_B FROM T_B GROUP BY year_B)
    SELECT COALESCE(a.year_A, b.year_B) year,
           COALESCE(a.sum_A, 0) AS sum_A,
           COALESCE(b.sum_B, 0) AS sum_B
    FROM cte_A AS a FULL OUTER JOIN cte_B AS b
    ON b.year_B = a.year_A
    ORDER BY year;
    

    For previous versions of SQLite use UNION ALL and then aggregate:

    WITH cte AS (
      SELECT year_A AS year, amount_A, 0 AS amount_B FROM T_A
      UNION ALL
      SELECT year_B, 0, amount_B FROM T_B
    )
    SELECT year,
           SUM(amount_A) AS sum_A,
           SUM(amount_B) AS sum_B
    FROM cte
    GROUP BY year
    ORDER BY year;
    

    See the demo.