Search code examples
sqlsnowflake-cloud-data-platform

Combine three results with different columns


I have Output A:

+------+-------+--------+
| year | group | amount |
+------+-------+--------+
| 2018 | a     | 5      |
+------+-------+--------+
| 2019 | a     | 6      |
+------+-------+--------+
| 2021 | a     | 1239   |
+------+-------+--------+

I have Output B:

+------+-------+--------+
| year | group | amount |
+------+-------+--------+
| 2021 | b     | 12     |
+------+-------+--------+
| 2022 | b     | 34     |
+------+-------+--------+
| 2023 | b     | 5691   |
+------+-------+--------+

I have Output C:

+------+-------+--------+
| year | group | amount |
+------+-------+--------+
| 2018 | total | 12     |
+------+-------+--------+
| 2019 | total | 34     |
+------+-------+--------+
| 2020 | total | 5691   |
+------+-------+--------+
| 2021 | total | 12341  |
+------+-------+--------+
| 2022 | total | 96812  |
+------+-------+--------+
| 2023 | total | 100123 |
+------+-------+--------+ 

I want to combine Output A and Output B and Output C so it looks like this:

+------+--------+----------+----------+
| year | total_ | a_amount | b_amount |
|      | amount |          |          |
+------+--------+----------+----------+
| 2018 | 12     | 5        | 0        |
+------+--------+----------+----------+
| 2019 | 34     | 6        | 0        |
+------+--------+----------+----------+
| 2020 | 5691   | 0        | 0        |
+------+--------+----------+----------+
| 2021 | 12341  | 1239     | 12       |
+------+--------+----------+----------+
| 2022 | 96812  | 0        | 34       |
+------+--------+----------+----------+
| 2023 | 100123 | 0        | 5691     |
+------+--------+----------+----------+

My goal is to find the % amount on A vs B (this I would just find on my own but that's my "greater goal"). I just need help on how I can transform A B and C to the Desired Output above. I've tried doing a UNION but that doesn't seem to give me what I need.

Is there a way to do this in Snowflake?


Solution

  • Making some assumptions here, but unless you are looking for a way to dynamically pivot based on your GROUP columns, then this should be a simple LEFT JOIN or FULL OUTER JOIN situation. I am assuming in the following code that you have all of years in your C output, but if that isn't the case, then switching to a FULL OUTER JOIN with coalesce on your year would work, as well:

    First, I'm using CTE to select your sample data:

    with table_a as (
    select *
    from (values(2018, 'a', 5),
                (2019, 'a', 6),
                (2021, 'a', 1239))x(year,group_val,amount)
    ), table_b as (
    select *
    from (values(2021, 'b', 12),
                (2022, 'b', 34),
                (2023, 'b', 5691))x(year,group_val,amount)
    ), table_c as (
    select *
    from (values(2018, 'total', 12),
                (2019, 'total', 34),
                (2020, 'total', 5691),
                (2021, 'total', 12341),
                (2022, 'total', 95812),
                (2023, 'total', 100123))x(year,group_val,amount)
    )
    

    Then I just LEFT JOIN and use IFNULL to get ZEROS where you'd have NULL values:

    select c.year
         , c.amount as total_amount
         , IFNULL(a.amount,0) as a_amount
         , IFNULL(b.amount,0) as b_amount
    from table_c c
    left join table_a a
      on c.year = a.year
    left join table_b b
      on c.year = b.year;