I'm trying to join the 2 tables below on multiple columns (on big query through dbt cloud).
t1 as
id | 2021_code | 2022_code | 2023_code |
---|---|---|---|
P | 1 | ||
Q | 2 | 3 | |
R | 4 | 5 | 6 |
t2 as
code | label |
---|---|
1 | A |
2 | B |
3 | C |
4 | D |
5 | E |
6 | F |
For each non-null value in the last three columns in t1, I want to have a distinct row with a unique label in the joined table.
I'm expecting the below joined table.
id | code | label |
---|---|---|
P | 1 | A |
Q | 2 | B |
Q | 3 | C |
R | 4 | D |
R | 5 | E |
R | 6 | F |
I've tried the following code but have understandably failed.
select id, code, label from t1 left join t2 on 2021_code = code or 2022_code = code or 2023_code = code
My join condition or approach clearly needs revision. Will be grateful for any inputs.
You can recreate t1 to make 1 code column and then join
SELECT
t1.id, t1.code,t2.label
FROM (
SELECT id, 2021_code AS code FROM t1 WHERE 2021_code IS NOT NULL
UNION ALL
SELECT id, 2022_code AS code FROM t1 WHERE 2022_code IS NOT NULL
UNION ALL
SELECT id, 2023_code AS code FROM t1 WHERE 2023_code IS NOT NULL
) t1
JOIN t2 ON t1.code = t2.code