I'm looking to see what is the most efficient way of writing SQL in snowflake. Below is the problem statement.
Ideally if all mapping values available in mapping table B, just a left join between TABLE A AND TABLE B would be sufficient like below
with cte1 as ( Select X, T from TABLE A
LEFT JOIN TABLE B on A.X = B.X1);
But in some cases Mapping table B doesn't have new X column mapping yet so i need to apply API call over distinct column X of TABLE A which doesn't have mapping values in mapping table B, so minimal API calls can be made.
missing_values_table as (Select function1(distinct X) from cte1 where T is NULL);
After above step merge the result with MAPPING TABLE B to update with new records and finally do LEFT JOIN between TABLE A and Table B again to get the output table.
TABLE A
Column X |
---|
ab |
zx |
ab |
cb |
cc |
cc |
Table B
Column X1 | Column T |
---|---|
ab | tom |
zx | banks |
output tables:
Column X | Column T |
---|---|
ab | tom |
zx | banks |
ab | tom |
cb | silver |
cc | gold |
cc | gold |
Table B
Column X1 | Column T |
---|---|
ab | tom |
zx | banks |
cb | silver |
cc | gold |
with enriched_data as (
SELECT X,
MIN(B.T) AS T, -- Use B.T for existing mapping
CASE WHEN MIN(B.T) IS NULL THEN function1(MIN(X)) END AS new_T
FROM TABLE A
LEFT JOIN TABLE B ON A.X = B.X1 -- Join to check for existing mapping
GROUP BY A.X
)
-- Update TABLE B with newly generated values
MERGE INTO TABLE B AS target
USING (
SELECT X, new_T FROM enriched_data WHERE new_T IS NOT NULL
) AS source
ON target.X1 = source.X
WHEN MATCHED THEN UPDATE SET target.T = source.T; -- Update existing T
-- Final join with updated TABLE B
SELECT A.X, COALESCE(B.T, new_T) AS T1
FROM TABLE A
LEFT JOIN enriched_data B ON A.X = B.X;
I tried above approach, not sure if this is efficient. Please let me know if there is better way so that i can avoid two LEFT JOINS, seems repetitive
Why you're not use SQL Set Difference Operation, to get the difference between Table A.x
and Table B.x1
Query in MySQL:
select distinct A.x from A
where A.x NOT IN (Select distinct B.x1 from B)
Result: db<>fiddle