Search code examples
joinsnowflake-cloud-data-platformsql-view

Snowflake joins performance improvement


I need to create a view on top of a table which contains 1300+ columns. New data will be loaded to table every quarter(Rows in millions). While creating view I need to join other table with the base table. and i also needed to add a recent row indicator in view.

CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,
base_tbl.col1334, 1 as Is_Latest_Quarter 
FROM base_tbl full outer JOIN lkp_tbl
on base_tbl.CUST_ID = lkp_tbl.CUST_ID 
where snapshot_dt=(select max(snapshot_dt) from base_tbl)


union all


SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,
base_tbl.col1334,0 as Is_Latest_Quarter 
FROM base_tbl full outer JOIN lkp_tbl 
on base_tbl.CUST_ID = lkp_tbl.CUST_ID 
where snapshot_dt!=(select max(snapshot_dt) from base_tbl);

After creating this view the performance of the query is too slow even if we are querying 100 rows. Is there a way in which we can create view in more efficient way. If not how can i increase performance?


Solution

  • just use one SELECT statement and use a CASE statement to calculate Is_Latest_Quarter

    UPDATED WITH (ALMOST) ACTUAL SQL

    CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
    SELECT {list of columns you want to include}
    ,CASE WHEN snapshot_dt=(select max(snapshot_dt) from base_tbl) THEN 1 
     ELSE 0 END as Is_Latest_Quarter
    FROM base_tbl 
    full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID 
    

    Alternatively, if Snowflake doesn't like that in-line subquery, your could use a CTE something like:

    CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
        WITH MAX_DATE AS (SELECT MAX(Ssnapshot_dt) AS max_snapshot_dt FROM base_tbl),
        SELECT {list of columns you want to include}
        ,CASE WHEN max_date.max_snapshot_dt is not null  THEN 1 
         ELSE 0 END as Is_Latest_Quarter
        FROM base_tbl 
        full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID
        LEFT OUTER JOIN MAX_DATE ON base_tbl.snapshot_dt = max_date.max_snapshot_dt