Infamously primary key constraints are not enforced in snowflake sql:
-- Generating a table with 4 rows that contain duplicates and NULLs:
CREATE OR REPLACE TEMP TABLE PRIMARY_KEY_TEST AS
SELECT
*
FROM (
SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
)
;
SELECT *
FROM PRIMARY_KEY_TEST
;
PK | TEXT |
---|---|
1 | TEST_TEXT |
1 | TEST_TEXT |
NULL | NULL |
NULL | NULL |
-- These constraints will NOT throw any errors in Snowflake
ALTER TABLE PRIMARY_KEY_TEST ADD PRIMARY KEY (PK);
ALTER TABLE PRIMARY_KEY_TEST ADD UNIQUE (TEXT);
However knowing that a set of colums has values that are uniuqe for every row and never NULL
is vital to check when updating a set of data.
So I'm looking for a easy to write and read (ideally 1-2 lines) piece of code (proably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL.
Any Suggestions?
So I'm looking for a easy to write and read (ideally 1-2 lines) piece of code (proably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL
Such test query is easy to write using QUALIFY
and windowed COUNT. The pattern is to place primary key column list into PARTITION BY part and search for non-unique values, additional check for nulls could be added too. If the column list is a valid candidate for Primary key, it will not return any rows, if there are rows violating the rules they will be returned:
-- checking if PK is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL;
-- chekcing if TEXT column is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY TEXT) > 1
OR TEXT IS NULL;
-- chekcing if PK,TEXT columns are applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK,TEXT) > 1
OR PK IS NULL
OR TEXT IS NULL;
I'd still prefer code that can throw an error though
It is possible using Snowflake Scripting and RAISE exception:
BEGIN
LET my_exception EXCEPTION (-20002, 'Columns cannot be used as PK.');
IF (EXISTS(SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL
)) THEN
RAISE my_exception;
END IF;
END;
-20002 (P0001): Uncaught exception of type 'MY_EXCEPTION' on line 8 at position 5 : Columns cannot be used as PK.