Search code examples
sql-serveretldata-warehousefact-table

Foreign key referential integrity in the fact tables - T-SQL


I have this fact table:

CREATE TABLE [dbo].[FACT_TABLE]
(
    [A_id] [int] NOT NULL,
    [B_id] [int] NOT NULL,
    [C_id] [int] NOT NULL,
    [D_id] [int] NOT NULL,
    [FACT_total] [float] NOT NULL,
    [FACT_average] [float] NOT NULL,

    CONSTRAINT [PK_FACT_TABLE] 
        PRIMARY KEY CLUSTERED ([A_id] ASC, [B_id] ASC,
                               [C_id] ASC, [D_id] ASC)
) ON [PRIMARY]

I have a T-SQL Script to load data into my fact table:

INSERT INTO [dbo].[FACT_TABLE] ([A_id], [B_id], [C_id], [company_id], [D_id],[FACT_total], [FACT_average])
    SELECT 
        b.A_id, c.B_id, d.C_id, e.D_id,
        AVG(FACT_total) AS FACT_total,
        AVG(FACT_average) AS FACT_average
    FROM 
        [dbo].[staging_Area] a
    INNER JOIN 
        [dbo].[DIM_A] b ON a.A_id = b.A_id
    INNER JOIN 
        [dbo].[DIM_B] c ON a.[B_id] = c.B_id
    INNER JOIN 
        [dbo].[DIM_C] d ON a.[C_id] = d.C_id 
    INNER JOIN 
        [dbo].[DIM_D] e ON a.D_id = e.D_id
    GROUP BY  
        b.A_id, c.B_id, d.C_id, e.D_id 

How can I guarantee that I don't insert duplicate values in my fact table in order to not get referential integrity error?

I tried with the following statement after that query:

EXCEPT 
SELECT *
FROM [dbo].[FACT_TABLE] 

But I think that isn't the best approach for this...

Anyone have another alternative for this? Sorry but I can't put the real column and tables names...

Thanks!


Solution

  • LEFT OUTER JOIN with WHERE FT.Column IS NULL to Fact table to ensure you're not inserting a record that already exists:

    INSERT INTO [dbo].[FACT_TABLE] ([A_id],[B_id],[C_id],[company_id],[D_id],[FACT_total],[FACT_average])
    SELECT b.A_id
            ,c.B_id
            ,d.C_id
            ,e.D_id
            ,AVG(FACT_total) AS FACT_total
            ,AVG(FACT_average) AS FACT_average
    FROM [dbo].[staging_Area] a
    INNER JOIN [dbo].[DIM_A] b ON
        a.A_id = b.A_id
    INNER JOIN [dbo].[DIM_B] c ON
        a.[B_id] = c.B_id
    INNER JOIN [dbo].[DIM_C] d ON
        a.[C_id] = d.C_id 
    INNER JOIN [dbo].[DIM_D] e ON
        a.D_id = e.D_id
    LEFT OUTER JOIN dbo.[FACT_TABLE] FT
        ON b.A_ID = FT.A_id
        AND c.B_id = FT.B_id
        AND D.C_ID = FT.C_id
        AND E.D_id = FT.D_id
    WHERE FT.A_ID IS NULL
    GROUP BY  b.A_id
            ,c.B_id
            ,d.C_id
            ,e.D_id