Search code examples
azure-sql-databasedatabase-partitioningazure-synapse

Azure SQL DWH Date Partitioning


Just keen to see a working example of how to partition a large table of (150 million rows with 30 columns), what are the best practices to partition such a big table by date (sample code please) Also, want to know how are these partitions are merged, switched out, and archived. Any TSQL code based implementation example is much appreciated.


Solution

  • Below how you partition a table by a date field:

    CREATE TABLE [dbo].[FactInternetSales]
    (
        [ProductKey]            int          NOT NULL
    ,   [OrderDateKey]          int          NOT NULL
    ,   [CustomerKey]           int          NOT NULL
    ,   [PromotionKey]          int          NOT NULL
    ,   [SalesOrderNumber]      nvarchar(20) NOT NULL
    ,   [OrderQuantity]         smallint     NOT NULL
    ,   [UnitPrice]             money        NOT NULL
    ,   [SalesAmount]           money        NOT NULL
    )
    WITH
    (   CLUSTERED COLUMNSTORE INDEX
    ,   DISTRIBUTION = HASH([ProductKey])
    ,   PARTITION   (   [OrderDateKey] RANGE RIGHT FOR VALUES
                        (20000101,20010101,20020101
                        ,20030101,20040101,20050101
                        )
                    )
    )
    ;
    

    Below is a sample partitioned columnstore table containing one row in each partition:

    CREATE TABLE [dbo].[FactInternetSales]
    (
            [ProductKey]            int          NOT NULL
        ,   [OrderDateKey]          int          NOT NULL
        ,   [CustomerKey]           int          NOT NULL
        ,   [PromotionKey]          int          NOT NULL
        ,   [SalesOrderNumber]      nvarchar(20) NOT NULL
        ,   [OrderQuantity]         smallint     NOT NULL
        ,   [UnitPrice]             money        NOT NULL
        ,   [SalesAmount]           money        NOT NULL
    )
    WITH
    (   CLUSTERED COLUMNSTORE INDEX
    ,   DISTRIBUTION = HASH([ProductKey])
    ,   PARTITION   (   [OrderDateKey] RANGE RIGHT FOR VALUES
                        (20000101
                        )
                    )
    )
    ;
    
    INSERT INTO dbo.FactInternetSales
    VALUES (1,19990101,1,1,1,1,1,1);
    INSERT INTO dbo.FactInternetSales
    VALUES (1,20000101,1,1,1,1,1,1);
    
    
    CREATE STATISTICS Stat_dbo_FactInternetSales_OrderDateKey ON dbo.FactInternetSales(OrderDateKey);
    

    SQL Data Warehouse supports partition splitting, merging, and switching. Each of these functions is excuted using the ALTER TABLE statement.

    To create a partitioned table on Azure SQL Data Warehouse from data coming from another table, you can use CTAS as shown below:

    CREATE TABLE dbo.FactInternetSales_20000101
        WITH    (   DISTRIBUTION = HASH(ProductKey)
                ,   CLUSTERED COLUMNSTORE INDEX
                ,   PARTITION   (   [OrderDateKey] RANGE RIGHT FOR VALUES
                                    (20000101
                                    )
                                )
                )
    AS
    SELECT *
    FROM    FactInternetSales
    WHERE   1=2
    ;
    

    For more information, please visit this documentation.