Search code examples
sql-serverindexingsql-server-2016data-warehousecolumnstore

SQL Replace All Tables with Clustered Columnstore Index


We are conducting a migration project, and looking to replace most Rowstore indexes with Clustered Columnstore indexes for large Data Warehouse. We are adding a unique index on the identity column.

Does anyone have script to alter run through all 100+ tables, and replace primary key clustered index with Columnstore Index?

Testing to see if columnstore index will help performance tuning if we migrate.

*By the way, are Identity columns recommended in Columnstore? Need way to identify each row (sometimes identity columns may reset, so placing unique constraint, Or should Guids be used?) If not identity, feel free to remove identity column with Guid or something else .

Current:

CREATE TABLE [dbo].[Fact_SalesTransaction] 
(
    [FactSalesTransactionId]        INT              IDENTITY (1, 1) NOT NULL,
    [DimCustomerId]                 INT              NOT NULL,
    [DimSellerId]                   INT              NOT NULL,
    [SalesDatetime]                 DATETIME         NULL,
    [DimSalesDateId]                INT              NULL,
    [SalesAmount]                   DECIMAL (28, 2)  NULL,
    [ETLCreateDate]                 DATETIME         NULL,
    CONSTRAINT [pk_SalesTransactionId] PRIMARY KEY CLUSTERED ([SalesTransactionId] ASC)
);

Expected:

CREATE TABLE [dbo].[Fact_SalesTransaction] 
(
    [FactSalesTransactionId]        INT              IDENTITY (1, 1) NOT NULL,
    [DimCustomerId]                 INT              NOT NULL,
    [DimSellerId]                   INT              NOT NULL,
    [SalesDatetime]                 DATETIME         NULL,
    [DimSalesDateId]                INT              NULL,
    [SalesAmount]                   DECIMAL (28, 2)  NULL,
    [ETLCreateDate]                 DATETIME         NULL,
);


CREATE CLUSTERED COLUMNSTORE INDEX ccx_Fact_SalesTransaction ON Fact_SalesTransaction;  

CREATE UNIQUE INDEX unx_FactSalesTransactionId ON dbo.Fact_SalesTransaction (FactSalesTransactionId);  

We only want to use T-SQL to conduct this on an existing database.

Helpful Resource in Comment: Generate SQL Create Scripts for existing tables with Query


Solution

  • Devart answered this question asking about how to generate a script for a table. I tweaked his code to omit the primary key portion and replaced it with unique key and columnstore creation scripts following his general pattern. I also had to use marc_s's answer to this question about how to find out if a table's primary key is clustered or not. I put all this into a stored procedure called #scriptTable. This is what's below:

    create procedure #scriptTable
        @table_name sysname,
        @sql nvarchar(max) output
    as
    
    DECLARE 
          @object_name SYSNAME
        , @object_id INT
    
    SELECT 
          @object_name = '[' + s.name + '].[' + o.name + ']'
        , @object_id = o.[object_id]
    FROM sys.objects o WITH (NOWAIT)
    JOIN sys.schemas s WITH (NOWAIT) ON o.[schema_id] = s.[schema_id]
    WHERE s.name + '.' + o.name = @table_name
        AND o.[type] = 'U'
        AND o.is_ms_shipped = 0
    
    ;WITH index_column AS 
    (
        SELECT 
              ic.[object_id]
            , ic.index_id
            , ic.is_descending_key
            , ic.is_included_column
            , c.name
        FROM sys.index_columns ic WITH (NOWAIT)
        JOIN sys.columns c WITH (NOWAIT) ON ic.[object_id] = c.[object_id] AND ic.column_id = c.column_id
        WHERE ic.[object_id] = @object_id
    ),
    fk_columns AS 
    (
         SELECT 
              k.constraint_object_id
            , cname = c.name
            , rcname = rc.name
        FROM sys.foreign_key_columns k WITH (NOWAIT)
        JOIN sys.columns rc WITH (NOWAIT) ON rc.[object_id] = k.referenced_object_id AND rc.column_id = k.referenced_column_id 
        JOIN sys.columns c WITH (NOWAIT) ON c.[object_id] = k.parent_object_id AND c.column_id = k.parent_column_id
        WHERE k.parent_object_id = @object_id
    )
    SELECT @sql = 'CREATE TABLE ' + @object_name + CHAR(13) + '(' + CHAR(13) + STUFF((
        SELECT CHAR(9) + ', [' + c.name + '] ' + 
            CASE WHEN c.is_computed = 1
                THEN 'AS ' + cc.[definition] 
                ELSE UPPER(tp.name) + 
                    CASE WHEN tp.name IN ('varchar', 'char', 'varbinary', 'binary', 'text')
                           THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CAST(c.max_length AS VARCHAR(5)) END + ')'
                         WHEN tp.name IN ('nvarchar', 'nchar', 'ntext')
                           THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CAST(c.max_length / 2 AS VARCHAR(5)) END + ')'
                         WHEN tp.name IN ('datetime2', 'time2', 'datetimeoffset') 
                           THEN '(' + CAST(c.scale AS VARCHAR(5)) + ')'
                         WHEN tp.name = 'decimal' 
                           THEN '(' + CAST(c.[precision] AS VARCHAR(5)) + ',' + CAST(c.scale AS VARCHAR(5)) + ')'
                        ELSE ''
                    END +
                    CASE WHEN c.collation_name IS NOT NULL THEN ' COLLATE ' + c.collation_name ELSE '' END +
                    CASE WHEN c.is_nullable = 1 THEN ' NULL' ELSE ' NOT NULL' END +
                    CASE WHEN dc.[definition] IS NOT NULL THEN ' DEFAULT' + dc.[definition] ELSE '' END + 
                    CASE WHEN ic.is_identity = 1 THEN ' IDENTITY(' + CAST(ISNULL(ic.seed_value, '0') AS CHAR(1)) + ',' + CAST(ISNULL(ic.increment_value, '1') AS CHAR(1)) + ')' ELSE '' END 
            END + CHAR(13)
        FROM sys.columns c WITH (NOWAIT)
        JOIN sys.types tp WITH (NOWAIT) ON c.user_type_id = tp.user_type_id
        LEFT JOIN sys.computed_columns cc WITH (NOWAIT) ON c.[object_id] = cc.[object_id] AND c.column_id = cc.column_id
        LEFT JOIN sys.default_constraints dc WITH (NOWAIT) ON c.default_object_id != 0 AND c.[object_id] = dc.parent_object_id AND c.column_id = dc.parent_column_id
        LEFT JOIN sys.identity_columns ic WITH (NOWAIT) ON c.is_identity = 1 AND c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
        WHERE c.[object_id] = @object_id
        ORDER BY c.column_id
        FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, CHAR(9) + ' ')
        + ')'  + CHAR(13)
        + ISNULL((SELECT (
            SELECT CHAR(13) +
                 'ALTER TABLE ' + @object_name + ' WITH' 
                + CASE WHEN fk.is_not_trusted = 1 
                    THEN ' NOCHECK' 
                    ELSE ' CHECK' 
                  END + 
                  ' ADD CONSTRAINT [' + fk.name  + '] FOREIGN KEY(' 
                  + STUFF((
                    SELECT ', [' + k.cname + ']'
                    FROM fk_columns k
                    WHERE k.constraint_object_id = fk.[object_id]
                    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
                   + ')' +
                  ' REFERENCES [' + SCHEMA_NAME(ro.[schema_id]) + '].[' + ro.name + '] ('
                  + STUFF((
                    SELECT ', [' + k.rcname + ']'
                    FROM fk_columns k
                    WHERE k.constraint_object_id = fk.[object_id]
                    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
                   + ')'
                + CASE 
                    WHEN fk.delete_referential_action = 1 THEN ' ON DELETE CASCADE' 
                    WHEN fk.delete_referential_action = 2 THEN ' ON DELETE SET NULL'
                    WHEN fk.delete_referential_action = 3 THEN ' ON DELETE SET DEFAULT' 
                    ELSE '' 
                  END
                + CASE 
                    WHEN fk.update_referential_action = 1 THEN ' ON UPDATE CASCADE'
                    WHEN fk.update_referential_action = 2 THEN ' ON UPDATE SET NULL'
                    WHEN fk.update_referential_action = 3 THEN ' ON UPDATE SET DEFAULT'  
                    ELSE '' 
                  END 
                + CHAR(13) + 'ALTER TABLE ' + @object_name + ' CHECK CONSTRAINT [' + fk.name  + ']' + CHAR(13)
            FROM sys.foreign_keys fk WITH (NOWAIT)
            JOIN sys.objects ro WITH (NOWAIT) ON ro.[object_id] = fk.referenced_object_id
            WHERE fk.parent_object_id = @object_id
            FOR XML PATH(N''), TYPE).value('.', 'NVARCHAR(MAX)')), '')
        + ISNULL(((SELECT
             CHAR(13) + 'CREATE' + CASE WHEN i.is_unique = 1 THEN ' UNIQUE' ELSE '' END 
                    + ' NONCLUSTERED INDEX [' + i.name + '] ON ' + @object_name + ' (' +
                    STUFF((
                    SELECT ', [' + c.name + ']' + CASE WHEN c.is_descending_key = 1 THEN ' DESC' ELSE ' ASC' END
                    FROM index_column c
                    WHERE c.is_included_column = 0
                        AND c.index_id = i.index_id
                    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + ')'  
                    + ISNULL(CHAR(13) + 'INCLUDE (' + 
                        STUFF((
                        SELECT ', [' + c.name + ']'
                        FROM index_column c
                        WHERE c.is_included_column = 1
                            AND c.index_id = i.index_id
                        FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + ')', '')  + CHAR(13)
            FROM sys.indexes i WITH (NOWAIT)
            WHERE i.[object_id] = @object_id
                AND i.is_primary_key = 0
                AND i.[type] = 2
            FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)')
        ), '')
        + char(13)
    
    -- Replaced "create primary key" logic in the original with what's below:
    
        +   ISNULL(
                (
                    select  'create clustered columnstore index [ccx_' + @table_name + '] on ' + @object_name + char(13) +
                            'create unique ' + convert(nvarchar(max), i.type_desc) + ' index [' + replace(k.name, 'pk_', 'unx_') + '] ' +
                                'on ' + @object_name + ' (' + 
                                    (
                                        SELECT STUFF((
                                            SELECT  ', [' + ic.name + '] ' + CASE WHEN ic.is_descending_key = 1 THEN 'DESC' ELSE 'ASC' END
                                            FROM    index_column ic
                                            WHERE   ic.is_included_column = 0
                                            AND     ic.[object_id] = k.parent_object_id 
                                            AND     ic.index_id = k.unique_index_id     
                                            FOR XML PATH(N''), TYPE
                                        ).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
                                    ) + 
                                ')' + CHAR(13)
                    FROM sys.key_constraints k WITH (NOWAIT)
                    join sys.indexes i with (nowait) on k.unique_index_id = i.index_id and k.parent_object_id = i.object_id
                    WHERE k.parent_object_id = @object_id 
                        AND k.[type] = 'PK'
                ), 
                ''
            );
    

    You can use #scriptTable like this:

    declare @sql nvarchar(max);
    exec #scriptTable 'dbo.Fact_SalesTransaction', @sql output;
    print (@sql);
    

    Replace 'print' with 'exec' or use sp_executeSql when ready.

    To use it on all tables, first capture the tables you want to modify:

    declare @tables table (
        rowId int identity(1,1),
        name nvarchar(max)
    );
    
    insert      @tables
    select      schema_name(schema_id) + '.' + name 
    from        sys.tables
    where       type_desc = 'user_table'
    

    Now you're ready to loop the tables and apply #scriptTable:

    declare @rowId int = 1;
    declare @table nvarchar(max);
    declare @sql nvarchar(max);
    
    while @rowId <= (select max(rowId) from @tables) begin
        select @table = name from @tables where rowId = @rowId;
        exec #scriptTable @table, @sql output;
        print (@sql); -- turn 'print' into 'exec' or otherwise use sp_executeSql
        set @rowId += 1;
    end
    

    As before, replace 'print' with 'exec' or use sp_executeSql when ready.

    Note that my portion may need further modification if you have table names that will require brackets around them.

    EDIT: Updated the code a bit to simplify (slightly) and to work with table names needing brackets around them.