Search code examples
sqlsql-serverindexingnon-clustered-index

If a table has 'id' column as it's clustered index, is there any benefit in adding 'id' column as an included column in any other non-clustered index


If a table has 'id' (the primary key) column as its clustered index, is there any benefit in adding 'id' column as an included column in any other non-clustered index in Microsoft SQLServer?

eg:- Table 'xyz'

id name status date
1 abc active 2021-06-23
CREATE NONCLUSTERED INDEX [NonClusteredIndex_status_Date]
                ON [xyz]
                    (   
                        [status] ASC,
                        [date] ASC
                    )
                INCLUDE
                    (   [id],
                        [name]
                    ) 

And this non-clustered index is targeted for a query similar to bellow on a large data set. In the actual case there could be some other queries as well.

select * from xyz where status='active' and date > '2021-06-20'

Solution

  • This answer assumes you are planning to run the following query:

    SELECT * FROM xyz WHERE status = 'active' AND date > '2021-06-20';
    

    If you only created a non clustered index on (status, date), then it would cover the WHERE clause, but not the SELECT clause. What this means is that SQL Server might choose to use the index to find the matching records in the query. But when it gets to evaluating the SELECT clause, it would be forced to seek back to the clustered index to find the values for the columns not included in the index, other than the id clustered index column (which includes the name column in this case). There is performance penalty in doing this, and SQL Server might, depending on your data, even choose to not use the index because it does not completely cover the query.

    To mitigate this, you can define the index in your question, where you include the name value in the leaf nodes. Note that id will be included by default in the index, so we do not need to explicitly INCLUDE it. By following this approach, the index itself is said to completely cover the query, meaning that SQL Server may use the index for the entire query plan. This can lead to fast performance in many cases.