
Clustered vs Nonclustered for a Reference table


I have a simple product table that keeps track of product data. Most of the time I don't need to know what type of product it is, but every once in a while I need the product type. Since not all products have a type (which would result in a lot of NULL values), I use a reference table and join to it when I need the product type. The reference table uses a composite key, and what I'm trying to figure out is whether its primary key should be a clustered index or a nonclustered index. The product table has a clustered index for its primary key, so I was wondering whether the join would be more efficient if the reference table's primary key were also clustered (so that the IDs in both tables are stored in the same order). Or is this ignored during the join, and thus the nonclustered index would be more efficient since it doesn't do a key lookup?

CREATE TABLE [dbo].[sales_product_type]
(
    [FK_product_id] [int] NOT NULL,
    [product_type] [int] NOT NULL,
    [type_description] [nvarchar](max) NULL,

    CONSTRAINT [PK_sales_product_type] 
        PRIMARY KEY CLUSTERED ([FK_product_id] ASC, [product_type] ASC)
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[sales_product]
(
    [product_id] [int] IDENTITY(1,1) NOT NULL,
    [FK_store_id] [int] NOT NULL,
    [price] [int] NOT NULL,
    [product_name] [nvarchar](max) NOT NULL,
    [units] [int] NULL,

    CONSTRAINT [PK_sales_product] 
        PRIMARY KEY CLUSTERED ([product_id] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

Solution

  • If you need the [type_description] column when you query for the product type, go with the clustered index. The leaf level of a clustered index holds every column of the table (including the key columns FK_product_id and product_type), so the row can be returned without any further lookup.

    On the other hand, if you had only a non-clustered index on FK_product_id and product_type, the table itself would be a heap, and any query that also needs type_description would have to do a RID lookup into that heap for every row in the result set.

    So if you need type_description in the result, you should keep a clustered index.
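
    To make the difference concrete, here is a minimal sketch of the non-clustered alternative. The table name sales_product_type_nc and its constraint name are made up for illustration, and the plan you actually get depends on your data and statistics, so treat the comments as what to look for rather than a guarantee.

    ```sql
    -- Hypothetical variant of the reference table: same columns, but the primary key
    -- is NONCLUSTERED, so the table itself is stored as a heap.
    CREATE TABLE [dbo].[sales_product_type_nc]
    (
        [FK_product_id] [int] NOT NULL,
        [product_type] [int] NOT NULL,
        [type_description] [nvarchar](max) NULL,

        CONSTRAINT [PK_sales_product_type_nc]
            PRIMARY KEY NONCLUSTERED ([FK_product_id] ASC, [product_type] ASC)
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY];
    GO

    -- Both referenced columns are index keys, so this join can be answered
    -- from the non-clustered index alone.
    SELECT p.[product_id], t.[product_type]
    FROM [dbo].[sales_product] AS p
    LEFT JOIN [dbo].[sales_product_type_nc] AS t
        ON t.[FK_product_id] = p.[product_id];

    -- type_description is not part of the index. If the optimizer uses the index,
    -- expect a RID Lookup operator in the plan for each matching row;
    -- it may also decide that scanning the heap is cheaper.
    SELECT p.[product_id], t.[product_type], t.[type_description]
    FROM [dbo].[sales_product] AS p
    LEFT JOIN [dbo].[sales_product_type_nc] AS t
        ON t.[FK_product_id] = p.[product_id];
    ```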


    But in your particular scenario, the choice matters less if type_description values are large. An nvarchar(max) value is stored out-of-row once it exceeds 8,000 bytes (roughly 4,000 characters for nvarchar), and the row then holds only a pointer to it. So for those rows the engine has to perform an extra read to get the value, whichever index you choose.
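
    To check whether this applies to your data, something like the following may help. It is a rough sketch that assumes the default setting of the 'large value types out of row' table option; note that a value under 8,000 bytes can still be moved off-row when the row as a whole does not fit on a page.

    ```sql
    -- How many descriptions are too large to ever be stored in-row?
    SELECT COUNT(*) AS total_rows,
           SUM(CASE WHEN DATALENGTH([type_description]) > 8000 THEN 1 ELSE 0 END)
               AS rows_over_8000_bytes
    FROM [dbo].[sales_product_type];

    -- Which allocation units does the table actually use?
    -- IN_ROW_DATA holds the regular rows, LOB_DATA holds out-of-row max values.
    SELECT p.[index_id],
           au.[type_desc],
           au.[total_pages],
           au.[used_pages]
    FROM sys.partitions AS p
    JOIN sys.allocation_units AS au
        ON (au.[type] IN (1, 3) AND au.[container_id] = p.[hobt_id])
        OR (au.[type] = 2 AND au.[container_id] = p.[partition_id])
    WHERE p.[object_id] = OBJECT_ID(N'dbo.sales_product_type');
    ```

    If LOB_DATA shows a significant number of used pages, many descriptions already live outside the row.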


    If you are not going to query type_description that often, a non-clustered index might result in far fewer reads, because the index contains only the key columns and the engine doesn't have to read past the type_description data. But I would test both approaches before deciding on one.
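
    One way to run that test is to compare logical reads for the same join against both layouts. The sketch below builds a throwaway heap copy of the reference table; the name sales_product_type_copy is made up for this test, and the numbers only mean something on a copy of realistic data.

    ```sql
    -- Throwaway copy of the reference table, stored as a heap
    -- with a NONCLUSTERED primary key (hypothetical name).
    SELECT [FK_product_id], [product_type], [type_description]
    INTO [dbo].[sales_product_type_copy]
    FROM [dbo].[sales_product_type];

    ALTER TABLE [dbo].[sales_product_type_copy]
        ADD CONSTRAINT [PK_sales_product_type_copy]
        PRIMARY KEY NONCLUSTERED ([FK_product_id] ASC, [product_type] ASC);
    GO

    -- Report page reads and timings in the Messages tab.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    GO

    -- Variant 1: reference table with the CLUSTERED primary key.
    SELECT p.[product_id], p.[price], t.[product_type]
    FROM [dbo].[sales_product] AS p
    LEFT JOIN [dbo].[sales_product_type] AS t
        ON t.[FK_product_id] = p.[product_id];

    -- Variant 2: heap copy with the NONCLUSTERED primary key.
    SELECT p.[product_id], p.[price], t.[product_type]
    FROM [dbo].[sales_product] AS p
    LEFT JOIN [dbo].[sales_product_type_copy] AS t
        ON t.[FK_product_id] = p.[product_id];
    GO

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    GO

    -- Clean up when done:
    -- DROP TABLE [dbo].[sales_product_type_copy];
    ```

    Compare the "logical reads" and "lob logical reads" reported for each table, then repeat with type_description added to the select list to see how the picture changes.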

    In general, I would always have a clustered index on the table. If required, I might add a non-clustered index to tune particular queries.
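
    As an illustration of that last point, suppose you often need to find all products of a given type. That query is an assumption on my part, not something from the question, and the index name below is made up. You could keep the clustered primary key and add a narrow non-clustered index for it:

    ```sql
    -- Keep the clustered primary key and add a narrow index for lookups by type.
    CREATE NONCLUSTERED INDEX [IX_sales_product_type_product_type]
        ON [dbo].[sales_product_type] ([product_type] ASC);
    GO

    -- A non-clustered index on a clustered table also carries the clustering key,
    -- so FK_product_id is available here without touching the base rows.
    DECLARE @product_type int = 1;

    SELECT t.[FK_product_id]
    FROM [dbo].[sales_product_type] AS t
    WHERE t.[product_type] = @product_type;
    ```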