Search code examples
sql-servert-sqlsql-server-2014

Does SQL Server allow including a computed column in a non-clustered index? If not, why not?


When a column is included in non-clustered index, SQL Server copies the values for that column from the table into the index structure (B+ tree). Included columns don't require table look up.

If the included column is essentially a copy of original data, why does not SQL Server also allow including computed columns in the non-clustered index - applying the computations when it is copying/updating the data from table to index structure? Or am I just not getting the syntax right here?

Assume:

  • DateOpened is datetime
  • PlanID is varchar(6)

This works:

create nonclustered index ixn_DateOpened_CustomerAccount
on dbo.CustomerAccount(DateOpened)
include(PlanID)

This does not work with left(PlanID, 3):

create nonclustered index ixn_DateOpened_CustomerAccount
on dbo.CustomerAccount(DateOpened)
include(left(PlanID, 3))

or

create nonclustered index ixn_DateOpened_CustomerAccount
on dbo.CustomerAccount(DateOpened)
include(left(PlanID, 3) as PlanType)

My use case is somewhat like below query.

select 
    case 
        when left(PlanID, 3) = '100' then 'Basic'
        else 'Professional'
    end as 'PlanType'
from 
    CustomerAccount
where
    DateOpened between '2016-01-01 00:00:00.000' and '2017-01-01 00:00:00.000'

The query cares only for the left 3 of PlanID and I was wondering instead of computing it every time the query runs, I would include left(PlanID, 3) in the non-clustered index so the computations are done when the index is built/updated (fewer times) instead at the query time (frequently)

EDIT: We use SQL Server 2014.


Solution

  • As Laughing Vergil stated - you CAN index persisted columns provided that they are persisted. You have a few options, here's a couple:

    Option 1: Create the column as PERSISTED then index it

    (or, in your case, include it in the index)

    First the sample data:

    CREATE TABLE dbo.CustomerAccount
    (
      PlanID int PRIMARY KEY, 
      DateOpened datetime NOT NULL,
      First3 AS LEFT(PlanID,3) PERSISTED
    );
    INSERT dbo.CustomerAccount (PlanID, DateOpened)
    VALUES (100123, '20160114'), (100999, '20151210'), (255657, '20150617');
    

    and here's the index:

    CREATE NONCLUSTERED INDEX nc_CustomerAccount ON dbo.CustomerAccount(DateOpened) 
    INCLUDE (First3);
    

    Now let's test:

    -- Note: IIF is available for SQL Server 2012+ and is cleaner
    SELECT PlanID, PlanType = IIF(First3 = 100, 'Basic', 'Professional')
    FROM dbo.CustomerAccount;
    

    Execution Plan: enter image description here As you can see- the optimizer picked the nonclustered index.

    Option #2: Perform the CASE logic inside your table DDL First the updated table structure:

    DROP TABLE dbo.CustomerAccount;
    CREATE TABLE dbo.CustomerAccount
    (
      PlanID int PRIMARY KEY, 
      DateOpened datetime NOT NULL,
      PlanType AS 
        CASE -- NOTE: casting as varchar(12) will make the column a varchar(12) column:
          WHEN LEFT(PlanID,3) = 100 THEN CAST('Basic' AS varchar(12))
          ELSE 'Professional' 
        END 
      PERSISTED
    );
    INSERT dbo.CustomerAccount (PlanID, DateOpened)
    VALUES (100123, '20160114'), (100999, '20151210'), (255657, '20150617');
    

    Notice that I use CAST to assign the data type, the table will be created with this column as varchar(12).

    Now the index:

    CREATE NONCLUSTERED INDEX nc_CustomerAccount ON dbo.CustomerAccount(DateOpened) 
    INCLUDE (PlanType);
    

    Let's test again:

    SELECT DateOpened, PlanType FROM dbo.CustomerAccount;
    

    Execution plan:

    enter image description here

    ... again, it used the nonclustered index

    A third option, which I don't have time to go into, would be to create an indexed view. This would be a good option for you if you were unable to change your existing table structure.