Search code examples
sqlsql-serverperformancequery-optimization

Why are columns not in my SELECT coming in the output when looking at the execution plan?


I am running the below query and the below index is present to satisfy the query.

SELECT top 100 RCLNT, RBUKRS, RYEAR, RACCT, DRCRK, SUM(ISNULL([000],0.00)) AS CSLVT,SUM(ISNULL([001],0.00)) AS CSL01,SUM(ISNULL([002],0.00)) AS CSL02,SUM(ISNULL([003],0.00)) AS CSL03,SUM(ISNULL([004],0.00)) AS CSL04,SUM(ISNULL([005],0.00)) AS CSL05,SUM(ISNULL([006],0.00)) AS CSL06,SUM(ISNULL([007],0.00)) AS CSL07,SUM(ISNULL([008],0.00)) AS CSL08,SUM(ISNULL([009],0.00)) AS CSL09,SUM(ISNULL([010],0.00)) AS CSL10,SUM(ISNULL([011],0.00)) AS CSL11,SUM(ISNULL([012],0.00)) AS CSL12,SUM(ISNULL([013],0.00)) AS CSL13,SUM(ISNULL([014],0.00)) AS CSL14,SUM(ISNULL([015],0.00)) AS CSL15,SUM(ISNULL([016],0.00)) AS CSL16
FROM dbo.ACDOCA WITH (NOLOCK)

PIVOT
(
MAX(ACDOCA.CSL) FOR ACDOCA.POPER IN ([000],[001],[002],[003],[004],[005],[006],[007],[008],[009],[010],[011],[012],[013],[014],[015],[016])
) AS ACDOC_PIV
GROUP BY RCLNT, RBUKRS, RYEAR, RACCT, DRCRK

Index:

CREATE NONCLUSTERED INDEX [IX_ACDOCA_RCLNT_RBUKRS_RYEAR_RACCT_DRCRK_INCL_CSL_POPER] ON [dbo].[ACDOCA]
(
    [RCLNT] ASC,
    [RBUKRS] ASC,
    [RYEAR] ASC,
    [RACCT] ASC,
    [DRCRK] ASC
)
INCLUDE([CSL],[POPER]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]

But, for some reason SQL Server is doing a Clustered Key lookup to fetch all the other columns in the table. Is there any reason why it's doing that and can it be avoided to improve performance?

Execution plan screenshot: enter image description here


Solution

  • As Itzik Ben Gan explains here

    The design of the PIVOT operator requires you to explicitly specify the aggregation and spreading elements, but lets SQL Server implicitly figure out the grouping element by elimination. Whichever [remaining] columns appear in the source table that is provided as the input to the PIVOT operator, they implicitly become the grouping element.

    If you use a table expression that only projects the relevant columns and PIVOT that instead you avoid the issue with unwanted/unexpected columns participating in the implicit GROUP BY. You should also get rid of the explicit GROUP BY and the SUM as then the implicit GROUP BY columns will be the same as your explicit grouping anyway so this was only needed due to the presence of those columns adding additional unwanted groups.

    So you can use (Fiddle)

    WITH PivotSource
         AS (SELECT RCLNT,
                    RBUKRS,
                    RYEAR,
                    RACCT,
                    DRCRK,
                    CSL,
                    POPER
             FROM   ACDOCA)
    SELECT TOP 100 RCLNT,
                   RBUKRS,
                   RYEAR,
                   RACCT,
                   DRCRK,
                   ISNULL([000], 0.00) AS CSLVT,
                   ISNULL([001], 0.00) AS CSL01,
                   ISNULL([002], 0.00) AS CSL02,
                   ISNULL([003], 0.00) AS CSL03,
                   ISNULL([004], 0.00) AS CSL04,
                   ISNULL([005], 0.00) AS CSL05,
                   ISNULL([006], 0.00) AS CSL06,
                   ISNULL([007], 0.00) AS CSL07,
                   ISNULL([008], 0.00) AS CSL08,
                   ISNULL([009], 0.00) AS CSL09,
                   ISNULL([010], 0.00) AS CSL10,
                   ISNULL([011], 0.00) AS CSL11,
                   ISNULL([012], 0.00) AS CSL12,
                   ISNULL([013], 0.00) AS CSL13,
                   ISNULL([014], 0.00) AS CSL14,
                   ISNULL([015], 0.00) AS CSL15,
                   ISNULL([016], 0.00) AS CSL16
    FROM   PivotSource
    PIVOT ( MAX(CSL)
          FOR POPER IN ([000],[001],[002],[003],[004],[005],[006],[007],[008],[009],[010],[011],[012],[013],[014],[015],[016]) ) AS ACDOC_PIV
    --ORDER BY /*Todo: TOP 100 ordered by what?*/