Search code examples
sqlperformancesql-server-2005sql-execution-plan

SQL select field causing major performance problem


I have a stored procedure that joins in numerous tables and selects fields from them. One of the tables being a temporary table.

SELECT
    a.Field1,
    a.Field2,
    b.Field3,
    b.Field4,
    c.Field5
FROM table1 a
    LEFT JOIN #table2 b ON a.Field1 = b.Field1
    INNER JOIN table3 c ON a.Field1 = c.Field1

The above takes 10+ minutes, however if I comment out the two b fields from the select while leaving the join in place it runs in just seconds.

I have pulled this out of procedure to simplify and same behavior. Also the execution plans are almost identical.

Any help is appreciated.


Solution

  • How many rows are in the temp table, and is "Field2" in the temp table a primary key?

    If you're not selecting any rows from the right table of a left join, and the join is to the primary key (or possibly a unique key), and you reference no columns from the right table, SQL Server can avoid having to access the temp table at all (since the presence or absence of a joining row has no impact on the final result):

    Example. Table setup:

    create table T1 (
        ID int not null primary key,
        Col1 varchar(10) not null
    )
    go
    insert into T1 (ID,Col1)
    select 1,'a' union all
    select 2,'b' union all
    select 3,'c'
    go
    create table #t2 (
        ID int not null primary key,
        Col2 varchar(10) not null
    )
    go
    insert into #t2 (ID,Col2)
    select 1,'d' union all
    select 2,'e' union all
    select 4,'f'
    go
    create table #t3 (
        ID int not null,
        Col3 varchar(10) not null
    )
    go
    insert into #t3 (ID,Col3)
    select 1,'d' union all
    select 2,'e' union all
    select 1,'f'
    

    And the queries:

    select T1.ID,T1.Col1 from T1 left join #t2 t2 on T1.ID = t2.ID
    select T1.ID,T1.Col1,t2.Col2 from T1 left join #t2 t2 on T1.ID = t2.ID
    select T1.ID,T1.Col1 from T1 left join #t3 t3 on T1.ID = t3.ID
    select T1.ID,T1.Col1,t3.Col2 from T1 left join #t2 t3 on T1.ID = t3.ID
    

    In all but the first query, the join happens as expected. But because the presence or absence of rows in #t2 can't affect the final result for the first query, it avoids performing the join entirely.

    But if it's not something like that (and I'd expect it to be an obvious difference in the query plans)< I@m a bit stumped.