Search code examples
sqlsql-serverdatabaserdbms

select case performance


Assume that we have this table:

|------------------------------|
| col           | data type    | 
|------------------------------|
| id            | bigint       |
| first_name    | nvarchar     |
| last_name     | nvarchar     |
| photo         | binary       |
|------------------------------|

ok now we want two queries for this table:

  • list all table rows without photo col(load it as null)
  • load specific row with photo col

For this we want to use SP

The first approach going to be like this:

SELECT Top 50
    id,
    first_name,
    last_name,
    CASE WHEN @id IS NULL THEN null ELSE photo END as photo
FROM MyTable
WHERE @id IS NULL OR id=@id

and the second approach is:

IF (@id IS NULL)
    SELECT Top 50
        id,
        first_name,
        last_name,
        null as photo
    FROM MyTable
ELSE
    SELECT 
        id,
        first_name,
        last_name,
        photo
    FROM MyTable
    WHERE id=@id
END

The second approach obviously check the if condition one time and do the work.

But I like to write my SPs like first approach and I don't know if sql server going to check the CASE WHEN @id IS NULL for each row or the query optimizer going to optimize the query without CASE?

Edit 1:

I want to know if the first query would always executed without case or not? because it is checking a variable that doesn't change value and its not related to any column value.


Solution

  • Since some answers were posted, I decided to check it out for myself.

    Setup on SQL Server 2014

    -- table is not identical to yours, but it should do.
    create table t1 (
      id int identity primary key,
      first_name varchar(50),
      last_name varchar(50)
    )
    go
    
    -- insert ~10,000,000 rows of randomly generated data.
    with cte as (
      select 1 as rn, newid() as first_name, newid() as last_name
      union all
      select t.rn + 1 as rn, newid() as first_name, newid() as last_name
        from cte t
       where t.rn < 10000000
    )
    insert into t1 (first_name, last_name)
    select first_name, last_name
    from cte
    option (maxrecursion 0)
    go
    
    update statistics
    go
    

    Query #1 - OP's unified query approach:

    declare @id int = 5000000
    
    SELECT Top 50
        id,
        first_name,
        last_name
    FROM t1
    WHERE @id IS NULL OR id = @id
    go
    

    Execution time: 13 seconds

    Execution plan:

    enter image description here

    Query #2 - OP's query in the ELSE clause:

    declare @id int = 5000000
    
    SELECT 
       id,
       first_name,
       last_name
    FROM t1
    WHERE id=@id
    go
    

    Execution time: 0 seconds

    Execution plan:

    enter image description here

    Query #3 - @Ann's query:

    declare @id int = 5000000
    
    SELECT Top 50
           id,
           first_name,
           last_name
    FROM   t1
    WHERE  @id IS NULL
    UNION ALL
    SELECT id,
           first_name,
           last_name
    FROM   t1
    WHERE  id = @id
    go
    

    Execution time: 0 seconds

    Execution plan:

    enter image description here

    Query #4 - @Hogan's query:

    declare @id int = 5000000
    
    SELECT Top 50
           id,
           first_name,
           last_name
    FROM   t1
    WHERE  id = COALESCE(@id, id)
    go
    

    Execution time: 14 seconds

    Execution plan:

    enter image description here

    Conclusion

    Both your attempt, and Hogan's, to unify the logic in a single query don't perform as well as having a separate query for the case where @id has a value. In both cases, you can see that the query essentially scans the whole clustered index, rather than performing the more straight forward index seek.

    Interestingly, though Ann's execution plan appears to be the most complex of all, the performance suggests that the optimizer was somehow able to detect that @id was not null and short-circuit the expensive cluster index scan in that case.

    Still, since you're already inside a stored procedure, it seems to me that you should stick with the if-else approach. Seems like the safest way to go.