Search code examples
sqlt-sqloptimizationsql-execution-planlocal-variables

Big difference in Estimated and Actual rows when using a local variable


This is my first post on Stackoverflow so I hope I'm correctly following all protocols!

I'm struggling with a stored procedure in which I create a table variable and filling this table with an insert statement using an inner join. The insert itself is simple, but it gets complicated because the inner join is done on a local variable. Since the optimizer doesn't have statistics for this variable my estimated row count is getting srewed up.

The specific piece of code that causes trouble:

declare @minorderid int
select @minorderid = MIN(lo.order_id)
from [order] lo with(nolock)
where lo.order_datetime >= @datefrom

insert into @OrderTableLog_initial
(order_id, order_log_id, order_id, order_datetime, account_id, domain_id)

    select ot.order_id, lol.order_log_id, ot.order_id, ot.order_datetime, ot.account_id, ot.domain_id
    from [order] ot with(nolock)

inner join order_log lol with(nolock)
on ot.order_id = lol.order_id
and ot.order_datetime >= @datefrom

where (ot.domain_id in (1,2,4) and lol.order_log_id not in ( select order_log_id 
                                        from dbo.order_log_detail lld with(nolock)
                                        where order_id >= @minorderid
                                    )
or
(ot.domain_id = 3 and ot.order_id not IN (select order_id 
                                        from dbo.order_log_detail_spa llds with(nolock)
                                        where order_id >= @minorderid
                                        )
))

order by lol.order_id, lol.order_log_id

The @datefrom local variable is also declared earlier in the stored procedure:

    declare @datefrom datetime
    if datepart(hour,GETDATE()) between 4 and 9
    begin
        set @datefrom = '2011-01-01'
    end
    else
    begin
        set @datefrom = DATEADD(DAY,-2,GETDATE())   
    end

I've also tested this with a temporary table in stead of a table variable, but nothing changes. However, when I replace the local variable >= @datefrom with a fixed datestamp then my estimates and actuals are almost the same.

ot.order_datetime >= @datefrom = SQL Sentry Plan Explorer

ot.order_datetime >= '2017-05-03 18:00:00.000' = SQL Sentry Plan Explorer

I've come to understand that there's a way to fix this by turning this code into a dynamic sp, but I'm not sure how to do this. I would be grateful if someone could give me suggestions on how to do this. Maybe I have to use a complete other approach? Forgive me if I forgot something to mention, this is my first post.

EDIT:

MSSQL version = 11.0.5636

I've also tested with trace flag 2453, but with no success

Best regards, Peter


Solution

  • Indeed, the behavior what you are experiencing is because of the variables. SQL Server won't store an execution plan for each and every possible inputs, thus for some queries the execution plan may or may not be optimal.

    To answer your question: You'll have to create an nvarchar variable and build the query as a string, then execute it.

    Some notes before the actual code:

    • This can be prone to SQL injection (in general)
    • SQL Server will store the plans separately, meaning they will use more memory and possibly knock out other plans from the cache

    Using an imaginary setup, this is what you want to do:

    DECLARE @inputDate DATETIME2 = '2017-01-01 12:21:54';
    
    DELCARE @dynamiSQL NVARCHAR(MAX) = CONCAT('SELECT col1, col2 FROM MyTable WHERE myDateColumn = ''', FORMAT(@inputDate, 'yyyy-MM-dd HH:mm:ss'), ''';');
    
    INSERT INTO @myTableVar (col1, col2)
    EXEC sp_executesql @stmt = @dynamicSQL;
    

    As an additional note:

    • you can try to use EXISTS and NOT EXISTS instead of IN and NOT IN.
    • You can try to use a temp table (#myTempTable) instead of a local variable and put some indexes on it. Physical temp tables can perform better with large amount of data and you can put indexes on it. (For more info you can go here: What's the difference between a temp table and table variable in SQL Server? or to the official documentation)