Search code examples
t-sqlindexingsql-server-2008-r2query-performance

Convert to SARGable query


I want to write a query to search the containing string in the table.

Table:

Create table tbl_sarg
(
    colname varchar(100),
    coladdres varchar(500)
);

Note: I just want to use Index Seek for searching on 300 millions of records.

Index:

create nonclustered index ncidx_colname on tbl_sarg(colname);

Sample Records:

insert into tbl_sarg values('John A Mak','HNo 102 Street Road Uk');
insert into tbl_sarg values('Shawn A Meben','Church road USA');
insert into tbl_sarg values('Lee Decose','ShopNo 22 K Mark UK');
insert into tbl_sarg values('James Don','A Mall, 90 feet road UAE');

Query 1:

select * from tbl_sarg
where colname like '%ee%'

Actual Execution Plan:

enter image description here

Query 2:

select * from tbl_sarg
where charindex('ee',colname)>0

Actual Execution Plan:

enter image description here

Query 3:

select * from tbl_sarg
where patindex('%ee%',colname)>0

Actual Execution Plan:

enter image description here

How to force the query processor to use the index seek instead table/index scan on large data set?


Solution

  • A search argument, or SARG in short, is a filter predicate that enables the optimizer to rely on index order. The filter predicate uses the following form (or a variant with two delimiters of a range, or with the operand positions flipped): WHERE <column> <operator> <expression>

    Such a filter is sargable if:

    1. You don’t apply manipulation to the filtered column.

    2. The operator identifies a consecutive range of qualifying rows in the index. That’s the case with operators like =, >, >=, <, <=, BETWEEN, LIKE with a known prefix, and so on. That’s not the case with operators like <>, LIKE with a wildcard as a prefix.

    In most cases, when you apply manipulation to the filtered column, the optimizer doesn’t try to be too smart and understand the meaning of the calculation, and if index ordering can still be relied on. It simply assumes that the result values might sort differently than the source values, and therefore index ordering can’t be trusted.

    So why doesn’t SQL Server use the index for the %ee% query? Pretend for a moment that you held a phone book in your hand, and I asked you to find everyone whose last name contains the letters %ee%. You would have to scan every single page in the phone book, because the results would include things like:

    • Anne Lee

    • Lee Yung

    • Kathlee

    • Aleen

      When I asked you for all last names containing %ee% anywhere in the name, my query was not sargable – meaning, you couldn’t leverage the indexes to do an index seek.

    That’s where SQL Server’s Full Text Search comes in.