Search code examples
sql-server-2012full-text-searchquery-performancetable-valued-parametersquery-tuning

Searching 13 million records using full text search with additional conditions


Performance issue while doing SQL Server full text search with additional conditions. (SQL Server 2012)

I am trying to filter the data based on search filters list (table value parameter), which will return all the records for match filters and single record for the filter doesn't have any record from tables.

Full text search index is already on table Names for column SNAME.

In stored procedure, table type parameter SearchFilter is used to pass list of name and address info.

Both tables have more than 14 million records, when we execute the procedure with 1000 unique records passed in filters list it took around 7 minutes to return the result (1400 records).

Filter criteria is: contains(name) and streetaddress, city, state, zip exact match.

Is there any alternate to avoid while loop as SQL Server CONTAINS function required string value or variable?

CREATE TABLE [dbo].[Names]
(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [UIN] [varchar](9) NULL,
    [SNAME] [varchar](500) NULL,
    CONSTRAINT [PK_Names] 
        PRIMARY KEY CLUSTERED ([ID] ASC)
)

CREATE TABLE [dbo].[ADDRESSES]
(
    [UIN] [varchar](9) NULL,
    [STREET1] [varchar](100) NULL,
    [STREET2] [varchar](50) NULL,
    [CITY] [varchar](30) NULL,
    [STATE] [varchar](2) NULL,
    [ZIP] [varchar](10) NULL    
) ON [PRIMARY]

CREATE TYPE [dbo].[SearchFilter] AS TABLE
(
    [UIN] [varchar](40) NULL,
    [SNAME] [varchar](max) NULL,
    [StreetAddress] [varchar](max) NULL,
    [City] [varchar](max) NULL,
    [State] [varchar](50) NULL,
    [Zip] [varchar](20) NULL
)

-- Stored procedure logic
DECLARE @filterList AS [dbo].[SearchFilter]

DECLARE @NoOfRows INT, @counter INT = 0

SET @NoOfRows = (SELECT COUNT(1) FROM @filterList)

DECLARE @result TABLE (UIN varchar(40), 
                       NAME varchar(500), 
                       StreetAddress varchar(1000), 
                       Zipcode varchar(20),
                       State varchar(20),
                       City varchar(1000),
                       IsRecordFound varchar(50)
                      );

WHILE (@NoOfRows > @counter)
BEGIN
    DECLARE @SearchName VARCHAR(4000)

    SET @SearchName = (SELECT '"'+SNAME+'"' FROM @filterList ORDER BY SNAME OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY)  

    --Start: Process to Select Records
    ;WITH Filter_CTE AS
    (
        SELECT 
            SNAME, StreetAddress, City, State, ZipCode 
        FROM
            @filterList 
        ORDER BY 
            SNAME 
            OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY 
    )
    INSERT INTO @result (UIN, NAME, STREETADDRESS, CITY, STATE, ZIPCODE, PHONE, IsRecordFound)
        SELECT DISTINCT 
            en.UIN, ISNULL(en.SNAME, Filter_CTE.SNAME),
            Filter_CTE.StreetAddress, Filter_CTE.ZipCode,
            Filter_CTE.state, Filter_CTE.City,
            IIF(en.UIN IS NULL, 'Not Found', 'Found') AS IsRecordFound 
        FROM 
            dbo.Names en 
        INNER JOIN 
            dbo.ADDRESSES ea ON en.UIN = ea.UIN
        RIGHT JOIN 
            Filter_CTE ON ea.ZIP = Filter_CTE.Zip 
                       AND ea.STATE = Filter_CTE.State 
                       AND ea.CITY = Filter_CTE.City 
                       AND (ISNULL(ea.STREET1, '') + ' ' + ISNULL(ea.STREET2, '')) = Filter_CTE.StreetAddress
                       AND CONTAINS(en.SNAME,@SearchName)
            --END

    SET @counter += 1
END 

SELECT 
    UIN, NAME, STREETADDRESS, CITY, STATE, ZIPCODE, PHONE 
FROM 
    @result 

Solution

  • Currently it is not possible to use column names as search condition in CONTAINS or CONTAINSTABLE. So, you cannot do direct JOIN between data table and the SearchFilter table with FTS predicates applied.

    The current solution found in other questions/forums is to loop through the filters list and feed CONTAINS with search condition in a variable, just as you do. So, you won't get rid of this loop.

    However, looking at your query I see a number of other problems which may affect performance:

    1. DISTINCT clause in INSERT INTO @result ... SELECT DISTINCT .... It's on the level where you JOIN to tables with millions of records. Though I understand that final result may contain only a few thousands of rows, it's better to move DISTINCT to this line:

      SELECT DISTINCT
          UIN, NAME, STREETADDRESS, CITY, STATE, ZIPCODE, PHONE 
      FROM 
          @result 
      
    2. This condition AND (ISNULL(ea.STREET1, '') + ' ' + ISNULL(ea.STREET2, '')) = Filter_CTE.StreetAddress is certainly NOT SARGable. You use concatenation and function (ISNULL()) which prevents SQL Server from using existing indexes over dbo.ADDRESSES ea table. Check this question: What makes a SQL statement sargable? to see how to construct JOIN / WHERE conditions in such a way that will allow the use of indexes. In this particular case it's better to add a computed column to the dbo.Addresses table and then build an index over it (or add it to the existing index):

      CREATE TABLE [dbo].[ADDRESSES]
      (
          ...
          STREET as (ISNULL(ea.STREET1, '') + ' ' + ISNULL(ea.STREET2, '')),
          ...
      )
      

    So fix the above 1. and 2. then comment the AND CONTAINS(en.SNAME,@SearchName) condition in RIGHT JOIN and notice execution time. Afterwards, uncomment the CONTAINS condition and see how much delay was added. This way you will know for sure if it's FTS engine to blame for the delay or your main query itself needs improvements.

    To be able to advise more, we need to see the execution plans for your procedure. You can share your query execution plan using this page: https://www.brentozar.com/pastetheplan/ .

    HTH