Search code examples
sqlsql-server-2005search-engine

SQL Keyword search algorithm: This SQL does a sequential search, how to do an indexed search?


We have three tables to hold our products and keywords:

Product {int ID, string name, ...}
ProductKeyword {int productID, int keywordID}
Keyword {int ID, string keyword}

This SQL code returns the most relevant products to the least relevant products having keywords like the users search criteria. searchWordTable is a table of search words. @keywordCount is a count of the search words. This returns all the products having one or more keywords, ordered by the quantity of keywords found for each product.

select productid, productname, count(*) * 1 / @keywordCount as percentRelevant
from (select keyword, productid, productname
      from   product
             join productkeyword on ... 
             join keyword on ...
             join searchWordTable on searchwordtable.keyword like 
                  '%' + keyword.keyword + '%') K -- like join aweful
group by productid, productname
order by percentRelevant desc -- Most relevant first

The problem is this is a sequential search comparing every keyword we have. It's not bad, but searches can take a minute with a million records.

How could I rewrite the query to not use like, hopefully use an indexed search, and get similar results? They use like to get partial matches, e.g., 'bone' in 'boneless'.

Links to better SQL algorithms would surely be appreciated.


Solution

  • Like is killing you, especially with that leading wildcard which completely removes any indexes the columns may have.

    You should look into using SQL Server's full-text indexing. contains will probably be much faster and is far more powerful for partial matches of the nature you appear to be doing.