Search code examples
sql-servert-sqlcase-sensitivestop-wordsfull-text-indexing

SQL Server oddity with full-text indexing and case sensitivity and numbers


This problem may be unique to our server, but I can't tell from the symptoms where the issue may lie.

I have a field (searchtitle) in a table (fsItems) that has a full-text index turned ON. For the record with primary key (IDItem) 704629 the content of this field is "TEST AFA 0 TEST".

Surprisingly, the following query returns no results:

SELECT * FROM fsItems WHERE CONTAINS(searchtitle,'AFA') AND IDItem = 704629

However, if I change the content to be "TEST afa 0 TEST" or "TEST AFA O TEST" (capital "O" instead of zero) the query returns the record. (It also returns the record if I change the content to "TEST AFB 0 TEST" and the CONTAINS argument to 'AFB'.)

At first I thought maybe AFA was some kind of stop word, but that wouldn't explain why changing zero to upper-case "O" returns the proper result.

Any idea what is going on here?

Thanks for any suggestions


Solution

  • Very interesting little quirk. It appears SQL Server is considering "AFA 0" as a single "word". My guess is this is an issue with the word breakers configured for standard English. It appears you can manually adjust them, but it doesn't look simple or intuitive. See Microsoft's how to documentation here

    Identifying Words in Full-text Index

    The below script lists every word in a full text index. If you run this against your table, you'll see in column display_term word "AFA 0". Side note: this script is also very useful in optimizing full text indexes by identifying "noisy" words to add to your stop list

    Select *
    From sys.dm_fts_index_keywords(Db_Id(),Object_Id('dbo.tbl_fulltext_test') /*Replace with your table name*/)
    Order By document_count Desc
    

    Full SQL Used to Identify the Issue

    CREATE TABLE tbl_fulltext_test
    (ID int constraint PK_fulltext_test primary key identity (1,1) 
    ,String Varchar(1000)
    )
    
    Create Fulltext Catalog ct_test
    With Accent_Sensitivity = Off
    
    Create Fulltext Stoplist sl_test
    From System Stoplist;
    
    Create Fulltext Index On tbl_fulltext_test(String)
    Key Index PK_fulltext_test On ct_test
    With Stoplist = sl_test,Change_Tracking Auto;
    
    
    INSERT INTO tbl_fulltext_test
    VALUES 
     ('TEST AFA 0 TEST') /*Zero*/
    ,('TEST afa 0 TEST') /*Zero*/
    ,('TEST AFB 0 TEST') /*AFB*/
    ,('TEST AFA O TEST') /*Letter O*/
    
    /*Returns rows 2 and 4*/
    SELECT *
    FROM tbl_fulltext_test
    WHERE CONTAINS (String,'AFA')
    
    /*Returns row 1*/
    SELECT *
    FROM tbl_fulltext_test
    WHERE CONTAINS (String,'"AFA 0"')