Search code examples
sqlt-sqlsql-server-2008scriptingsql-scripts

SQL to filter by multiple criteria including containment in string list


so i have a table lets say call it "tbl.items" and there is a column "title" in "tbl.items" i want to loop through each row and for each "title" in "tbl.items" i want to do following: the column has the datatype nvarchar(max) and contains a string...

  1. filter the string to remove words like in,out, where etc (stopwords)
  2. compare the rest of the string to a predefined list and if there is a match perform some action which involves inserting data in other tables as well..

the problem is im ignotent when it comes to writing T-sql scripts, plz help and guide me how can i achieve this? whether it can be achieved by writing a sql script?? or i have to develope a console application in c# or anyother language??

im using mssql server 2008

thanks in advance


Solution

  • You want a few things. First, look up SQL Server's syntax for functions, and write something like this:

    -- Warning! Code written off the top of my head,
    -- don't expect this to work w/copy-n-paste
    create function removeStrings(@input nvarchar(4000))
    as begin
        -- We're being kind of simple-minded and using strings
        -- instead of regular expressions, so we are assuming a
        -- a space before and after each word.  This makes this work better:
        @input = ' ' + @input
    
        -- Big list of replaces
        @input = replace(' in ','',@input)
        @input = replace(' out ','',@input)
        --- more replaces...
    end
    

    Then you need your list of matches in a table, call this "predefined" with a column "matchString".

    Then you can retrieve the matching rows with:

    select p.matchString
      from items i
      join predefined p 
        on removeStrings(i.title) = p.matchString
    

    Once you have those individual pieces working, I suggest a new question on what particular process you may be doing with them.

    Warning: Not knowing how many rows you have or how often you have to do this (every time a user saves something? Once/day?), this will not exactly be zippy, if you know what I mean. So once you have these building blocks in hand, there may also be a follow-up question for how and when to do it.