Tags: sql-server, t-sql, whitespace, sql-like

SQL - Including whitespace in a LIKE query to filter content that contains swear words


I have a table of swear words in SQL Server, and I use a LIKE query to search texts for the words in that table. I need a way to match whitespace around the swear word in the LIKE query, like this:

... LIKE '%{whitespace}SWEAR-WORD{whitespace}%';

Putting a space on each side of the swear word is not enough, because it can also be part of a normal word in my language (like 'inter', which is part of 'international' and 'pointer').

Another solution I've tried was using this:

... LIKE '%[^a-zA-Z]SWEAR-WORD[^a-zA-Z]%';

But that did not work for me.

Is there any way to do this? Alternatively, is there a solution other than a LIKE query?

Edit: For better understanding, this is how we currently find swear words:

We have a table named Reviles, which has 2 columns (Id and Text) and contains restricted words and phrases. We use this query to find out whether a piece of content contains any of those restricted words or phrases:

IF EXISTS (SELECT * FROM dbo.Reviles WHERE @Text LIKE '%' + dbo.Reviles.Text + '%')
  SET @IsHidden = 0

Note that this check is done before the content is inserted into its table. The code above is part of a stored procedure that receives the details of a post and checks various things, including swear words, before inserting it.

Previously, we stored restricted words padded with spaces, like ' swear-word ', in the table. That way, however, we could not find and hide content with a swear word at the beginning or end of a line, or content that consists of only a swear word. For example: "This is my content with a swear-word", "Swear-word in my content", or just "Swear-word". So we decided to remove those spaces and store restricted words like 'swear-word'. But this causes some normal content to be hidden, because some swear words can be part of another, normal word (if we assume 'inter' is a bad word, then 'pointer', 'international', etc. will be restricted).

Sorry for my bad English; I hope this description makes the problem clear.


Solution

  • Try enclosing the text you are checking in some delimiter characters, and then compare:

    Some sample data:

    declare @T table(stmt nvarchar(20))
    insert into @T values ('inter'),('Inter.'),('My inter'),
    ('intermediate!'),('pointer '),('Good inter'),('inter inter inter')
    

    try this:

    select
        stmt as stmt,
        case
            when '.'+stmt+'.' like '%[^a-z]inter[^a-z]%' then 1 else 0 end as [has inter]
    from
        @T
    

    results:

    stmt                 has inter
    -------------------- -----------
    inter                1
    Inter.               1
    My inter             1
    intermediate!        0
    pointer              0
    Good inter           1
    inter inter inter    1
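
The same padding idea can probably be applied to the table-driven check from the question. The following is only a sketch under some assumptions: the table variable @Reviles is a hypothetical stand-in for dbo.Reviles (Id, Text), the sample value of @Text is made up, and the column uses a case-insensitive collation so that [^a-z] also excludes uppercase letters.

```sql
-- Hypothetical stand-in for dbo.Reviles (Id, Text) from the question.
DECLARE @Reviles table (Id int, [Text] nvarchar(50));
INSERT INTO @Reviles VALUES (1, N'inter');

-- Made-up sample content; in the stored procedure this would be the @Text parameter.
DECLARE @Text nvarchar(200) = N'Inter is the first word here';

SELECT CASE WHEN EXISTS (
           SELECT 1
           FROM @Reviles AS r
           -- Pad the content with '.' so that a word at the very start or end
           -- still has a non-letter character on each side to match against.
           WHERE N'.' + @Text + N'.' LIKE N'%[^a-z]' + r.[Text] + N'[^a-z]%'
       ) THEN 1 ELSE 0 END AS HasRestrictedWord;
```

Because the stored words become part of the LIKE pattern, any entry that contains %, _ or [ would be interpreted as a wildcard, so such entries would need escaping first.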