Javascript profanity match NOT replace


I am building a very basic profanity filter that I want to apply only to some fields in my application (fullName, userDescription), on the server side.

Does anyone have experience with a profanity filter in production? I only want it to match whole words:

'ass hello' <- match
'asster' <- NOT match

Below is my current code, but it returns true and false in succession for some reason.

var badWords = [ 'ass', 'whore', 'slut' ]
  , check = new RegExp(badWords.join('|'), 'gi');

function filterString(string) {
  return check.test(string);
}

filterString('ass'); // Returns true / false in succession.

How can I fix this "in succession" bug?


Solution

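  • Because the regex is created with the g (global) flag, the test method sets its lastIndex property to the position just after the current match, so that further invocations search for further occurrences from there (if there are any). When no match is found, lastIndex is reset to 0.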

    check.lastIndex // 0 (init)
    filterString('ass'); // true
    check.lastIndex // 3
    filterString('ass'); // false
    check.lastIndex // now 0 again
    

    So, you will need to reset it manually in your filterString function if you don't recreate the RegExp each time:

    function filterString(string) {
        check.lastIndex = 0;
        return check.test(string);
    }
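
    An alternative, if you only need a yes/no answer: leave out the g flag. Without g (or y), test ignores lastIndex and always searches from the start of the string, so there is nothing to reset. A minimal sketch; the names checkNoGlobal and containsBadWord are made up for this example:

    ```javascript
    // Same word list as in the question.
    var badWords = ['ass', 'whore', 'slut'];

    // No 'g' flag: test() ignores lastIndex and always starts at index 0.
    var checkNoGlobal = new RegExp(badWords.join('|'), 'i');

    // Hypothetical name, equivalent in purpose to filterString above.
    function containsBadWord(string) {
      return checkNoGlobal.test(string);
    }

    console.log(containsBadWord('ass'));  // true
    console.log(containsBadWord('ass'));  // true again, no alternation
    console.log(checkNoGlobal.lastIndex); // 0
    ```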
    

    Btw, to match only full words (like "ass", but not "asster"), you should wrap the alternation in word boundaries (\b), as WTK suggested. Note that the constructor is spelled RegExp, with a capital E:

    var check = new RegExp("\\b(?:"+badWords.join('|')+")\\b", 'gi');
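
    Putting the pieces together, here is a rough sketch of a whole-word check. It assumes you only need a boolean, so it drops the g flag instead of resetting lastIndex. The helper escapeRegExp and the names wholeWord and hasProfanity are hypothetical, not part of the original code; escaping only matters if your word list ever contains regex metacharacters. Keep in mind that \b is based on ASCII word characters (letters, digits, underscore), so it may not behave as you expect next to accented letters.

    ```javascript
    var badWords = ['ass', 'whore', 'slut'];

    // Hypothetical helper: escape regex metacharacters so that a list entry
    // is matched literally instead of being interpreted as a pattern.
    function escapeRegExp(word) {
      return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }

    // \b on both sides restricts matches to whole words; 'i' ignores case.
    var wholeWord = new RegExp(
      '\\b(?:' + badWords.map(escapeRegExp).join('|') + ')\\b',
      'i'
    );

    function hasProfanity(string) {
      return wholeWord.test(string);
    }

    console.log(hasProfanity('ass hello')); // true
    console.log(hasProfanity('asster'));    // false
    console.log(hasProfanity('ass hello')); // true (same result on repeat calls)
    ```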