regex, oracle-database, string-matching, regexp-like

Match at least 3 words in any order from some 5 words


I have a group of words:

"dog", "car", "house", "work", "cat"

I need to be able to match at least 3 of them in a text, for example:

"I always let my cat and dog at the animal nursery when I go to work by car"

Here the regex should match, because the text contains at least 3 of the words (4 in this case):

"cat", "dog", "car" and "work"

EDIT 1

I want to use it with Oracle's regexp_like function

EDIT 2

I also need it to work with consecutive words


Solution

  • Since Oracle's regexp_like supports neither non-capturing groups nor word boundaries, the following expression can be used:

    ^((.*? )?(dog|car|house|work|cat)( |$)){3}.*$
    


    Alternatively, a longer but arguably cleaner solution is the following, where each word must sit at the start of the text or directly after a space:

    ^(.*? )?(dog|car|house|work|cat) (.*? )?(dog|car|house|work|cat) (.*? )?(dog|car|house|work|cat)( .*)?$
    


    NOTE: Both expressions count repeated occurrences of the same word, so e.g. "dog dog dog" also matches.

    EDIT: To address the concerns over punctuation, a small modification can be made. It isn't perfect, but it handles the common case of punctuation directly after a word (e.g. "dog," or "car."). It still won't match a word that is directly preceded by punctuation, e.g. !dog:

    ^((.*? )?(dog|car|house|work|cat)([ ,.!?]|$)){3}.*$
    

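The behaviour described above can be checked outside the database. The following is a minimal sketch in Python, whose `re` module is used here only as a stand-in for Oracle's regex engine (an assumption: the two engines agree on these particular patterns, but this is not a test of `regexp_like` itself). The helper names `build_pattern` and `matches` are hypothetical and only serve to rebuild the first and the punctuation-tolerant expressions from a word list and a minimum count.

```python
import re

WORDS = ["dog", "car", "house", "work", "cat"]


def build_pattern(words, min_count, after=" "):
    """Rebuild the repeated-group expression from the answer.

    `after` is the regex fragment allowed directly after a word
    (end of string is always allowed as well).
    Assumes the words contain no regex metacharacters.
    """
    alternation = "|".join(words)
    return "^((.*? )?(%s)(%s|$)){%d}.*$" % (alternation, after, min_count)


# Space-delimited version and the punctuation-tolerant variant.
BASIC = build_pattern(WORDS, 3)
PUNCT = build_pattern(WORDS, 3, after="[ ,.!?]")


def matches(pattern, text):
    """Return True if the pattern matches the text."""
    return re.search(pattern, text) is not None


samples = [
    "I always let my cat and dog at the animal nursery when I go to work by car",
    "dog car house",                    # consecutive words
    "dog dog dog",                      # the same word repeated
    "my dog and my cat",                # only two of the words
    "I have a dog, a cat, and a car.",  # punctuation after each word
    "!dog, cat, car",                   # punctuation before a word
    "hotdog cart housework",            # words only as parts of longer words
]

for text in samples:
    print(matches(BASIC, text), matches(PUNCT, text), repr(text))
```

In Oracle, the generated string would be passed as the pattern argument of `regexp_like`; the table and column to apply it to are not given in the question, so they are left out here.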