
Match Single Word Not Containing dot


Taking a string something like this:

Line 1: Test  TableA
Line 2:    TableA  AWord
Line 3: TableA AWord
Line 4: This.TableA
Line 5: This. TableA Aword

I want to match where these criteria are met:

  1. The word TableA is found
  2. There is no dot anywhere on the same line where TableA is found
  3. There may be any number of spaces or other characters in front of the word TableA
  4. There may be characters after the word TableA

So in the scenario above:

  • Lines 1, 2 & 3 should all match, but ONLY on the word TableA
  • Lines 4 & 5 should NOT match

I'm having some real trouble getting this to work though.

-

This matches on every line except #3 - and matches from the start of the line to the end of TableA

^([^\.].*)(?:TableA)

-

This matches Line 1,2,3 & 5 and for 1 & 2 it matches from the start of the line to the end of TableA

(?!\.).(\s)*(TableA)(?=\s|$)

-

This matches lines 1, 2 and 3 (the closest I've gotten to the right answer) but matches from the start of the line to the end of TableA

^(?!.*\.).*(TableA)

The thread "Regex: Match word not containing" contained a solution that does much the same thing as what I've managed to produce, but again, it matches every character in front of the specific word found.

This is in PowerShell, so I believe PCRE is effectively what it's using(?)
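
As a side note on the third attempt: the full match runs from the start of the line, but the capturing group already holds only the word. Here is a minimal sketch, using Python's `re` module as a stand-in for the PowerShell engine and assuming the "Line N:" prefixes in the sample are labels rather than part of the data:

```python
import re

# Sample data from the question (assumption: "Line N:" labels are not part of the text).
text = "\n".join([
    "Test  TableA",
    "   TableA  AWord",
    "TableA AWord",
    "This.TableA",
    "This. TableA Aword",
])

# Third attempt from the question; MULTILINE makes ^ match at every line start.
attempt = re.compile(r"^(?!.*\.).*(TableA)", re.MULTILINE)

for m in attempt.finditer(text):
    # group(0) is the whole match (line start up to the end of TableA);
    # group(1) is only the captured word, and span(1) gives its position.
    print(repr(m.group(0)), "->", repr(m.group(1)), m.span(1))
```

In .NET, and therefore PowerShell, the corresponding value should be available through the match's `Groups[1]`.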


Solution

  • You could exclude newlines as well, because a negated character class that matches anything but a dot, [^.], will also match a newline.

    To match the word TableA on its own, you could use the lookarounds (?<!\S) and (?!\S) to assert that there are no non-whitespace chars directly around it, which prevents matching inside $TableA$

    The value is in the first capturing group.

    ^[^\r\n.]*(?<!\S)(TableA)(?!\S)[^\r\n.]*$
    

    In parts

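    • ^ Start of string (or of each line when the multiline flag is enabled)
    • [^\r\n.]* Match 0+ times any char except a . or a newline
    • (?<!\S)(TableA)(?!\S) Capture TableA in group 1 when it is not surrounded by non-whitespace chars
    • [^\r\n.]* Match 0+ times any char except a . or a newline
    • $ End of string (or of each line when the multiline flag is enabled)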
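
The breakdown above can be checked against the sample from the question. This is a minimal sketch in Python's `re` (a stand-in, not the PowerShell engine), assuming the "Line N:" prefixes are labels and that the multiline flag is on so that ^ and $ work per line:

```python
import re

# Sample data from the question (assumption: "Line N:" labels are not part of the text).
text = "\n".join([
    "Test  TableA",
    "   TableA  AWord",
    "TableA AWord",
    "This.TableA",
    "This. TableA Aword",
])

pattern = re.compile(r"^[^\r\n.]*(?<!\S)(TableA)(?!\S)[^\r\n.]*$", re.MULTILINE)

# For every matching line, record (1-based line number, column of TableA in that line).
hits = [
    (text.count("\n", 0, m.start()) + 1, m.start(1) - m.start())
    for m in pattern.finditer(text)
]
print(hits)

# The whitespace lookarounds reject a word glued to other characters.
print(pattern.search("x $TableA$ y"))
```

Only the first three lines produce a hit, and group 1 pinpoints the word even though the overall match spans the whole line.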


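    If you are using PCRE, you could make use of \K and a positive lookahead so that only TableA is part of the match. Note that PowerShell uses the .NET regex engine, which does not support \K.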

    ^[^\r\n.]*\K(?<!\S)TableA(?!\S)(?=[^\r\n.]*$)
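
Where \K is not available, a similar effect can be approximated by capturing the dot-free prefix and writing it back during a replacement. This is a sketch in Python's `re`, which has no \K; the replacement name TableB is purely hypothetical:

```python
import re

# Sample data from the question (assumption: "Line N:" labels are not part of the text).
text = "\n".join([
    "Test  TableA",
    "   TableA  AWord",
    "TableA AWord",
    "This.TableA",
    "This. TableA Aword",
])

# Group 1 holds the prefix that \K would have dropped from the match;
# the trailing lookahead checks the rest of the line without consuming it.
pattern = re.compile(
    r"^([^\r\n.]*)(?<!\S)TableA(?!\S)(?=[^\r\n.]*$)",
    re.MULTILINE,
)

# Re-emit the prefix, then the hypothetical new name, so only the word changes.
renamed = pattern.sub(r"\g<1>TableB", text)
print(renamed)
```

Lines containing a dot are left untouched, and on the other lines everything except the word itself is preserved.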
    
