regex pcre regex-negation

Match any string that deviates by at least one character from a given string


I came across an issue where I needed to search for a number in a set of numbers that all start the same but end differently, like:

261234
261235
261236
261334
261244
261134 
260234
260134
260123

I may add that this was in a log file with some other gibberish which may complicate the issue. Say I want to match anything but 261134.

My first instinct was to try 26[^1][^1][^3][^4], but that matches none of the above because each and every one of them has one of the negated characters in a position I specified a negation for. The next closest thing to a solution I can think of is a mile-long expression that chains ORs together like so:

26([^1]134|1[^1]34|11[^3]4|113[^4])

This, however, does not match everything yet: it only catches numbers that differ from 261134 in exactly one position. Instead I should do:

26([^1]\d{3}|\d[^1]\d\d|\d\d[^3]\d|\d{3}[^4])
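To make the difference between the three attempts concrete, here is a minimal sketch that runs each pattern against the sample numbers. It uses Python's re module rather than PCRE, on the assumption that these particular constructs behave the same in both; the helper name matched and the dictionary labels are made up for illustration.

```python
import re

# Sample numbers from the question.
numbers = ["261234", "261235", "261236", "261334", "261244",
           "261134", "260234", "260134", "260123"]

# The three patterns tried above (labels are illustrative only).
patterns = {
    "per-position negation": r"26[^1][^1][^3][^4]",
    "exactly one position differs": r"26([^1]134|1[^1]34|11[^3]4|113[^4])",
    "at least one position differs": r"26([^1]\d{3}|\d[^1]\d\d|\d\d[^3]\d|\d{3}[^4])",
}

def matched(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

results = {name: matched(p, numbers) for name, p in patterns.items()}
for name, hits in results.items():
    print(f"{name}: {hits}")
```

The first pattern matches nothing, the second only the numbers that differ from 261134 in a single position, and the third everything except 261134.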

I think I answered my own question by thinking about it more thoroughly while typing this, but I am still curious whether there is a better solution, as this was really unwieldy to figure out, let alone type, for such a simple problem. I could find neither a question about this nor a solution, so I hope it is appropriate to leave this here to help others with a similar issue.


Solution

  • A negated character class like [^1] still has to match one character, and it matches any character other than 1, including non-digits.

    As a result, the pattern 26([^1]\d{3}|\d[^1]\d\d|\d\d[^3]\d|\d{3}[^4]) will also match values like 26$333 or 261)33


    What you could do is match 26 and use a negative lookahead (?!1134) to assert that what is directly to the right is not 1134. If that assertion succeeds, match 4 digits.

    To prevent the digits from being part of a larger word, you could use word boundaries \b:

    \b26(?!1134)\d{4}\b
    

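As a rough comparison of the two approaches, the sketch below searches one log line with both the alternation pattern from the question and the lookahead pattern. It uses Python's re module instead of PCRE, assuming the lookahead and \b behave the same for this case. The log line is invented purely for illustration and does not come from the original log file.

```python
import re

# Alternation from the question vs. the lookahead-based pattern.
alternation = re.compile(r"26([^1]\d{3}|\d[^1]\d\d|\d\d[^3]\d|\d{3}[^4])")
lookahead = re.compile(r"\b26(?!1134)\d{4}\b")

# Hypothetical log line, made up for this example.
log_line = "id=261134 ok; id=260234 retry; ref 26$333; serial 9261234567"

# The alternation also accepts non-digits and digits inside longer numbers.
alt_hits = [m.group(0) for m in alternation.finditer(log_line)]

# The lookahead pattern has no capturing group, so findall returns whole matches.
look_hits = lookahead.findall(log_line)

print(alt_hits)
print(look_hits)
```

Here the alternation picks up 26$333 and the 261234 buried inside the longer serial number, while the lookahead version only returns the standalone six-digit number 260234.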