regex, regex-group, regex-negation

Regex pattern to capture numbers only, on a single line, but exclude date pattern


Here is my current regex:

\b\d(?:[^\n\d]*\d){14}(?!\d)

Explanation:

  • \b: Matches a word boundary, so that the first matched digit starts a new word
  • [^\n\d]*: Matches 0 or more of any char that is not a digit and not a line break
  • (?:[^\n\d]*\d){14}: Matches 14 more digits, each optionally preceded by non-digit, non-line-break chars, for 15 digits in total
  • (?!\d): Negative lookahead that fails the match if another digit follows immediately
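
As a quick check of the explanation above, here is a minimal sketch using Python's re module. Python is an assumption on my part (the demo may use a different regex flavor), and the names FIFTEEN_DIGITS and find_candidates are made up for illustration.

```python
import re

# The pattern from the question, unchanged. Using Python's re module is an
# assumption; the original demo may run under a different regex flavor.
FIFTEEN_DIGITS = re.compile(r"\b\d(?:[^\n\d]*\d){14}(?!\d)")


def find_candidates(text):
    """Return every substring of text that the 15-digit pattern matches."""
    return [m.group(0) for m in FIFTEEN_DIGITS.finditer(text)]


if __name__ == "__main__":
    # 15 digits on one line, separated by spaces
    print(find_candidates("id 1234 5678 9012 345 end"))
    # 16 digits in a row: the trailing (?!\d) rejects the run
    print(find_candidates("1234567890123456"))
    # 15 digits split across two lines: [^\n\d]* cannot cross the line break
    print(find_candidates("1234567\n89012345"))
```

Only the first call should return a match (the whole "1234 5678 9012 345" run); the other two return empty lists.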

The problem with this pattern is that it also matches various date patterns that should be excluded, as shown in this RegEx Demo. The problem matches are highlighted in red:

[Image: problem matches highlighted in red]

I need to modify the current regex so that it will not match if it includes a date pattern. Here is the regex that I came up with to identify the various combinations of date patterns:

(?:[0-9]{1,4}[\-\/\\\.][0-9]{1,2}[\-\/\\\.][0-9]{1,4}){1}

This seems to match the required date patterns as shown here:

[Image: date patterns matched by the regex]
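
For reference, the date regex can be tried on its own with a small sketch like the one below. It again assumes Python's re module, and DATE_LIKE and find_dates are hypothetical names used only for this illustration.

```python
import re

# The date pattern from the question, unchanged (the trailing {1} is kept
# as written, although it has no effect). Python's re module is an assumption.
DATE_LIKE = re.compile(
    r"(?:[0-9]{1,4}[\-\/\\\.][0-9]{1,2}[\-\/\\\.][0-9]{1,4}){1}"
)


def find_dates(text):
    """Return every date-like substring that the pattern finds in text."""
    return [m.group(0) for m in DATE_LIKE.finditer(text)]


if __name__ == "__main__":
    # Year-first and day-first dates
    print(find_dates("2021-01-15 and 15/01/2021"))
    # Mixed separators: the two character classes are independent
    print(find_dates("2021-01/15"))
    # A space is not one of the listed separators
    print(find_dates("1234 5678"))
```

Because the two separator classes are independent of each other, this version also accepts a mixed form such as 2021-01/15.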

What I still need to figure out is how to exclude the date matches of the second pattern from the 15-digit matches of the first pattern.

Any ideas?


Solution

  • You may try this regex:

    \b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S))(?:[^\n\d]*\d){14}\b
    

    RegEx Demo

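    Here (?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S)) is a negative lookahead, applied right after the first digit, that keeps a match from starting with a date. It fails the match if the text there reads as up to 3 more digits, a separator (-, /, \ or .), 1 to 2 digits, the same separator again (\1), and 1 to 4 digits, ending at a word boundary that is followed by whitespace or the end of the input.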
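
The suggested regex can be exercised with a short sketch such as the following. It assumes Python's re module rather than whatever flavor the demo uses, and SOLUTION and find_numbers are illustrative names, not part of the answer.

```python
import re

# The regex from the answer, unchanged. Python's re module is an assumption.
SOLUTION = re.compile(
    r"\b\d(?!(?:\d{0,3}([-\/\\.])\d{1,2}\1\d{1,4})\b(?!\S))(?:[^\n\d]*\d){14}\b"
)


def find_numbers(text):
    """Return the full text of every match.

    group(0) is used because the pattern contains a capture group for the
    separator, so re.findall would return that group instead of the match.
    """
    return [m.group(0) for m in SOLUTION.finditer(text)]


if __name__ == "__main__":
    # Plain 15-digit run with spaces
    print(find_numbers("1234 5678 9012 345"))
    # Candidate that starts with a date followed by a space
    print(find_numbers("2021-01-15 1234567"))
    # Date first, then a separate 15-digit number
    print(find_numbers("2021-01-15 123456789012345"))
    # Date at the end of the candidate rather than the start
    print(find_numbers("1234567 2021-01-15"))
```

In this sketch the first call returns the whole run, the second returns nothing, and the third returns only the 15-digit number. The last call still returns the whole string, because the lookahead is evaluated only right after the first digit; likewise, the trailing (?!\S) means a leading date is rejected only when whitespace or the end of the input follows it.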