Match digits (which may contain spaces) except when preceded by a specific word


I have this regular expression to look for numbers in a text that do not belong to a price (in euro):

(?<!EUR )(\d\s*)+

I want it to not match:

EUR 10 000

And I want it to match the numbers in the following cases:

{3}  
10 000  
347835

The problem I now face is that it matches the numbers I want, but it also matches the "0 000" part of "EUR 10 000", which I don't want it to match.
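The behaviour can be reproduced with a minimal sketch. Python's `re` module is assumed here only for illustration, since the question does not name a regex flavour. The lookbehind rejects only the position directly after "EUR ", so the engine retries one character later, where the lookbehind no longer applies:

```python
import re

# The pattern from the question: digits with optional whitespace,
# not immediately preceded by "EUR ".
pattern = re.compile(r"(?<!EUR )(\d\s*)+")

text = "EUR 10 000"
match = pattern.search(text)

# Starting at "1" fails the lookbehind; starting at the next "0" passes it,
# because the four characters before that "0" are "UR 1", not "EUR ".
print(repr(match.group(0)))  # '0 000'
```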

Edit:
I want to match all numbers (including spaces between the digits) unless they are preceded by "EUR ". To make it clearer what I want to match, here are all the cases from above, with the part I want to match given after each one:

EUR 10 000 (no match)
{3} (match: 3)
10 000 (match: 10 000)
347835 (match: 347835)

What my regular expression currently matches is:

EUR 10 000 (match: 0 000)
{3} (match: 3)
10 000 (match: 10 000)
347835 (match: 347835)


Solution

  • As you are already using a capture group, you can match what you don't want (the EUR amounts) and capture what you want to keep in group 1.

    \bEUR *\d+(?:[ \d]*\d)?\b|\b(\d+(?: +\d+)*)\b
    

    Explanation

    • \bEUR * Match a word boundary, EUR, and optional spaces
    • \d+(?:[ \d]*\d)?\b Match 1+ digits, optionally followed by spaces and digits that end on a digit, then a word boundary
    • | Or
    • \b A word boundary to prevent a partial word match
    • ( Capture group 1 (The value that you are interested in)
      • \d+(?: +\d+)* Match 1+ digits and optionally repeat 1+ spaces and 1+ digits
    • ) Close group 1
    • \b A word boundary
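As a hedged illustration of how the pattern might be used, here is a minimal Python sketch. The choice of Python and the helper name `find_non_price_numbers` are assumptions, not part of the original answer. The EUR alternative consumes the price without capturing anything, so only matches where group 1 is set are kept:

```python
import re

# First alternative: consume "EUR <amount>" without capturing it.
# Second alternative: capture a standalone number in group 1.
pattern = re.compile(r"\bEUR *\d+(?:[ \d]*\d)?\b|\b(\d+(?: +\d+)*)\b")


def find_non_price_numbers(text):
    """Return the group 1 captures, skipping matches from the EUR branch."""
    return [m.group(1) for m in pattern.finditer(text) if m.group(1) is not None]


sample = "EUR 10 000\n{3}\n10 000\n347835"
print(find_non_price_numbers(sample))  # ['3', '10 000', '347835']
```

Filtering on `m.group(1) is not None` matters because a match from the EUR branch is still reported by `finditer`, just with an empty group 1.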

    Note that you can also use \s instead of a space, but that can also match newlines.
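To see the difference, the following sketch compares the two variants. It assumes Python's `re` module and, for brevity, uses only the capture-group half of the pattern. With `\s`, numbers on consecutive lines are merged into one match:

```python
import re

# Only the capturing alternative, once with a literal space and once with \s.
with_space = re.compile(r"\b(\d+(?: +\d+)*)\b")
with_whitespace = re.compile(r"\b(\d+(?:\s+\d+)*)\b")

text = "10 000\n347835"

print(with_space.findall(text))       # ['10 000', '347835']
print(with_whitespace.findall(text))  # ['10 000\n347835']
```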