php regex pcre removing-whitespace currency-formatting

PCRE Regex matching spaces in formatted currency


For a project I need to substitute spaces with an &nbsp; if - and only if - they occur inside a predefined currency format.

For example:

EUR 1.2
EUR 1.23
EUR 12
EUR 123
EUR 12 Mio.
EUR 12 345 Mio.
GBP 1 123 456 789 Mio. <---- this one is a problem: only the first, second-to-last and last spaces are matched, not those in between
USD 12 million
EUR 1.23 billion

So basically [CurrencyPrefix][space][amount[with_spaces]][Suffix]

This is what I've come up with so far:

(?:EUR|USD|GBP)(\ )(?:(?:(?:\d+(\ ))+\d+)|\d+\.\d+|\d+)+(?:(\ )(?:Mio\.|million|billion))?

See: https://regex101.com/r/z73ISR/5

Problem is: it only matches the space 3 times. I need to match it [n] times (see the GBP example).
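
The behaviour described above can be reproduced with a short PHP sketch. It assumes the pattern is wrapped in `/` delimiters and run through `preg_match` with `PREG_OFFSET_CAPTURE`. A capture group inside a quantifier keeps only its last iteration, so the pattern's three groups can report at most three spaces no matter how many digit groups the amount has.

```php
<?php
// The pattern from the question, wrapped in "/" delimiters (my choice).
$pattern = '/(?:EUR|USD|GBP)(\ )(?:(?:(?:\d+(\ ))+\d+)|\d+\.\d+|\d+)+(?:(\ )(?:Mio\.|million|billion))?/';

preg_match($pattern, 'GBP 1 123 456 789 Mio.', $m, PREG_OFFSET_CAPTURE);

// Each entry of $m is [matched text, byte offset]; drop the full match
// at index 0 and keep only the offsets of the three captured spaces.
$offsets = array_column(array_slice($m, 1), 1);

print_r($offsets); // offsets 3, 13 and 17
```

The spaces at offsets 5 and 9 are consumed by earlier iterations of the repeated group and then overwritten, which is why they never show up as captures.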


Solution

  • To match every space, from the one after the currency abbreviation through those between and after the digit groups, you will need to work with the \G anchor:

    (?:EUR|USD|GBP|\G(?!^)\d+(?:\.\d+)?)\K +
    

    This is the explanation:

    • (?: Start of non-capturing group
      • EUR|USD|GBP Match one of the currency abbreviations
      • | Or
      • \G(?!^) Continue from where the previous match ended; (?!^) stops \G from also matching at the very start of the subject
      • \d+(?:\.\d+)? Match a sequence of digits followed by an optional fractional part
    • ) End of non-capturing group
    • \K + Discard everything matched so far from the reported match, then match one or more spaces
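
A minimal sketch of how the pattern might be applied in PHP follows. The helper name `nbspCurrency` and the `/` delimiters are assumptions for illustration, not part of the answer. Because `\K` drops the currency abbreviation or digits from the reported match, `preg_replace` replaces only the spaces.

```php
<?php
// Hypothetical helper: replaces the spaces inside a currency expression
// with &nbsp;, using the \G-based pattern from the answer.
function nbspCurrency(string $text): string
{
    $pattern = '/(?:EUR|USD|GBP|\G(?!^)\d+(?:\.\d+)?)\K +/';
    return preg_replace($pattern, '&nbsp;', $text);
}

echo nbspCurrency('GBP 1 123 456 789 Mio.'), PHP_EOL;
// GBP&nbsp;1&nbsp;123&nbsp;456&nbsp;789&nbsp;Mio.

echo nbspCurrency('EUR 1.23 billion'), PHP_EOL;
// EUR&nbsp;1.23&nbsp;billion

echo nbspCurrency('Price: 12 345 apples'), PHP_EOL;
// Price: 12 345 apples   (no currency prefix, so nothing is replaced)
```

Note that the pattern does not check what follows the last number, so the space after the amount is replaced whether or not a suffix such as Mio. comes next; a run of several spaces is collapsed into a single &nbsp;.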