regex number-formatting

Regex to remove commas from numbers under 10,000


I need a single regex that strips the comma from all numbers under 10,000 (e.g., 9,999 becomes 9999) but leaves 10,000, 1,000,000, etc. untouched.

This works fine for 9,999 and correctly ignores 10,000, but screws up 1,000,000 (it becomes 1000,000):

\b([0-9]),([0-9]{3})
$1$2
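
The behaviour described above can be reproduced with a short script. This is only a sketch in Python rather than .NET: the pattern is copied unchanged, the .NET replacement `$1$2` is written `\1\2`, and the helper name `first_attempt` is made up for illustration.

```python
import re

# First attempt from the question, ported to Python.
# .NET's "$1$2" replacement string is spelled r"\1\2" in Python.
FIRST_ATTEMPT = re.compile(r"\b([0-9]),([0-9]{3})")

def first_attempt(text: str) -> str:
    """Apply the first-attempt pattern (hypothetical helper name)."""
    return FIRST_ATTEMPT.sub(r"\1\2", text)

if __name__ == "__main__":
    for number in ("9,999", "10,000", "1,000,000"):
        print(number, "->", first_attempt(number))
    # 9,999 -> 9999
    # 10,000 -> 10,000
    # 1,000,000 -> 1000,000
```

Nothing in the pattern looks at what follows the three digits, so the leading "1,000" of 1,000,000 is matched as if it were a complete number.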

I can't simply rule out a comma after a 4-digit number like 9999, unfortunately. I tried another approach, but it misses 9,999:

\b(?<![.,])(?<d1>\d{2})(?<d2>\d{3})(?!,)\b
$1,$2

Any ideas? Thanks, Randy

UPDATE: Sorry, I forgot to mention this must work in .NET, so \K won't work...


Solution

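  • To match numbers under 10,000, match a single digit before the comma (instead of the 2 digits in your second pattern), followed by 1-3 digits after the comma so that input such as 1,9 also matches.

    To prevent a partial match inside a longer number such as 1,000,000, you could assert whitespace boundaries on both sides using (?<!\S) and (?!\S).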

    (?<!\S)(?<d1>\d),(?<d2>\d{1,3})(?!\S)
    

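
The pattern above can be checked with a small script. This is a sketch in Python, not .NET: Python's `re` module spells named groups `(?P<name>...)` where .NET uses `(?<name>...)`, and the function name `strip_small_commas` is invented for the example. In .NET the equivalent call would presumably be `Regex.Replace(input, pattern, "${d1}${d2}")` with the pattern exactly as written in the answer.

```python
import re

# Python port of the answer's pattern. Only the named-group syntax differs:
# (?P<d1>...) in Python versus (?<d1>...) in .NET.
PATTERN = re.compile(r"(?<!\S)(?P<d1>\d),(?P<d2>\d{1,3})(?!\S)")

def strip_small_commas(text: str) -> str:
    """Drop the comma from whitespace-delimited numbers shaped like d,ddd."""
    # \g<d1>\g<d2> joins the two captured digit runs without the comma.
    return PATTERN.sub(r"\g<d1>\g<d2>", text)

if __name__ == "__main__":
    print(strip_small_commas("9,999 10,000 1,000,000 1,9"))
    # 9999 10,000 1,000,000 19
    print(strip_small_commas("about 9,999."))
    # about 9,999.
```

The second call shows a consequence of the whitespace boundaries: a number followed directly by punctuation, such as a full stop or a comma, is left unchanged, so this fits input where numbers are separated by whitespace. If only well-formed thousands should match, `\d{1,3}` can be tightened to `\d{3}`.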