I want to extract a postal code from a blob of text. My postal code is six digits long and can be written as 560011 or 560 011. I have used the regex /[0-9]{3}[ ]?[0-9]{3}/, but this also captures the first 6 digits of my phone number. I tried adding [^0-9] after the 6th digit, but this captures the next character too. How can I capture only the postal code, ignoring any number longer than 6 digits?
I think the solution is to add word boundaries, like this:
/\b[0-9]{3} ?[0-9]{3}\b/
or
/\b\d{3} ?\d{3}\b/
if your regex flavor supports the digit character class.
The word boundary, \b, matches only at a position where the characters on either side belong to different classes: one is a word character and the other is not (the start or end of the text counts as a non-word character). The word character class includes digits, so adding \b before and after your regex makes it match only if the number is not directly preceded or followed by another word character, such as a further digit.
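To see the difference in practice, here is a minimal sketch in Python. The language is my assumption, since the question doesn't name one, and the sample text and phone number are made up for illustration.

```python
import re

# Hypothetical sample text: one phone number and both postal code spellings.
text = "Call 9876543210, or write to Bangalore 560011 (also written 560 011)."

# Without boundaries: also grabs the first six digits of the phone number.
loose = re.compile(r"[0-9]{3} ?[0-9]{3}")

# With word boundaries on both sides, as suggested above.
strict = re.compile(r"\b[0-9]{3} ?[0-9]{3}\b")

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)   # ['987654', '560011', '560 011']
print(strict_matches)  # ['560011', '560 011']
```

Note that \b treats letters and underscores as word characters too, so a code glued directly to a letter (for example PIN560011) would not match with this pattern.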
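Also, a character class containing just one character (the [ ]) is redundant here. It's the same as just having the character itself in the regex. The one exception is a free-spacing (extended) mode, where a bare space in the pattern is ignored and [ ] is a way to match a literal one.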
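If you want to convince yourself that the two spellings behave the same, a quick check along these lines should do. Again, Python is my assumption and the sample strings are invented.

```python
import re

# The same pattern, with and without the single-character class.
with_class = re.compile(r"\b\d{3}[ ]?\d{3}\b")
without_class = re.compile(r"\b\d{3} ?\d{3}\b")

# Made-up inputs: two valid codes, one 7-digit number, one misplaced space.
samples = ["560011", "560 011", "5600110", "56 0011"]

# Both patterns should agree on every sample.
agree = all(
    bool(with_class.fullmatch(s)) == bool(without_class.fullmatch(s))
    for s in samples
)
results = {s: bool(without_class.fullmatch(s)) for s in samples}

print(agree)
print(results)
```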