
Zip code validation to exclude some formats


I am trying to find a "correct" US ZIP code validation but can't. Every example I find suggests some version of 5 digits, optionally followed by a dash and 4 digits. In its simplest form:

(^\d{5}$)|(^\d{5}-\d{4}$)

This and all of its variations accept the invalid ZIP codes 00000 and 00000-0000. A US ZIP code can start with two zeros (00501, Holtsville, New York) but not with three or more.

I need to modify the above expression to exclude 000dd and 000dd-dddd. The last 4 digits can be all zeros; that is not important.

So, 00000, 0000d, 000dd should fail.


Solution

  • First, you can factor your existing regular expression so that the first 5 digits are stated once, and the last 4 digits are stated as an optional group, like this:

    ^\d{5}(-\d{4})?$
    

    One way to exclude anything beginning with three zeros is to insert a negative lookahead assertion:

    ^(?!000)\d{5}(-\d{4})?$
    

    Another way, which is purer in the computer science sense (it uses only plain regular-expression constructs, with no lookahead) but more verbose, is to enumerate the allowed cases by the number of leading zeros: none, one, or two:

    ^([1-9]\d{4}|0[1-9]\d{3}|00[1-9]\d{2})(-\d{4})?$
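
As a quick sanity check, the two patterns above can be run against the cases from the question. This is only a minimal sketch in Python. The names `LOOKAHEAD`, `ENUMERATED`, and `is_valid_zip` are made up for illustration, and the `re.ASCII` flag is an added assumption so that `\d` matches only `0`-`9`.

```python
import re

# Lookahead version: refuse any input whose first three characters are "000".
# re.ASCII is an assumption added here so that \d means 0-9 only.
LOOKAHEAD = re.compile(r"^(?!000)\d{5}(-\d{4})?$", re.ASCII)

# Enumerated version: zero, one, or two leading zeros, then a non-zero digit.
ENUMERATED = re.compile(
    r"^([1-9]\d{4}|0[1-9]\d{3}|00[1-9]\d{2})(-\d{4})?$", re.ASCII
)


def is_valid_zip(text, pattern=LOOKAHEAD):
    """Return True if text is a 5-digit or ZIP+4 code not starting with 000."""
    # fullmatch also rejects input with a trailing newline, which a bare
    # "$" would otherwise tolerate in Python.
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["00501", "12345-0000", "00000", "00012", "00000-0000"]
    for sample in samples:
        print(sample, is_valid_zip(sample, LOOKAHEAD), is_valid_zip(sample, ENUMERATED))
```

Both patterns should agree on every input. The lookahead form is shorter and easier to extend, while the enumerated form also works in engines that do not support lookahead.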