regex string r postal-code

Regex extract characters depending on string length


I want to extract the outcode of a UK postcode. All the background is here: UK Postcode Regex (Comprehensive)

But I don't need validation, so the following rule should be enough:

  • first 2 characters for a postcode of length 5
  • first 3 characters for a postcode of length 6
  • first 4 characters for a postcode of length 7

All postcodes have already been converted to upper case, with spaces removed.

I cannot figure out how to specify a repetition count that depends on the string length. Any other approach that works is fine too.

Pseudo-code: ^[A-Z0-9]{length(postcode) - 3}

Added: I'm using R.


Solution

  • Knowing the language or environment (or rather, the regex flavor) you're using is always helpful in a regex question, but in most flavors this should do:

    ^([A-Z0-9]{2,})[A-Z0-9]{3}$
    

    So we match and capture 2 or more characters in group 1, and then require exactly 3 more until the end of the string. How you access the captures depends on your environment.

    If your regex flavor supports lookaheads, you can also do without the capture group:

    ^[A-Z0-9]{2,}(?=[A-Z0-9]{3}$)
    

    This ensures that the end of the match is followed by exactly three characters and then the end of the string, without including those three characters in the match.
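
Since the question mentions R, the two patterns above could be applied roughly as follows. This is only a minimal sketch, not part of the original answer: the sample postcodes and the variable names (`postcodes`, `outcode_capture`, `outcode_lookahead`) are made up for illustration, and the input is assumed to be upper case with spaces already removed, as the question states.

```r
# Illustrative input: upper case, no spaces (assumed, per the question).
postcodes <- c("M11AA", "B338TH", "SW1A1AA")

# Capture-group pattern: sub() replaces the whole match with group 1,
# i.e. everything except the last three characters.
outcode_capture <- sub("^([A-Z0-9]{2,})[A-Z0-9]{3}$", "\\1", postcodes)

# Lookahead pattern: needs perl = TRUE, because R's default regex
# engine does not support lookaheads.
m <- regexpr("^[A-Z0-9]{2,}(?=[A-Z0-9]{3}$)", postcodes, perl = TRUE)
outcode_lookahead <- regmatches(postcodes, m)

print(outcode_capture)
print(outcode_lookahead)
```

The two calls treat non-matching input differently: `sub()` returns a string unchanged when the pattern does not match, whereas `regmatches()` drops non-matching elements from its result. If the input may contain malformed postcodes, it is probably worth checking for that explicitly.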