
Regex to validate Unique Transaction Identifier


I'm trying to write a regex pattern to validate Unique Transaction Identifiers (UTIs).

The UTI consists of two concatenated parts, the prefix and the transaction identifier. Here is a summary of the rules I'm trying to take into account:

  • The prefix is exactly 10 alphanumeric characters.
  • The transaction identifier is 1-32 characters long.
  • The transaction identifier is alphanumeric, but the following special characters are also allowed: . : _ -
  • The special characters cannot appear at the beginning or end of the transaction identifier.
  • It is not allowed to have two special characters in a row.
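
As a reference point for testing any candidate regex, the five rules above can also be sketched as plain Python, one check per rule. The function name `is_valid_uti` is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters in either case plus digits, in line with the case-insensitive matching used in this question.

```python
import string

# Assumption: "alphanumeric" means ASCII letters (either case) and digits.
ALNUM = set(string.ascii_letters + string.digits)
SPECIAL = set(".:_-")


def is_valid_uti(uti: str) -> bool:
    """Check a candidate UTI against the five rules, one rule at a time."""
    prefix, txn_id = uti[:10], uti[10:]
    # Rule 1: the prefix is exactly 10 alphanumeric characters.
    if len(prefix) != 10 or not set(prefix) <= ALNUM:
        return False
    # Rule 2: the transaction identifier is 1-32 characters long.
    if not 1 <= len(txn_id) <= 32:
        return False
    # Rule 3: only alphanumerics and the characters . : _ - are allowed.
    if not set(txn_id) <= ALNUM | SPECIAL:
        return False
    # Rule 4: no special character at the beginning or end.
    if txn_id[0] in SPECIAL or txn_id[-1] in SPECIAL:
        return False
    # Rule 5: no two special characters in a row.
    return not any(
        a in SPECIAL and b in SPECIAL for a, b in zip(txn_id, txn_id[1:])
    )
```

A function like this is slower to write than a single pattern, but it gives an independent answer to compare a regex against on edge cases such as `"ABCDEFGHIJa.-b"`.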

So far I have constructed a pattern that covers the first four of these points (matched case-insensitively):

^[A-Z0-9]{11}((\w|[:\.-]){0,30}[A-Z0-9])?$

However, I'm struggling with the last point (no two special characters in a row). I readily admit to being a bit of a novice when it comes to regex, and I suspect there is some more advanced technique I'm not familiar with that would accomplish this. Any regex gurus out there care to enlighten me?


Solved: Thanks to user Bohemian for helping me find the pattern I was looking for. My final solution looks like this:

^[a-zA-Z0-9]{11}((?!.*[.:_-]{2})[a-zA-Z0-9.:_-]{0,30}[a-zA-Z0-9])?$
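
A quick way to exercise this pattern is sketched below in Python (the question does not say which regex engine is in use, so Python's `re` module is an assumption, as are the names `UTI_PATTERN` and `matches_uti`). `re.fullmatch` is used because in Python `$` alone also matches just before a trailing newline.

```python
import re

# The final pattern from the question, copied verbatim.
UTI_PATTERN = re.compile(
    r"^[a-zA-Z0-9]{11}((?!.*[.:_-]{2})[a-zA-Z0-9.:_-]{0,30}[a-zA-Z0-9])?$"
)


def matches_uti(candidate: str) -> bool:
    """Return True if the whole candidate string matches the pattern."""
    return UTI_PATTERN.fullmatch(candidate) is not None


# Shortest valid UTI: 10-character prefix plus a 1-character identifier.
assert matches_uti("ABCDEFGHIJ1")
# Single special characters between alphanumerics are accepted.
assert matches_uti("ABCDEFGHIJ1.2-3_4:5")
# Two special characters in a row are rejected by the lookahead.
assert not matches_uti("ABCDEFGHIJ1..2")
# A special character at the start or end of the identifier is rejected.
assert not matches_uti("ABCDEFGHIJ.1")
assert not matches_uti("ABCDEFGHIJ1.")
# Length limits: the identifier may be at most 32 characters.
assert matches_uti("ABCDEFGHIJ" + "a" * 32)
assert not matches_uti("ABCDEFGHIJ" + "a" * 33)
```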

I will leave the question open for follow-up answers in case anyone has any further suggestions for improvements.


Solution

  • Try this:

    ^[A-Z0-9]{11}(?!.*[.:_-]{2})[A-Z0-9.:_-]{0,30}[A-Z0-9]$
    

    The secret sauce is the negative lookahead (?!.*[.:_-]{2}), which asserts (without consuming input) that the text following it does not contain two consecutive "special" characters from the set .:_-.
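
To see the "without consuming input" part in isolation, here is a small sketch using Python's `re` module (an assumption, since the thread names no engine; `NO_DOUBLE_SPECIAL` and `ANSWER_PATTERN` are hypothetical names, and `re.IGNORECASE` stands in for the case-insensitive matching mentioned in the question). It also shows one observable property of the pattern exactly as written above: with no optional group, it needs at least 12 characters in total.

```python
import re

# The lookahead on its own: it either succeeds with a zero-width match
# or fails, but it never consumes any characters.
NO_DOUBLE_SPECIAL = re.compile(r"(?!.*[.:_-]{2})")

ok = NO_DOUBLE_SPECIAL.match("a.b-c")
assert ok is not None and ok.end() == 0  # matched, yet consumed nothing
assert NO_DOUBLE_SPECIAL.match("a.-b") is None  # ".-" is adjacent: rejected

# The full pattern from this answer, matched case-insensitively.
ANSWER_PATTERN = re.compile(
    r"^[A-Z0-9]{11}(?!.*[.:_-]{2})[A-Z0-9.:_-]{0,30}[A-Z0-9]$",
    re.IGNORECASE,
)

assert ANSWER_PATTERN.fullmatch("ABCDEFGHIJ1.2") is not None
assert ANSWER_PATTERN.fullmatch("ABCDEFGHIJ1.-2") is None
# The trailing [A-Z0-9] is mandatory here, so a 1-character transaction
# identifier (11 characters in total) does not match this version.
assert ANSWER_PATTERN.fullmatch("ABCDEFGHIJ1") is None
assert ANSWER_PATTERN.fullmatch("ABCDEFGHIJ12") is not None
```

Wrapping everything after the first 11 characters in an optional group, as in the question's final pattern, is what admits the 1-character identifier.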


    Note that your attempt, which uses \w, also allows lowercase letters and underscores, because \w is equivalent to [a-zA-Z0-9_] (in engines where \w is ASCII-only; Unicode-aware engines match more).
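
The behaviour of \w is engine-dependent, so it is worth checking in whatever engine is actually used. As one illustration, the sketch below uses Python's `re` module (an assumption; `ORIGINAL_ATTEMPT` is a hypothetical name for the pattern from the question). In Python 3, \w on `str` patterns is Unicode-aware by default and only reduces to [a-zA-Z0-9_] under `re.ASCII`.

```python
import re

# \w matches the underscore and letters of either case...
assert re.fullmatch(r"\w", "_")
assert re.fullmatch(r"\w", "a")
# ...but not the other special characters.
assert re.fullmatch(r"\w", "-") is None

# In Python 3, \w also matches non-ASCII letters unless re.ASCII is set.
assert re.fullmatch(r"\w", "é")
assert re.fullmatch(r"\w", "é", re.ASCII) is None

# The question's first attempt, matched case-insensitively as described.
ORIGINAL_ATTEMPT = re.compile(
    r"^[A-Z0-9]{11}((\w|[:\.-]){0,30}[A-Z0-9])?$", re.IGNORECASE
)

# It has no rule against adjacent special characters, so these are accepted.
assert ORIGINAL_ATTEMPT.fullmatch("ABCDEFGHIJ1.-2") is not None
assert ORIGINAL_ATTEMPT.fullmatch("ABCDEFGHIJ1__2") is not None
```

Spelling the class out as [A-Za-z0-9.:_-] avoids depending on how a given engine defines \w.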