I'm trying to write a regex pattern to validate Unique Transaction Identifiers (UTIs). See the linked description.
The UTI consists of two concatenated parts, the prefix and the transaction identifier. Here is a summary of the rules I'm trying to take into account:
Allowed special characters: . : _ -
I have so far constructed a pattern to validate the UTI for the first 4 of these points (matched with ignored casing):
^[A-Z0-9]{11}((\w|[:\.-]){0,30}[A-Z0-9])?$
However, I'm struggling with the last point (no two special characters in a row). I readily admit to being a bit of a novice when it comes to regex, and I was thinking there might be some more advanced technique that I'm not familiar with to accomplish this. Any regex gurus out there care to enlighten me?
Solved: Thanks to user Bohemian for helping me find the pattern I was looking for. My final solution looks like this:
^[a-zA-Z0-9]{11}((?!.*[.:_-]{2})[a-zA-Z0-9.:_-]{0,30}[a-zA-Z0-9])?$
I will leave the question open for follow-up answers in case anyone has any further suggestions for improvements.
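For anyone who wants to try the final pattern above, here is a minimal sketch in Python (my choice of language; the question does not name one). The helper name is_valid_uti, the PREFIX value, and the sample inputs are made up for illustration. The pattern is copied verbatim.

```python
import re

# Final pattern from the question, copied verbatim.
UTI_PATTERN = re.compile(
    r"^[a-zA-Z0-9]{11}((?!.*[.:_-]{2})[a-zA-Z0-9.:_-]{0,30}[a-zA-Z0-9])?$"
)


def is_valid_uti(candidate: str) -> bool:
    """Return True if candidate matches the final pattern (hypothetical helper)."""
    # fullmatch avoids Python's quirk where `$` also matches before a trailing "\n".
    return UTI_PATTERN.fullmatch(candidate) is not None


PREFIX = "ABCDEFGHIJK"  # made-up 11-character prefix, for illustration only

samples = [
    PREFIX,               # prefix only
    PREFIX + "abc-123",   # single special character between alphanumerics
    PREFIX + "abc--123",  # two special characters in a row
    PREFIX + "abc.:123",  # two different special characters in a row
    PREFIX + "abc-",      # ends with a special character
    PREFIX + "a" * 32,    # transaction identifier longer than 31 characters
]

for sample in samples:
    print(f"{sample!r}: {is_valid_uti(sample)}")
```

The first two samples are accepted and the remaining four are rejected. Note that the pattern as written still allows the transaction identifier to start with a special character, e.g. PREFIX + "-a".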
Try this:
^[A-Z0-9]{11}(?!.*[.:_-]{2})[A-Z0-9.:_-]{0,30}[A-Z0-9]$
The secret sauce is the negative lookahead (?!.*[.:_-]{2}), which asserts (without consuming input) that the following text does not contain 2 consecutive "special" chars .:_-.
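To see the lookahead working on its own, here is a small Python sketch (Python is my assumption, and the names NO_ADJACENT_SPECIALS and has_no_adjacent_specials are hypothetical). It anchors the lookahead at the start of the string and shows that a successful match consumes nothing.

```python
import re

# The negative lookahead from the pattern, anchored at the start of the string.
NO_ADJACENT_SPECIALS = re.compile(r"^(?!.*[.:_-]{2})")


def has_no_adjacent_specials(text: str) -> bool:
    """True if text has no two consecutive characters from the set . : _ -"""
    return NO_ADJACENT_SPECIALS.match(text) is not None


print(has_no_adjacent_specials("a-b_c"))  # specials are separated by letters
print(has_no_adjacent_specials("a-_b"))   # "-_" is two specials in a row

# The assertion is zero-width: the match ends where it started.
print(NO_ADJACENT_SPECIALS.match("a-b_c").end())  # 0
```

Because the lookahead consumes no input, the character classes that follow it in the full pattern still have to match the whole transaction identifier themselves.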
Note that your attempt, which uses \w, allows lowercase letters and underscores too, because \w is the same as [a-zA-Z0-9_] (in some regex flavors it also matches non-ASCII letters and digits).
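What \w accepts depends on the regex flavor, so it is worth checking in whatever engine you use. As a rough illustration in Python (an assumption on my part; the helper matches_word_char is hypothetical), \w only equals [a-zA-Z0-9_] when the re.ASCII flag is set:

```python
import re

# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ascii_word = re.compile(r"\w", re.ASCII)
# Without the flag, Python's \w also matches Unicode word characters.
unicode_word = re.compile(r"\w")


def matches_word_char(pattern: re.Pattern, ch: str) -> bool:
    """True if the single character ch is matched by the given \\w pattern."""
    return pattern.fullmatch(ch) is not None


# "\u00e9" is a non-ASCII letter (e with acute accent).
for ch in ["a", "Z", "7", "_", "-", "\u00e9"]:
    print(repr(ch), matches_word_char(ascii_word, ch), matches_word_char(unicode_word, ch))
```

Spelling the class out explicitly, as in [A-Z0-9.:_-], sidesteps the difference entirely.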