.net regex expresso

What's the regex for a restricted number of numeric characters?


Having trouble figuring out a regex issue.

We are looking for 2 numbers then hyphen or space then 6 numbers. Must be only 6 numbers, so either an alpha character or some punctuation or space must follow the 6 numbers or the 6 numbers must be at the end of the string.

Other numbers are allowed elsewhere in the string, as long as they are separate.

So, these should match:

foo 12-123456 bar  
12-123456 bar  
foo 12-123456  
foo12-123456bar  
12-123456bar  
foo12-123456  
12-123456bar 99
foo12-123456 99 

These should not match:

123-12345 bar  
foo 12-1234567  
123-12345bar  
foo12-1234567  

Here's what we were using:

\D\d{2}[-|/\ ]\d{6}\D

and in Expresso this was fine.

But when running for real in our .NET application, this pattern failed to match examples where the 6 numbers were at the end of the string.

Tried this:

\D\d{2}[-|/\ ]\d{6}[\D|$]

and it still doesn't match

foo 12-123456

Solution

  • I would restate your requirement from

    Must be only 6 numbers, so either an alpha character or some punctuation or space must follow the 6 numbers or the 6 numbers must be at the end of the string.

    to

    Must be only 6 numbers, so there must not be a number after the sixth number

    and then use a negative look-ahead assertion to express this. Similarly, at the start of the pattern use a negative look-behind assertion to say that whatever is before the first two digits, it isn't a digit. Together:

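    using System;
    using System.Text.RegularExpressions;

    // No digit before the two digits, then a hyphen or space,
    // then exactly six digits with no digit after them.
    var regex = new Regex(@"(?<!\d)\d{2}[- ]\d{6}(?!\d)");

    // The first six inputs should match; the last four should not.
    var testCases = new[]
    {
        "foo 12-123456 bar",
        "12-123456 bar",
        "foo 12-123456",
        "foo12-123456bar",
        "12-123456bar",
        "foo12-123456",
        "123-12345 bar",
        "foo 12-1234567",
        "123-12345bar",
        "foo12-1234567",
    };

    foreach (var testCase in testCases)
    {
        // Prints True or False, followed by the input that was tested.
        Console.WriteLine("{0} {1}", regex.IsMatch(testCase), testCase);
    }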
    

    This produces six Trues then four Falses, as required.
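
    The test list above leaves out the two examples from the question that have a separate number later in the string. A minimal sketch that checks them with the same pattern, and also pulls out the matched text with Regex.Match, might look like this. The variable names pattern and extraCases are illustrative, and the snippet assumes it runs as C# top-level statements or inside a Main method:

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // Same pattern as above: no digit before the two digits, no digit after the six.
    var pattern = new Regex(@"(?<!\d)\d{2}[- ]\d{6}(?!\d)");

    // The two remaining "should match" examples from the question.
    var extraCases = new[] { "12-123456bar 99", "foo12-123456 99" };

    foreach (var input in extraCases)
    {
        Match match = pattern.Match(input);
        // match.Value holds only the matched number; the lookarounds consume nothing.
        Console.WriteLine("{0} {1} -> {2}", match.Success, input, match.Value);
    }
    ```

    Both inputs should report True with 12-123456 as the matched value, since the trailing 99 is separated from the six digits by a non-digit.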

    The assertions (?<!\d) and (?!\d) respectively say 'there isn't a digit just before here' and 'there isn't a digit just after here'.
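
    To see why the patterns in the question fail at the ends of the string, it may help to compare them side by side. The sketch below is illustrative rather than definitive: the separator class is simplified to [- ], the variable names are made up for this example, and the alternation variant is an extra that does not appear in the question. It assumes C# top-level statements or the body of a Main method:

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // \D consumes a character, so one must actually exist on each side of the number.
    var consuming = new Regex(@"\D\d{2}[- ]\d{6}\D");

    // Inside [...] the characters | and $ are literals, so this class behaves like \D.
    var charClass = new Regex(@"\D\d{2}[- ]\d{6}[\D|$]");

    // Alternation outside a class does express "a non-digit or the start/end of the string".
    var alternation = new Regex(@"(?:^|\D)\d{2}[- ]\d{6}(?:\D|$)");

    // Lookarounds only assert; they consume nothing and succeed at either end.
    var lookaround = new Regex(@"(?<!\d)\d{2}[- ]\d{6}(?!\d)");

    foreach (var input in new[] { "foo 12-123456", "12-123456 bar", "foo 12-1234567" })
    {
        Console.WriteLine(
            "{0,-15} consuming={1} charClass={2} alternation={3} lookaround={4}",
            input,
            consuming.IsMatch(input),
            charClass.IsMatch(input),
            alternation.IsMatch(input),
            lookaround.IsMatch(input));
    }
    ```

    Only the alternation and lookaround patterns should match the first two inputs, and none of the four should match the third. The alternation form also includes the neighbouring non-digit characters in the matched text, which is one reason to prefer the lookarounds when the number itself needs to be extracted.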