.net, regex, phone-number

Validate/extract US phone numbers from a mixed-character value with a .NET regular expression


I need a regular expression for .NET that can extract a phone number from a mixed-character value, as in the following examples:

yyy1-555-555-5555yyy1
yyy555-555-5555yyy1
yyy1(555)555-5555yyy1
yyy5555555555yyy1
yyy1-(555)-555-5555yyy1
yyy1(555)-555-5555yyy1
yyy(555)555-5555yyy1

The pattern ^\+?([0-9]+[ -]?){5,}[0-9]+$ is very basic and works fairly well, but it doesn't handle all the different ways a phone number can be presented, as shown above.

I am very new to regular expressions, and this may be a lot to ask, but I would appreciate the help if it's relatively easy for someone to do.


Solution

  • You can do it with the following regex:

    (?:1-?)?\(?\d{3}\)?[-.]?\s*\d{3}[-.]?\s*\d{4}
    

    Or, with look-arounds acting as digit "boundaries":

    (?<!\d)(?:1-?)?\(?\d{3}\)?[-.]?\s*\d{3}[-.]?\s*\d{4}(?!\d)
    

    The regex explanation:

    • (?:1-?)? - an optional (1 or 0) leading 1, itself optionally followed by a hyphen
    • \(?\d{3}\)? - a 3-digit sequence, optionally enclosed in (...); each parenthesis is optional on its own, so (555 and 555) also match
    • [-.]? - an optional separator (either - or .; add more characters to the class if necessary)
    • \s* - 0 or more whitespace characters (to exclude line breaks, use \p{Zs}* instead, which matches only space separators)
    • \d{3} - a 3-digit sequence
    • [-.]? - an optional separator
    • \s* - 0 or more whitespace characters
    • \d{4} - a 4-digit sequence.
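
The difference between \s and \p{Zs} mentioned in the list can be checked directly. The following is a minimal sketch, not part of the original answer: the class name WhitespaceVariants and the two test strings are made up for illustration, and the second pattern is simply the first with \s* swapped for \p{Zs}*.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only for this illustration.
public static class WhitespaceVariants
{
    // Pattern from the answer: \s* also matches tabs and line breaks.
    public const string AnyWhitespace =
        @"(?:1-?)?\(?\d{3}\)?[-.]?\s*\d{3}[-.]?\s*\d{4}";

    // Variant: \p{Zs}* matches only Unicode space separators (such as ' '),
    // so a line break between digit groups prevents a match.
    public const string SpacesOnly =
        @"(?:1-?)?\(?\d{3}\)?[-.]?\p{Zs}*\d{3}[-.]?\p{Zs}*\d{4}";

    public static void Main()
    {
        string sameLine = "555 555 5555";
        string splitLine = "555 555\n5555";

        Console.WriteLine(Regex.IsMatch(sameLine, AnyWhitespace));  // True
        Console.WriteLine(Regex.IsMatch(sameLine, SpacesOnly));     // True
        Console.WriteLine(Regex.IsMatch(splitLine, AnyWhitespace)); // True
        Console.WriteLine(Regex.IsMatch(splitLine, SpacesOnly));    // False
    }
}
```

Note that \p{Zs} does not include the tab character either, so a tab between groups would also block the second pattern.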

    The look-arounds, (?<!\d) and (?!\d), only allow a match when it is not immediately preceded or followed by another digit.
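
To show how the "boundaries" pattern might be used from C#, here is a minimal sketch that runs it over the sample values from the question with Regex.Matches. The class name PhoneExtractor, the method Extract, and the extra 13-digit test string are assumptions made for this illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern, for illustration only.
public static class PhoneExtractor
{
    // The "boundaries" pattern from the answer, unchanged.
    private static readonly Regex Phone = new Regex(
        @"(?<!\d)(?:1-?)?\(?\d{3}\)?[-.]?\s*\d{3}[-.]?\s*\d{4}(?!\d)");

    // Returns every phone-like substring found in the input, in order.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Phone.Matches(input))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string[] samples =
        {
            "yyy1-555-555-5555yyy1",
            "yyy555-555-5555yyy1",
            "yyy1(555)555-5555yyy1",
            "yyy5555555555yyy1",
            "yyy1-(555)-555-5555yyy1",
            "yyy1(555)-555-5555yyy1",
            "yyy(555)555-5555yyy1",
        };

        foreach (string s in samples)
        {
            string found = string.Join(", ", Extract(s));
            Console.WriteLine(s + " -> " + found);
        }

        // A 13-digit run yields no match: every 10- or 11-digit candidate
        // inside it is preceded or followed by another digit.
        Console.WriteLine(Extract("id 1234567890123").Count); // 0
    }
}
```

Each sample yields exactly one match, for example 1-555-555-5555 for the first and (555)555-5555 for the last; the trailing 1 after the letters is too short to match on its own. If you only need to validate rather than extract, Regex.IsMatch with the same pattern should be enough.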