regex ruby regex-lookarounds regex-group regex-greedy

RegEx for matching specific phone numbers


I'm trying to check whether a string matches my country's phone number format: an area code (two digits that may or may not be preceded by a 0, and that may also be enclosed in parentheses) followed by 8 or 9 digits, with an optional hyphen before the last 4 digits. These are some valid formats:


'00 00000000'
'000-000000000'
'000 00000-0000'
'00 0000-0000'
'(00) 0000-0000'
'(000) 000000000'

So far this is the working expression I have:


p = /0?\d{2}\s?-?\s?\d{4,5}\s?-?\s?\d{4}/

I tried to use a conditional to check whether the area code is inside parentheses, with /?(\() 0?\d{2}\)|0?\d{2} \s?-?\s?\d{4,5}\s?-?\s?\d{4}/, but got this error: (repl):1: target of repeat operator is not specified: /?(\() 0?\d{2}\)|0?\d{2} \s?-?\s?\d{4,5}\s?-?\s?\d{4}

What am I doing wrong here?


Solution

  • I believe you can use the following regular expression.

    R = /
        \A            # match beginning of string
        (?:           # begin a non-capture group
          \(0?\d{2}\) # match '(' then an optional `0` then two digits then ')'
        |             # or
          0?\d{2}     # match an optional `0` then two digits
        )             # end the non-capture group
        (?:           # begin a non-capture group
          [ ]+        # match one or more spaces
        |             # or
          -           # match a hyphen
        )             # end the non-capture group
        \d{4,5}       # match 4 or 5 digits
        -?            # optionally match a hyphen
        \d{4}         # match 4 digits
        \z            # match end of string
        /x            # free-spacing regex definition mode
    

    arr = [
      '00 00000000',
      '000-000000000',
      '000 00000-0000',
      '00 0000-0000',
      '(00) 0000-0000',
      '(000) 000000000',
      '(000 000000000',
      '(0000) 000000000'
    ]
    
    arr.map { |s| s.match? R }
      #=> [true, true, true, true, true, true, false, false]
    

    The regex is conventionally written as follows.

    R = /\A(?:\(0?\d{2}\)|0?\d{2})(?: +|-)\d{4,5}-?\d{4}\z/
    
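    On the error reported in the question: a pattern that begins with `?` cannot compile, because `?` is a quantifier and there is nothing before it to repeat. Ruby (2.0 and later) writes a conditional as `(?(1)yes|no)`, which tests whether capture group 1 took part in the match. Below is a minimal sketch of that approach. The constant name `COND` and the sample strings are illustrative assumptions, not part of the regex R above.

    ```ruby
    # Sketch only: COND is a hypothetical name used for illustration.
    COND = /
      \A
      (\()?         # group 1: an optional opening parenthesis
      0?\d{2}       # an optional 0, then two digits
      (?(1)\))      # conditional: require ')' only if group 1 matched
      (?:[ ]+|-)    # one or more spaces, or a hyphen
      \d{4,5}       # 4 or 5 digits
      -?            # an optional hyphen
      \d{4}         # 4 digits
      \z
    /x

    cond_samples = [
      '(00) 0000-0000',  # balanced parentheses
      '00 00000000',     # no parentheses
      '(000 000000000',  # opening parenthesis never closed
      '00) 0000-0000'    # closing parenthesis never opened
    ]

    cond_results = cond_samples.map { |s| s.match?(COND) }
    #=> [true, true, false, false]
    ```

    The alternation used in R expresses the same rule without a conditional, which is arguably easier to read.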

    R should be changed as follows if neither the area code (ignoring its optional leading 0) nor the number that follows it may begin with a zero, so that, for example, '001-123456789' and '(12)-023456789' are invalid.

    R = /\A(?:\(0?[1-9]\d\)|0?[1-9]\d)(?: +|-)[1-9]\d{3,4}-?\d{4}\z/
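
The stricter pattern can be checked in the same way as the first one. This is a small sketch with every character class written as `[1-9]`; the constant name `STRICT` and the sample strings are assumptions chosen for illustration.

```ruby
# Sketch only: STRICT is a hypothetical name for the stricter pattern.
STRICT = /\A(?:\(0?[1-9]\d\)|0?[1-9]\d)(?: +|-)[1-9]\d{3,4}-?\d{4}\z/

strict_samples = [
  '012-123456789',   # optional leading 0, then a non-zero digit
  '(12) 1234-5678',  # parenthesized area code
  '12 12345678',     # 8-digit number
  '001-123456789',   # area code starts with 0 after the optional 0
  '(12)-023456789'   # number starts with 0
]

strict_results = strict_samples.map { |s| s.match?(STRICT) }
#=> [true, true, true, false, false]
```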