php regex preg-replace preg-match phone-number

Splitting a string into a phone number and extension using preg_match


I'm trying to split a string into a phone number and an extension, where the extension is only sometimes present. This is my attempt:

$tests[] = "941-751-6550 ext 2204";
$tests[] = "(941) 751-6550 ext 2204";
$tests[] = "(941)751-6550 ext 2204";
$tests[] = "9417516550 ext 2204";
$tests[] = "941-751-6550 e 2204";
$tests[] = "941-751-6550 ext 2204 ";
$tests[] = "941-751-6550 extension 2204";
$tests[] = "941-751-6550 x2204";
$tests[] = "(941) 751-6550";
$tests[] = "(941)7516550";
$tests[] = "941-751-6550 ";
$tests[] = "941-751-6550";

foreach ($tests as $test) {
    preg_match('#([\(\)\s0-9\-]+)(.+$)#',$test,$matches);
    $phone = preg_replace('#[\-\(\)\s]#','',$matches[1]);
    $extension = preg_replace('#[^0-9]#','',$matches[2]);
    if ($phone == '9417516550'
        && ($extension == '2204' || $extension == '0')) {
        echo "PASS: phone: $phone ext: $extension<br />";
    } else {
        echo "FAIL: phone: $phone ext: $extension<br />";
    }
}

However, when I run these tests to see if it properly splits the phone number and the extension, I get the following output:

PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
PASS: phone: 9417516550 ext: 2204
FAIL: phone: 941751655 ext: 0
FAIL: phone: 941751655 ext: 0
FAIL: phone: 9417516550 ext: 
FAIL: phone: 941751655 ext: 0

As you can see, it breaks when I exclude an extension altogether (the last four tests). How might I correct the preg_match() regex so that the FAIL: ... lines look like PASS: phone: 9417516550 ext: 0?


Solution

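  • (.+$) requires at least one character between the phone number and the end of the line. When nothing follows the phone number, the regex engine backtracks: the first group gives up its last character so that .+ has something to match. That is why the phone number loses its final digit and that digit ("0") shows up as the extension.

    I advise using (.*$) instead, which matches zero or more characters. Note that the extension is then an empty string when none is present, and '' == '0' is false in PHP, so default an empty extension to '0' (or compare against '') if the test should report ext: 0.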
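
    As a rough sketch of that change, the matching could be wrapped in a small helper. The function name splitPhone and the fallback of '0' for a missing extension are assumptions made for illustration, not part of the original code; the added ^ anchor is also an addition. It assumes PHP 7.1 or later for the type hints and array destructuring.

```php
<?php
// Hypothetical helper (not from the original post): returns
// [digits-only phone number, digits-only extension], using '0'
// when no extension is present.
function splitPhone(string $input): array
{
    // Group 1: digits, parentheses, whitespace and hyphens (the phone part).
    // Group 2: everything that follows, which may be empty because of .*
    if (!preg_match('#^([()\s0-9-]+)(.*)$#', $input, $matches)) {
        return ['', '0'];
    }

    // Keep only the digits from each captured part.
    $phone = preg_replace('#[^0-9]#', '', $matches[1]);
    $extension = preg_replace('#[^0-9]#', '', $matches[2]);

    // An absent extension yields '', so substitute the assumed default.
    if ($extension === '') {
        $extension = '0';
    }

    return [$phone, $extension];
}

foreach (['941-751-6550 ext 2204', '(941) 751-6550', '941-751-6550 '] as $test) {
    [$phone, $extension] = splitPhone($test);
    echo "phone: $phone ext: $extension\n";
}
```

    With the three sample strings above, the phone number should come out as 9417516550 each time, with the extension 2204 for the first and 0 for the other two.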