Split a string into numbers and text, but allow text with a single digit inside


Let's say I want to split this string into two variables:

$string = "levis 501";

I will use

preg_match('/\d+/', $string, $num);
preg_match('/\D+/', $string, $text);

But then let's say I want to split this one into two parts

$string = "levis 5° 501";

as $text = "levis 5°"; and $num = "501";

So my guess is that I should add a rule to preg_match('/\d+/', $string, $num); so that it looks for numbers only at the END of the string, and I want that number to be 2 to 3 digits long. But the $text match now also has a digit inside...
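
To make the problem concrete, this is what the two calls above return for the new string (a quick check that uses only the code already shown):

```php
<?php
$string = "levis 5° 501";

// First run of digits anywhere in the string: picks up the "5", not "501".
preg_match('/\d+/', $string, $num);
// First run of non-digits: stops right before that "5".
preg_match('/\D+/', $string, $text);

var_dump($num[0]);  // string(1) "5"
var_dump($text[0]); // string(6) "levis "
```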

How would you do it?


Solution

  • To split a string into two parts, use either of the following:

    preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
    

    This regex matches:

    • ^ - the start of the string
    • (.*?) - Group 1 capturing zero or more characters, as few as possible (as *? is a "lazy" quantifier) up to...
    • \s* - zero or more whitespace characters
    • (\d+) - Group 2 capturing 1 or more digits
    • \D* - zero or more non-digit characters (\D is the opposite shorthand character class to \d)
    • $ - end of string.

    The s modifier (placed after the closing ~ delimiter) is the DOTALL flag: it makes . match any character, including a newline, which it does not match by default.
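
    The question also asks for the trailing number to be 2 to 3 digits long, which the pattern above does not enforce. One possible adjustment, offered as a sketch rather than as part of the original answer, swaps \d+ for \d{2,3} and adds a (?<!\d) lookbehind so that a longer number is rejected instead of being cut in half. The function name splitTrailingNumber is hypothetical.

    ```php
    <?php
    // Hypothetical helper (not from the original answer): returns the text
    // and a trailing 2-3 digit number, or null when the string does not fit.
    function splitTrailingNumber(string $s): ?array
    {
        // (?<!\d) makes sure the captured digits are not the tail of a longer number.
        $pattern = '~^(.*?)\s*(?<!\d)(\d{2,3})\D*$~s';
        if (preg_match($pattern, $s, $m) !== 1) {
            return null;
        }
        return ['text' => $m[1], 'num' => $m[2]];
    }

    print_r(splitTrailingNumber("levis 5° 501"));
    ```

    With this variant, "levis 5° 1501" and "levis 7" both return null, because the final number is too long or too short.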

    Or

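    preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2);
    

    The limit of 2 matters: the lookahead also succeeds at zero-width positions inside the trailing number, so an unlimited split would break the digits apart. This \s*(?=\s*\d+\D*$) pattern: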

    • \s* - zero or more whitespace characters, but only if followed by...
    • (?=\s*\d+\D*$) - zero or more whitespace characters, followed by 1+ digits, followed by 0+ non-digit characters, followed by the end of the string.

    The (?=...) construct is a positive lookahead: it does not consume any characters, it only checks whether the pattern inside matches at the current position. If it does, the overall match can proceed; if not, no match occurs at that position.
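
    Because the lookahead consumes nothing, the digits it inspects stay in the second piece of the split. A small sketch of that behaviour, also comparing the limited and unlimited calls (the variable names $parts and $unlimited are just for illustration):

    ```php
    <?php
    $s = "levis 5° 501";
    $pattern = '~\s*(?=\s*\d+\D*$)~';

    // Limit 2: split once at the whitespace before the trailing number.
    // Only that whitespace is consumed; "501" is left intact in the second piece.
    $parts = preg_split($pattern, $s, 2);

    // No limit: the lookahead keeps succeeding at zero-width positions inside
    // "501", so the result has more than two pieces.
    $unlimited = preg_split($pattern, $s);

    print_r($parts);
    echo count($unlimited), PHP_EOL;
    ```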

    See IDEONE demo:

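    $s = "levis 5° 501";
    // Option 1: capture the text (group 1) and the trailing number (group 2)
    preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
    echo $matches[1] . ": " . $matches[2] . PHP_EOL;    // levis 5°: 501
    // Option 2: split once at the whitespace before the trailing number
    print_r(preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2));  // [0] => levis 5°, [1] => 501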