Tags: php, regex, preg-split

How to match a comma outside nested parentheses with a regex


I want to split a string on commas, but not on commas inside parentheses, using preg_split. I came up with

preg_split('#,(?![^\(]*[\)])#',$str);

which works unless a comma comes before a nested opening parenthesis.

Works for

$str = "first (1,2),second (child (nested), child2), third";

Array
(
    [0] => first (1,2)
    [1] => second (child (nested), child2)
    [2] =>  third
)

but not for

$str = "first (1,2),second (child, (nested), child2), third";

Array
(
    [0] => first (1,2)
    [1] => second (child
    [2] =>  (nested), child2)
    [3] =>  third
)

Solution

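  • You can use a recursive pattern to match each balanced pair of parentheses, including any nested pairs. Then use (*SKIP)(*F) to discard those matches, so the only commas left to match, and to split on, are the ones outside parentheses.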

    (\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,
    


    Example

    $str = "first (1,2),second (child, (nested), child2), third";
    $pattern = "/(\((?:[^()]++|(?1))*\))(*SKIP)(*F)|,/";
    print_r(preg_split($pattern, $str));
    

    Output

    Array
    (
        [0] => first (1,2)
        [1] => second (child, (nested), child2)
        [2] =>  third
    )
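
As a rough sketch of how the pieces of that pattern fit together, the same regex can be written with the x (extended) flag so that each part carries a comment. The helper name split_top_level_commas, the trimming of each piece, and the error handling are assumptions added for illustration and are not part of the original answer; preg_last_error_msg() assumes PHP 8.0 or later.

```php
<?php
// Hypothetical helper (not from the original answer): split on commas
// that are outside balanced parentheses, then trim each piece.
function split_top_level_commas(string $str): array
{
    // Same pattern as in the answer, spread out with the x flag.
    // In x mode, whitespace in the pattern is ignored and # starts a comment.
    $pattern = '/
        (                 # group 1: one balanced (...) block
          \(              #   literal opening parenthesis
          (?:
              [^()]++     #   a run of non-parenthesis characters (possessive)
            | (?1)        #   or recurse into group 1 for a nested block
          )*
          \)              #   literal closing parenthesis
        )
        (*SKIP)(*F)       # throw the block away and do not retry inside it
        |
        ,                 # a comma that is still reachable is at the top level
    /x';

    $parts = preg_split($pattern, $str);
    if ($parts === false) {
        // preg_split returns false when the regex engine reports an error
        throw new RuntimeException(preg_last_error_msg());
    }

    // Assumption: callers want surrounding whitespace removed from each piece
    return array_map('trim', $parts);
}

print_r(split_top_level_commas("first (1,2),second (child, (nested), child2), third"));
```

With the trimming step, the last element comes back as "third" without the leading space shown in the output above. Note that the pattern only skips parentheses that are balanced: a comma that follows an unmatched opening parenthesis is still treated as a split point, so input that may be malformed probably needs a separate check.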