How to preg_split a string into characters without splitting tags like <b> and <br>


There are plenty of questions about preg_split here, but none of them covers my problem. I'm using the following code to split a string into individual characters in PHP:

$str = "My <b>table</b> in brown <br> Help";
$char = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($char);

Output is:

Array
(
    [0] => M
    [1] => y
    [2] =>  
    [3] => <
    [4] => b
    [5] => >
    [6] => t
    [7] => a
    [8] => b
    [9] => l
    [10] => e
    [11] => <
    [12] => /
    [13] => b
    [14] => >
    [15] =>  
    [16] => i
    [17] => n
    [18] =>  
    [19] => b
    [20] => r
    [21] => o
    [22] => w
    [23] => n
    [24] =>  
    [25] => <
    [26] => b
    [27] => r
    [28] => >
    [29] => ...
)

But I expect the following:

Array
(
    [0] => M
    [1] => y
    [2] =>  
    [3] => <b>
    [6] => t
    [7] => a
    [8] => b
    [9] => l
    [10] => e
    [11] => </b>
    [15] =>  
    [16] => i
    [17] => n
    [18] =>  
    [19] => b
    [20] => r
    [21] => o
    [22] => w
    [23] => n
    [24] =>  
    [25] => <br>
    [29] => ...
)

Tags such as <b>, </b>, <br>, <i>, </i>, etc. should not be split.

Thank you.


Solution

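  • You can do this by splitting on either a simple tag (letters between < and >, optionally preceded by /) or a single letter or space. The PREG_SPLIT_DELIM_CAPTURE option returns each captured delimiter as an array element, and PREG_SPLIT_NO_EMPTY drops the empty strings left between consecutive delimiters: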

    $str = "My <b>table</b> in brown <br> Help";
    $char = preg_split('#(</?[a-z]+>|[a-z ])#ui', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    print_r($char);
    

    Output:

    Array (
      [0] => M
      [1] => y
      [2] =>
      [3] => <b>
      [4] => t
      [5] => a
      [6] => b
      [7] => l
      [8] => e
      [9] => </b>
      [10] =>
      [11] => i
      [12] => n
      [13] => 
      [14] => b
      [15] => r
      [16] => o
      [17] => w
      [18] => n
      [19] =>
      [20] => <br>
      [21] =>
      [22] => H
      [23] => e
      [24] => l
      [25] => p
    )
    

    Demo on 3v4l.org
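
One caveat with the pattern above: only letters and spaces are matched as single-character delimiters, so a run of other characters (digits, punctuation, accented letters) would come back as one element rather than being split. A minimal sketch of a more general variant, assuming tags carry no attributes, uses preg_match_all with "a simple tag, else any one character". The helper name split_keep_tags is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): split a string into
// characters while keeping simple HTML tags such as <b>, </b>, <br> whole.
function split_keep_tags(string $str): array
{
    // Alternative 1: a simple tag without attributes, e.g. <b>, </b>, <br>, <h1>.
    // Alternative 2: any single character.
    // Modifiers: i = case-insensitive, u = treat subject as UTF-8 (multibyte
    // characters stay whole), s = let "." also match newlines.
    $pattern = '#</?[a-z][a-z0-9]*>|.#ius';

    // preg_match_all returns false on error (e.g. invalid UTF-8 input).
    if (preg_match_all($pattern, $str, $matches) === false) {
        return [];
    }
    return $matches[0] ?? [];
}

$chars = split_keep_tags("My <b>table</b> in brown <br> Help");
print_r($chars);
```

For the example string this should produce the same 26 elements as the preg_split version. Because the tag alternative is tried first, a lone "<" that does not start a simple tag falls through to the single-character alternative.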