Replace BBCode with HTML and vice versa


I have a sentence containing BBCode tags, and I would like to convert them to HTML tags:

$sentence = '[html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div][/html]';


$htmlTags = '<$1>$2</$3>';
$bbTags = '/\[(.*)\](.*)\[\/(.*)\]/'; 


$new = preg_replace($bbTags, $htmlTags, $sentence);
echo $new;

The output is:

<html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div></html>

So only the outermost tag pair is converted, not every tag in the sentence.

I do not want to maintain an array of codes and their replacements.

PS: The sentence may differ from case to case.


Solution

  • You can use the following PHP code:

    <?php
    
    $sentence = '[html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div][/html]';
    
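    // Paired tags: [name attrs]...[/name], with same-name nesting handled via (?R).
    $rx = '~\[((\w+)\b[^]]*)\]((?>(?!\[\2\b).|(?R))*)\[\/\2]~s';
    $tmp = '';
    // Each pass converts one nesting level; repeat until nothing matches
    // or the string stops changing.
    while (preg_match($rx, $sentence) && $tmp !== $sentence) {
        $tmp = $sentence;
        $sentence = preg_replace($rx, '<$1>$3</$2>', $sentence);
    }
    // Whatever is still in square brackets has no closing tag: make it self-closing.
    $sentence = preg_replace('~\[([^]]*)]~', '<$1 />', $sentence);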
    echo $sentence;
    

    Output (the script prints a single line; line breaks and indentation are added here for readability):

    <html style="font-size: 18px;" dir="ltr">
    <div style="font-size: 18px;" dir="ltr">
      <p style="font-weight: bold;">Hello,</p>
      <p>You have got a new message from <a href="https://www.example.com/">Example.com</a><br /><br />.You could check your message on <a href="https://www.example.com/en/manager/inbox.html">Manager</a></p>
      <p><img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px" />
        <div style="color: #D4192D; font-weight: bold;">Example.com Team</div>
      </p>
    </div>
    </html>


    Details:

    • \[ - a [ char
    • ((\w+)\b[^]]*) - Group 1 ($1): one or more word chars (the tag name, captured into Group 2, $2), then a word boundary and zero or more chars other than ] (the attributes, if any)
    • ] - a ] char
    • ((?>(?!\[\2\b).|(?R))*) - Group 3 ($3): zero or more repetitions of either any char (including line breaks, because of the s flag) that does not start a [ + Group 2 value (as a whole word) sequence, or the whole pattern recursed
    • \[\/\2] - [/ string, Group 2 value, ] char.
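To see why the replacement runs in a loop, here is a minimal sketch on a smaller nested input. The sample string is made up for illustration and is not taken from the question. A single pass rewrites only the outermost pair, because the inner tags are part of Group 3 and are copied back unchanged.

```php
<?php
// Same paired-tag pattern as in the answer above.
$rx = '~\[((\w+)\b[^]]*)\]((?>(?!\[\2\b).|(?R))*)\[\/\2]~s';

// Hypothetical sample input, not taken from the question.
$input = '[div class="a"][p]Hi [b]there[/b][/p][/div]';

// One pass: only the outermost [div]...[/div] pair is rewritten.
$onePass = preg_replace($rx, '<$1>$3</$2>', $input);

// Repeating the replacement until nothing matches converts each level in turn.
$all = $input;
while (preg_match($rx, $all)) {
    $all = preg_replace($rx, '<$1>$3</$2>', $all);
}

echo $onePass, PHP_EOL; // <div class="a">[p]Hi [b]there[/b][/p]</div>
echo $all, PHP_EOL;     // <div class="a"><p>Hi <b>there</b></p></div>
```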

    This is the pattern that handles paired tags. The second pattern handles non-paired tags:

    • \[ - a [ char
    • ([^]]*) - Group 1 ($1): any zero or more chars other than ]
    • ] - a ] char.
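The two patterns above can be wrapped into a helper, and the reverse direction ("vice versa" in the title) can be sketched in the same spirit. The function names bbcodeToHtml and htmlToBbcode are hypothetical, and the HTML-to-BBCode pattern is an assumption that is not part of the answer above: it expects well-formed HTML with no < or > inside attribute values and no literal square brackets in the text.

```php
<?php
// Hypothetical helper: BBCode -> HTML, using the two patterns from the answer.
function bbcodeToHtml(string $bbcode): string
{
    $paired = '~\[((\w+)\b[^]]*)\]((?>(?!\[\2\b).|(?R))*)\[\/\2]~s';
    do {
        // $count receives the number of replacements made in this pass.
        $next = preg_replace($paired, '<$1>$3</$2>', $bbcode, -1, $count);
        if ($next === null) { // PCRE error, e.g. backtrack limit reached
            break;
        }
        $bbcode = $next;
    } while ($count > 0);

    // Remaining bracketed tags have no closing tag: emit them as self-closing.
    return preg_replace('~\[([^]]*)]~', '<$1 />', $bbcode) ?? $bbcode;
}

// Hypothetical helper: HTML -> BBCode (an assumption, not from the answer).
// Swaps angle brackets for square brackets and drops a trailing " /".
function htmlToBbcode(string $html): string
{
    return preg_replace('~<(/?[^<>]*?)\s*/?>~', '[$1]', $html) ?? $html;
}

$html = '<div class="a"><p>Hi <b>there</b><br /></p></div>';
$bb = htmlToBbcode($html);

echo $bb, PHP_EOL;               // [div class="a"][p]Hi [b]there[/b][br][/p][/div]
echo bbcodeToHtml($bb), PHP_EOL; // <div class="a"><p>Hi <b>there</b><br /></p></div>
```

The round trip only holds for input that meets the assumptions above. For arbitrary HTML, a real parser such as DOMDocument is likely a safer starting point than a regex.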