Tags: php, regex, shortcode, regular-language, bbcode

Shortcodes/BBcode regular expression


I have an issue with writing the correct regular expression.

I'm using shortcodes in my system and they work well. I've sorted out their attributes etc., but now I want to use a shortcode inside another shortcode.

Here's how I'm preparing the regular expression:

$attributes_regexp = "([^\]]*?)";
$inner_content_regexp = "(.*?)";
$flags_regexp = "im";
$regexp = "/\[$shortcode$attributes_regexp\]$inner_content_regexp\[\/$shortcode\]/$flags_regexp";
preg_match_all($regexp, $content, $found_occurrences);

Here's what an example of the finished regular expression looks like:

\[file([^\]]*?)\](.*?)\[\/file\]

And here's the HTML that has to be analysed:

<div class="row">
<div class="col-md-8">
<h2>Test page</h2>
<p>&nbsp;</p>
<p><strong>Some</strong> content</p>
<p>Lorem ipsum dolor.&nbsp;</p>
<p>Dolor sit amet.</p>
<p>[file id=290 type=link][file id=283 type=image width=100 height=100][/file][/file]</p>
</div>
<div class="col-md-3 offset-md-1">
<p>[file id=289 type=image][/file]</p>
</div>
</div>

The problem is that only the last shortcode is matched correctly and turned into an image. The earlier, nested one is captured as

[file id=290 type=link][file id=283 type=image width=100 height=100][/file]

Instead of two separate ones

[file id=283 type=image width=100 height=100][/file]

and

[file id=290 type=link][/file]

Any ideas how this can be sorted?

Many thanks, Tomasz


Solution

  • If the data only breaks the XML standard by using [ and ] as tag delimiters instead of < and > (bare attribute values such as id=290 would also need quotes), you could turn the data into XML and use an XML parser for further analysis:

    $regex = "/(\[{$shortcode}.+\[\/{$shortcode}\])/";
    if (preg_match_all($regex, $content, $matches)) {
        // $matches[0] holds the full matches; $matches[1] holds the text captured by the group
        foreach ($matches[1] as $match) {
            // XML requires quoted attribute values, so turn id=290 into id="290"
            $xml = preg_replace('/(\w+)=([^\s\]"]+)/', '$1="$2"', $match);
            // Swap the bracket delimiters for angle brackets to get valid XML
            $xml = str_replace(['[', ']'], ['<', '>'], $xml);
            // Some XML parsing like (nested shortcodes become child elements):
            $xmlObject = new SimpleXMLElement($xml);
            //...
        }
    }
    

    This way you do not have to reinvent the wheel.
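To illustrate the XML idea end to end, here is a minimal sketch applied to the nested line from the question. The helper names `shortcodeToXml` and `collectShortcodes` are made up for this example and are not part of any library. It assumes attribute values contain no spaces or brackets, and that the text between the shortcodes contains no `<`, `>` or `&` characters, since those would make the converted string invalid XML.

```php
<?php
// Hypothetical helper: convert one bracket-style shortcode string into XML.
function shortcodeToXml(string $shortcodeText): SimpleXMLElement
{
    // Quote bare attribute values (id=290 becomes id="290"); XML requires quotes.
    $quoted = preg_replace('/(\w+)=([^\s\]"]+)/', '$1="$2"', $shortcodeText);
    // Swap the bracket delimiters for angle brackets.
    $xml = str_replace(['[', ']'], ['<', '>'], $quoted);
    // Throws an Exception if the result is not well-formed XML.
    return new SimpleXMLElement($xml);
}

// Hypothetical helper: flatten the element tree, outermost shortcode first.
function collectShortcodes(SimpleXMLElement $node, int $depth = 0): array
{
    $attributes = [];
    foreach ($node->attributes() as $name => $value) {
        $attributes[$name] = (string) $value;
    }
    $found = [[
        'tag' => $node->getName(),
        'depth' => $depth,
        'attributes' => $attributes,
    ]];
    // Nested shortcodes show up as child elements, so recurse into them.
    foreach ($node->children() as $child) {
        $found = array_merge($found, collectShortcodes($child, $depth + 1));
    }
    return $found;
}

$text = '[file id=290 type=link][file id=283 type=image width=100 height=100][/file][/file]';
$shortcodes = collectShortcodes(shortcodeToXml($text));

foreach ($shortcodes as $entry) {
    echo str_repeat('  ', $entry['depth']), $entry['tag'], ' ',
        json_encode($entry['attributes']), PHP_EOL;
}
```

For the sample line this lists the outer link shortcode (id 290) at depth 0 and the nested image shortcode (id 283) at depth 1, so each can be rendered separately, innermost first if needed.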