php regex pseudocode

How to parse pseudocode similar to BBCode in PHP?


I am working with template files that contain lines like these:

[field name="main_div" type='smallblock' required="yes"]
[field type='bigblock' color="red" name="inner_div"]
[field name="btn" type='button' caption='Submit']

mixed with HTML lines.

It's pseudocode for generating HTML according to the attribute values.

I have a limited set of attributes, but I don't control their order in the string or whether all of them are present. Sometimes the "required" attribute is set and sometimes it is missing, for example.

What is the easiest and most convenient way to parse such strings, so I can work with the attributes as an associative array?

A regular expression? A finite state machine? Taking the substring from [ to ], then exploding by space and by equals sign?

I'm looking for advice or a simple piece of code that works with the provided example.


Solution

  • Regular expressions. While you could write a parser for schemes like this, it's overkill and provides no resiliency against garbled tokens.

    The trick is to use two regular expressions, one for finding the [field] tokens and a second to split out the attributes.

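    // $source is a placeholder name for the template text; the variable name was missing here.
    // The attribute list is wrapped in one outer capture group, because a repeated
    // group would keep only its last iteration in $match[2].
    $output = preg_replace_callback('/\[(\w+)((?:\s+\w+=\pP[^"\']*\pP)*)\]/', "block", $source);
    
    function block($match) {
    
        $field = $match[1];   // tag name, e.g. "field"
    
        // Split the attribute string into names ($attr[1]) and values ($attr[2]).
        preg_match_all('/(\w+)=\pP([^"\']*)\pP/', $match[2], $attr);
        $attr = array_combine($attr[1], $attr[2]);
    
        // ... build $html from $field and $attr
        return $html;
    }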
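
    To see the two-regex idea end to end, here is a minimal sketch run against lines from the question. It is only one way to do it: the names parse_field_attributes() and render_field(), the generated <div> markup, and the "block" default type are all hypothetical, and the patterns are stricter than the ones above (a backreference requires matching quotes instead of \pP). It assumes attribute values never contain "]".

```php
<?php
// Hypothetical helper: turn the inside of a [field ...] tag into an associative array.
function parse_field_attributes(string $tag): array
{
    // name="value" or name='value'; \2 forces the closing quote to match the opening one.
    preg_match_all('/(\w+)=(["\'])(.*?)\2/', $tag, $m);
    return array_combine($m[1], $m[3]);
}

// Hypothetical renderer: the markup is illustrative only.
function render_field(array $attr): string
{
    // Attributes may be absent, so every lookup has a fallback.
    $name = htmlspecialchars($attr['name'] ?? '');
    $type = htmlspecialchars($attr['type'] ?? 'block');
    $required = ($attr['required'] ?? 'no') === 'yes' ? ' data-required="1"' : '';
    return "<div id=\"$name\" class=\"$type\"$required></div>";
}

$template = <<<'TPL'
<div>
[field name="main_div" type='smallblock' required="yes"]
[field name="btn" type='button' caption='Submit']
</div>
TPL;

// First regex finds each [field ...] tag; the callback applies the second regex.
$output = preg_replace_callback(
    '/\[field\b([^\]]*)\]/',
    function (array $match): string {
        return render_field(parse_field_attributes($match[1]));
    },
    $template
);

echo $output, "\n";
```

    Because the attributes end up in an ordinary array keyed by name, their order in the tag does not matter, and a missing attribute such as "required" is handled by the ?? fallback.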