php parsing eval php-5.3

Securing a snippet by dropping eval() from input-file parsing


I have a template-like system that can load bulk templates (more than one template entry per file) and store them accordingly. The problem is that the current approach uses preg_replace() and eval(), which is very error-prone. For example, a single misplaced character can break the regular-expression match and produce a parse error:

Parse error: syntax error, unexpected '<' in tsys.php: eval()'d code

The code that does the loading is the following:

// Escaping
$this->_buffer = str_replace( array('\\', '\'', "\n"), array('\\\\', '\\\'', ''), $this->_buffer);

// Regular-expression chunk up the input string to evaluative code
$this->_buffer = preg_replace('#<!--- BEGIN (.*?) -->(.*?)<!--- END (.*?) -->#', "\n" . '$this->_tstack[\'\\1\'] = \'\\2\';', $this->_buffer);

// Run the previously created PHP code
eval($this->_buffer);

An example bulk template file looks like this:

<!--- BEGIN foo -->
<p>Some HTML code</p>
<!--- END foo -->

<!--- BEGIN bar -->
<h1>Some other HTML code</h1>
<!--- END bar -->

When the code is run on this input, $this->_tstack is given two elements:

array (
  'foo' => "<p>Some HTML code</p>",
  'bar' => "<h1>Some other HTML code</h1>",
);

This is the expected behavior, but I am looking for a way to drop the need for eval().


Solution

  • Well, here goes. Given that $template contains:

    <!--- BEGIN foo -->
        <p>Some HTML code</p>
    <!--- END foo -->
    
    <!--- BEGIN bar -->
        <h1>Some other HTML code</h1>
    <!--- END bar -->
    

    Then:

    $values = array();
    $pattern = '#<!--- BEGIN (?P<key>\S+) -->(?P<value>.+?)<!--- END (?P=key) -->#si';
    if ( preg_match_all($pattern, $template, $matches, PREG_SET_ORDER) ) {
        foreach ($matches as $match) {
            $values[$match['key']] = trim($match['value']);
        }
    }
    var_dump($values);
    

    Results in:

    array(2) {
      ["foo"]=>
      string(21) "<p>Some HTML code</p>"
      ["bar"]=>
      string(29) "<h1>Some other HTML code</h1>"
    }
    

    If whitespace preservation is important, remove the trim() call.
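
For reuse, the same approach could be wrapped in a small function. The following is only a sketch: the function name parse_bulk_templates() and its $trim parameter are made up for illustration and are not part of the original code. It uses the same pattern as above and makes the trimming optional.

```php
<?php
/**
 * Hypothetical helper (name and signature are assumptions):
 * splits a bulk template string into an array of name => body pairs
 * without ever executing the template text.
 */
function parse_bulk_templates($template, $trim = true)
{
    // Same pattern as above: the END marker must repeat the BEGIN name.
    $pattern = '#<!--- BEGIN (?P<key>\S+) -->(?P<value>.+?)<!--- END (?P=key) -->#si';
    $values  = array();

    if (preg_match_all($pattern, $template, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // Keep surrounding whitespace only when the caller asks for it.
            $values[$match['key']] = $trim ? trim($match['value']) : $match['value'];
        }
    }

    return $values;
}

$template = "<!--- BEGIN foo -->\n<p>Some HTML code</p>\n<!--- END foo -->\n\n"
          . "<!--- BEGIN bar -->\n<h1>Some other HTML code</h1>\n<!--- END bar -->\n";

// In the original class this would be assigned to $this->_tstack.
$tstack = parse_bulk_templates($template);
```

Because the template bodies are only ever treated as strings, a stray quote, backslash or "<" in the input cannot cause a parse error. A block whose BEGIN and END names do not match is simply not captured.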