In the database I have some content like this:
Some text
<pre>
#include <cstdio>
int x = 1;
</pre>
Some text
When I try to parse it with phpQuery, it fails because <cstdio> is interpreted as a tag.
I could use htmlspecialchars, but to apply it only inside pre tags I would still need to do some parsing. I could use a regex, but that would be much more difficult (I would need to handle the possible attributes of the pre tag), and the whole point of using a parser was to avoid this kind of regex work.
What's the best way to do this?
I finally went the regex way, considering only simple attributes for the pre tag (no '>' inside the attributes):
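foreach (array('pre', 'code') as $sTag) {
    $s = preg_replace_callback("#\<($sTag)([^\>]*?)\>(.+?)\<\/$sTag\>#si",
        function ($matches) {
            // Decode entities that are already present so they are not encoded twice.
            // '&amp;' is decoded last so that '&amp;lt;' becomes '&lt;' and not '<'.
            $matches[3] = str_replace(array('&lt;', '&gt;', '&amp;'), array('<', '>', '&'), $matches[3]);
            // $matches[2] already carries its own leading space, if there are attributes.
            return "<{$matches[1]}{$matches[2]}>" . htmlentities($matches[3], ENT_COMPAT, "UTF-8") . "</{$matches[1]}>";
        },
        $s);
}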
It also handles characters that have already been converted to HTML entities (we don't want to encode them twice).
It's not a perfect solution, but given the data I need to apply it to, it will do the job.
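
For reference, here is a minimal sketch of the same idea wrapped in a function. It relies on the fourth parameter of htmlspecialchars (double_encode) set to false, so existing entities are left alone instead of being decoded by hand first. The function name escapeCodeBlocks is hypothetical, it uses htmlspecialchars rather than htmlentities, and it makes the same assumption as above (no '>' inside the attributes), so treat it as an illustration rather than a drop-in replacement.

```php
<?php
// Hypothetical helper (not part of phpQuery): escapes the content of
// <pre> and <code> elements so that a parser no longer sees text such as
// <cstdio> as a tag. Assumes attributes never contain '>'.
function escapeCodeBlocks(string $html, array $tags = ['pre', 'code']): string
{
    foreach ($tags as $tag) {
        // Group 1: tag name, group 2: attributes (possibly empty), group 3: content.
        $pattern = '#<(' . preg_quote($tag, '#') . ')((?:\s[^>]*)?)>(.*?)</\1>#si';
        $html = preg_replace_callback(
            $pattern,
            function (array $m): string {
                // double_encode = false keeps entities such as &lt; as they are,
                // so already-escaped content is not escaped a second time.
                $body = htmlspecialchars($m[3], ENT_COMPAT, 'UTF-8', false);
                return "<{$m[1]}{$m[2]}>{$body}</{$m[1]}>";
            },
            $html
        );
    }
    return $html;
}
```

It would be called as $s = escapeCodeBlocks($s); before handing $s to the parser. As with the regex above, a code element nested inside a pre element is escaped along with the rest of the pre content, so this only suits data where the two are not nested.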