Tags: php, pdo, preg-replace, special-characters, xml-entities

Only convert <, >, &, ' and " for XML?


It seems I have another problem with special characters, double quotes and so on, following an earlier question of mine that has since been solved.

I used to use this function, which converts symbols like '&' to numeric character references for XML:

function convert_specialchars_to_xmlenties($string) 
{ 

    # to convert <, >, &, ' and ", include them in the square brackets: [<\'"&>\x80-\xff]
    $output = preg_replace('/([<\'"&>\x80-\xff])/e', "'&#' . ord('$1') . ';'", $string);

    # return the result
    return $output; 
}
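A side note on the function above: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the code as posted only runs on older versions. A minimal sketch of the same replacement using preg_replace_callback(), assuming the same character class, could look like this (the _cb suffix on the function name is made up for this illustration):

```php
<?php
// Hypothetical callback-based variant of the function above.
// It uses the same character class, but a closure instead of the removed /e modifier.
function convert_specialchars_to_xmlenties_cb(string $string): string
{
    return preg_replace_callback(
        '/[<\'"&>\x80-\xff]/',
        // $m[0] is the single byte that matched; ord() gives its numeric value.
        function (array $m): string {
            return '&#' . ord($m[0]) . ';';
        },
        $string
    );
}

echo convert_specialchars_to_xmlenties_cb('Tom & "Jerry"'), "\n";
// prints: Tom &#38; &#34;Jerry&#34;
```

Because the pattern has no /u modifier, it still works byte by byte, exactly like the original.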

So if my input is Judge-Fürstová Mila & Judge-Fürstová Mila

I will get Judge-F&#252;rstov&#225; Mila &#38; Judge-F&#252;rstov&#225; Mila

But I am now using PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8" to handle all my special characters, so if my input is something like

Judge-Fürstová Mila & Judge-Fürstová Mila

it now returns:

Judge-F&#195;&#188;rstov&#195;&#161; Mila &#38; Judge-F&#195;&#188;rstov&#195;&#161; Mila

Which is incorrect for XML, I think.
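The two outputs differ because ord() looks at single bytes. In Latin-1, ü is the one byte 0xFC (252). In UTF-8 it is the two bytes 0xC3 0xBC (195 and 188), and each byte gets its own numeric reference. A small sketch that makes this visible, with the bytes written out explicitly so the result does not depend on how the script file is saved (byte_values is a hypothetical helper name):

```php
<?php
// Hypothetical helper: return the numeric value of every byte in a string.
function byte_values(string $s): array
{
    return array_map('ord', str_split($s));
}

// "ü" as a single Latin-1 byte.
echo implode(' ', byte_values("\xFC")), "\n";      // prints: 252

// "ü" as the two-byte UTF-8 sequence.
echo implode(' ', byte_values("\xC3\xBC")), "\n";  // prints: 195 188
```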

So I think I should convert only <, >, &, ' and ", and leave other special characters like ü or á alone.

Any ideas how I can do this? Or maybe I have misunderstood the problem and there are better ways to solve it?

EDIT:

I was wrong. I just changed the function so that it only converts <, >, &, ' and ":

$output = preg_replace('/([<\'"&>])/e', "'&#' . ord('$1') . ';'", $string);

XML still does not accept the converted output below:

Judge-Fürstová Mila &#38; Judge-Fürstová Mila

I cannot think of any other reason why it does that! Any ideas?


Solution

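  • You want htmlspecialchars(). Don't let the name throw you off: it converts only &, <, > and the quote characters, and leaves characters such as ü or á alone. Single quotes are converted only when ENT_QUOTES is in effect, which is the default only from PHP 8.1 onward, so pass that flag explicitly if you need ' escaped.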
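A minimal sketch of that suggestion, assuming the input is valid UTF-8 and PHP 5.4 or later. The wrapper name xml_escape is hypothetical, and the ENT_QUOTES | ENT_XML1 flags are one reasonable choice for XML output rather than something stated in the answer:

```php
<?php
// Hypothetical wrapper around htmlspecialchars() for XML text and attribute values.
function xml_escape(string $text): string
{
    // ENT_QUOTES: escape both single and double quotes.
    // ENT_XML1:   use XML's predefined entities (&apos; instead of &#039;).
    // 'UTF-8':    treat the input as UTF-8, so ü and á pass through untouched.
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// "ü" and "á" written as explicit UTF-8 bytes so the file's encoding does not matter.
$name = "Judge-F\xC3\xBCrstov\xC3\xA1 Mila";

echo xml_escape($name . ' & ' . $name), "\n";
// prints: Judge-Fürstová Mila &amp; Judge-Fürstová Mila
```

The non-ASCII characters stay as raw UTF-8 bytes, so the XML document itself should declare or default to UTF-8 encoding for a parser to read them correctly.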