php html parsing markdown htmlpurifier

Result from PHP markdown parser is string, not valid html


Goal

I am trying to display links that a user has entered, as either markdown or HTML, in a description. The description is saved in a database, and when it's read back I want to render it as a clickable link rather than as the literal markup or markdown.

Problem

I'm using HTML Purifier to process the description that is stored in the database. When I run the string through the purifier, the page does not render the result as HTML. Instead, it shows the correct HTML as literal text.

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'a[href]');
$config->set('AutoFormat.Linkify', true);
$config->set('HTML.TargetBlank', true);
$config->set('HTML.TargetNoreferrer', true);

// $subrow['description'] holds the description loaded from the database

$purifier = new HTMLPurifier($config);
printf("<br />%s<br />", $purifier->purify($subrow['description']));

Currently the page displays the tags as literal text: "A link <a href="https://url.com">my link</a>"

[Screenshot from the Chrome dev tools]


Solution

  • I think the string is being entity-encoded somewhere, possibly inside the purifier. That is only a guess, as I have never used it. I was able to reproduce your result with the following:

    $test = htmlentities("A link <a href=\"https://url.com/my link\">mylink</a>");
    printf('<br />%s<br />', $test);
    

    To get the valid markup back, I used html_entity_decode():

    printf('<br />%s<br />', html_entity_decode($test));
    

    So in your case, try:

    printf("<br />%s<br />", html_entity_decode($purifier->purify($subrow['description'])));
    

    Does that help?
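
The effect described in the answer can be reproduced with PHP's built-in functions alone, without HTML Purifier. The sketch below assumes the description was entity-encoded before it was stored (for example with htmlspecialchars()), which the question does not confirm. The sample string and the variable names are made up for illustration.

```php
<?php
// Assumption: the description was entity-encoded when it was saved.
// $stored is a hypothetical stand-in for $subrow['description'].
$original = 'A link <a href="https://url.com">my link</a>';
$stored = htmlspecialchars($original, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Printed as-is, the entities reach the browser, which then shows
// the tags as literal text instead of rendering a link.
$asText = sprintf('<br />%s<br />', $stored);

// Decoding turns the entities back into real markup.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$asMarkup = sprintf('<br />%s<br />', $decoded);

echo $asText, PHP_EOL;
echo $asMarkup, PHP_EOL;
```

If that assumption holds, it may be safer to decode first and purify second, i.e. $purifier->purify(html_entity_decode($subrow['description'])). Decoding after purification would also turn any escaped markup that the purifier passed through as plain text back into live HTML.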