
PHP getElementsByTagName with nodeValue returns evil characters


I have some UTF-8 HTML like this:

<a href="http://example.com">Today&nbsp;11:12&nbsp;AM</a>

And getElementsByTagName('a')->item(0)->nodeValue returns this:

Today 11:12 AM

I am not having any problems with other nodes in this html.

What am I doing wrong?


Solution

  • The source documents are generated by ASP on IIS.

    I ended up using this for the offending characters:

    str_replace( chr(), chr(), $html);
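
The character codes passed to chr() in the call above are not given, so the following is only a minimal sketch under one assumption: that the offending character is the non-breaking space produced by &nbsp;. DOMDocument decodes that entity to U+00A0, and nodeValue returns it as the two UTF-8 bytes 0xC2 0xA0. If the result is then displayed as Latin-1, the 0xC2 byte usually shows up as a stray "Â". The variable names and the sample input here are illustrative, not taken from the original answer.

```php
<?php
// Sample input modelled on the snippet in the question (illustrative only).
$html = '<a href="http://example.com">Today&nbsp;11:12&nbsp;AM</a>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// DOM string properties are returned as UTF-8, so each &nbsp;
// arrives here as the byte pair 0xC2 0xA0.
$value = $doc->getElementsByTagName('a')->item(0)->nodeValue;

// Assumption: the "evil" character is U+00A0 (non-breaking space).
$nbsp = "\xC2\xA0";

// Replace it with a plain ASCII space on the extracted text.
$clean = str_replace($nbsp, ' ', $value);

// Hex dump of the raw value, useful for confirming which bytes are present.
echo bin2hex($value), "\n";
echo $clean, "\n";
```

Replacing on the extracted text rather than on the raw $html keeps the markup itself untouched. If the goal is only correct display, sending the page as UTF-8 (for example with a Content-Type header that declares charset=utf-8) may make the replacement unnecessary.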