I have encountered some odd characters that do not display properly in Internet Explorer, such as these: “, –, and ’. I think they're carried over from copy-and-paste Word content.
I am using XSLT to build the page content, and it would be great to detect these characters in the XSLT and replace them with valid HTML codes. I already do string replacement in the stylesheet, but I'm not sure how to detect these encoded characters, or whether it's even possible.
What about simply changing the encoding of the stylesheet, as well as its output, to UTF-8? The characters you mention are “, –, and ’. They are certainly not invalid, given the correct encoding (they are at least perfectly valid in code page 1252).
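As a rough illustration of the encoding suggestion, here is a minimal stylesheet sketch that declares UTF-8 both for the stylesheet file itself and for the serialized output. The input element names (`page`, `title`, `body`) are made up for the example and are not taken from your documents, so adjust the match patterns to your real input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The XML declaration above states the encoding of this stylesheet file;
     the file must actually be saved as UTF-8 for it to be true. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to serialize the result as HTML encoded in UTF-8. -->
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical input structure: <page><title/><body/></page> -->
  <xsl:template match="/page">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <!-- Copy the body content through unchanged, curly quotes included. -->
        <xsl:copy-of select="body/node()"/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With `method="html"` and a `head` element present, the XSLT 1.0 specification says the processor should add a `meta` element naming the encoding actually used, so the browser has a chance to decode the page correctly. The HTTP `Content-Type` header sent by the server needs to agree with it. If you would rather have the characters escaped, changing the attribute to `encoding="us-ascii"` should make a conforming serializer write any non-ASCII character as a character reference (for example `&#8220;` for “), without any string replacement in the templates. Either way, this only helps if the source XML is itself read with the correct encoding.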