I am cleaning an HTML file using HTML Tidy (specifically the .NET version, TidyManaged), and my "£" symbols are being converted to "?".
For example:
Income (£)
becomes:
Income (�)
I believe it is to do with encoding types. In TidyManaged, one can specify the input encoding type and output encoding type, including such things as Latin1, utf8, utf16, win1252.
The XHTML doc will ultimately get converted into a DOC, which uses win1252.
So what should my input and output encoding be to preserve £ symbols?
Many thanks.
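The symptom described above can be reproduced outside Tidy. The following is a minimal sketch in Python (chosen only for illustration; it does not use TidyManaged, and the variable names are made up) of what happens when the bytes for "£" are read or written with the wrong encoding. It assumes the source bytes are win1252, which the question does not confirm.

```python
# Illustration only: not TidyManaged code. Variable names are hypothetical.
pound = "\u00a3"  # the pound sign, Unicode code point U+00A3

# The same character is a different byte sequence in each encoding.
win1252_bytes = pound.encode("cp1252")  # single byte 0xA3
utf8_bytes = pound.encode("utf-8")      # two bytes 0xC2 0xA3

# Reading win1252 bytes as if they were UTF-8: a lone 0xA3 is not a valid
# UTF-8 sequence, so the decoder substitutes U+FFFD (the replacement char).
misread = win1252_bytes.decode("utf-8", errors="replace")

# Writing to an encoding that has no pound sign at all substitutes "?".
ascii_out = pound.encode("ascii", errors="replace")

# Decoding with the encoding the bytes were actually written in is lossless.
round_trip = win1252_bytes.decode("cp1252")

print(repr(misread), ascii_out, repr(round_trip))
```

If the input and output encodings given to the tool match the real encoding of the bytes at each stage, the character should survive as in the round-trip case.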
Well, when I've used other character sets, the result has always been different. I'm not fluent in them, but I do know that for symbols and punctuation you can use a 'code' rather than the literal character. I've never used win1252, but Google says the pound sign is 0x00A3 (that is its Unicode code point, U+00A3; in win1252 it is the single byte 0xA3).
Try putting that somewhere in your document.
I know in HTML I would put &pound;
for a pound sign. So the HTML would be:
<p>&pound;0.00</p>
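The idea of using a code instead of the literal character can be sketched as a pre-processing step. This is only an illustration in Python, not something TidyManaged provides; `to_ascii_entities` is a hypothetical helper name. It relies on Python's built-in `xmlcharrefreplace` error handler, which emits numeric references such as `&#163;` rather than named ones.

```python
import html


def to_ascii_entities(markup: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration. The result is pure ASCII, so it
    reads the same under Latin1, win1252, and UTF-8.
    """
    return markup.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


escaped = to_ascii_entities("<p>\u00a30.00</p>")
print(escaped)

# A browser or HTML parser turns the reference back into the character;
# html.unescape does the same for both the numeric and the named form.
print(html.unescape("&#163;0.00") == html.unescape("&pound;0.00"))
```

Because the escaped markup contains only ASCII bytes, a mismatch between input and output encoding can no longer corrupt the pound sign, although the cleaning tool may still choose to convert the reference back to a literal character on output.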