Is there a way in C# to convert Unicode strings into ASCII plus HTML entities, and then back again? In PHP, I can do it like this:
<?php
// RUN ME AT COMMAND LINE
$sUnicode = '<b>Jöhan Strauß</b>';
echo "UNICODE: $sUnicode\n";
$sASCII = mb_convert_encoding($sUnicode, 'HTML-ENTITIES','UTF-8');
echo "ASCII: $sASCII\n";
$sUnicode = mb_convert_encoding($sASCII, 'UTF-8', 'HTML-ENTITIES');
echo "UNICODE (TRANSLATED BACK): $sUnicode\n";
Background:
Yes, there's Encoding.Convert, although I rarely use it myself:
using System.Text;

string text = "<b>Jöhan Strauß</b>";
// Encoding.ASCII cannot represent 'ö' or 'ß'; by default each becomes '?'.
byte[] ascii = Encoding.ASCII.GetBytes(text);
byte[] utf8 = Encoding.Convert(Encoding.ASCII, Encoding.UTF8, ascii);
I rarely find I want to convert from one encoded form to another; it's much more common to perform a one-way conversion from text to binary (Encoding.GetBytes) or vice versa (Encoding.GetString).
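Note that none of the Encoding calls produce HTML entities: Encoding.ASCII simply substitutes '?' for anything outside the ASCII range. If you want something closer to the PHP round trip, one possible approach is to write the non-ASCII code points out as numeric character references yourself and let WebUtility.HtmlDecode turn them back. The following is only a rough sketch under some assumptions: HtmlEntityAscii, ToAsciiEntities and FromAsciiEntities are hypothetical names of my own, the input is assumed to be well-formed UTF-16 (a lone surrogate makes char.ConvertToUtf32 throw), and it emits numeric references such as &#246; rather than the named entities PHP's HTML-ENTITIES mode produces.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

// Hypothetical helper class; not part of the .NET framework.
static class HtmlEntityAscii
{
    // Replaces every code point above 127 with a numeric character
    // reference (&#N;) and leaves ASCII, including markup, untouched.
    public static string ToAsciiEntities(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            // ConvertToUtf32 combines a surrogate pair into one code point;
            // it throws on a lone surrogate (assumed not to occur here).
            int codePoint = char.ConvertToUtf32(text, i);
            if (char.IsHighSurrogate(text[i]))
            {
                i++; // skip the low surrogate already consumed above
            }

            if (codePoint > 127)
            {
                sb.Append("&#")
                  .Append(codePoint.ToString(CultureInfo.InvariantCulture))
                  .Append(';');
            }
            else
            {
                sb.Append((char)codePoint);
            }
        }
        return sb.ToString();
    }

    // Decodes numeric and named entities back into Unicode text.
    public static string FromAsciiEntities(string ascii)
    {
        return WebUtility.HtmlDecode(ascii);
    }
}
```

With the string from the question, ToAsciiEntities("<b>Jöhan Strauß</b>") should give "<b>J&#246;han Strau&#223;</b>", and FromAsciiEntities restores the original. One caveat: WebUtility.HtmlDecode also decodes entities that were already present in the input (for example &amp;amp; or &amp;lt;), so the round trip is only exact for text that did not contain entities to begin with.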