Say I have an encoding:
Encoding enc;
By the time this encoding is passed to me, it is configured to emit a BOM. I'm not interested in BOMs; in my system, encodings are communicated via headers.
Assuming encodings are immutable, I'd like to create a new encoding that matches the existing one exactly but no longer emits a BOM.
This is so I can avoid the following mismatch:
var data = "áéíóúñ";
var enc = Encoding.UTF8;
long count1 = (long) enc.GetByteCount(data);
long count2;
using (var ms = new MemoryStream())
using (var sw = new StreamWriter(ms, enc))
{
    sw.Write(data);
    sw.Flush();
    count2 = ms.Length;
}
count1.Dump(); //12
count2.Dump(); //15 , oops... BOM was also written
For UTF-8, construct the encoding yourself with the BOM flag turned off:
var enc = new UTF8Encoding(false); // UTF-8 without BOM
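To check that a BOM-less encoding removes the mismatch from the question, the measurement can be wrapped in a small helper. This is only a sketch: `BomCheck` and `WrittenLength` are names made up for illustration, not part of the framework.

```csharp
using System;
using System.IO;
using System.Text;

static class BomCheck
{
    // Hypothetical helper: returns how many bytes a StreamWriter
    // actually emits for the given text, preamble (BOM) included.
    public static long WrittenLength(string text, Encoding enc)
    {
        using (var ms = new MemoryStream())
        using (var sw = new StreamWriter(ms, enc))
        {
            sw.Write(text);
            sw.Flush(); // push buffered characters into the stream
            return ms.Length;
        }
    }
}
```

With the string from the question, `BomCheck.WrittenLength(data, new UTF8Encoding(false))` should agree with `GetByteCount(data)` (12), whereas passing `Encoding.UTF8` gives 15 because of the 3-byte BOM.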
If you don't know the encoding in advance, then you need a bit of extra logic, e.g.
switch (enc.CodePage) {
    case 65001: // UTF-8
        enc = new UTF8Encoding(false);
        break;
    case 1200:  // UTF-16, little endian
        enc = new UnicodeEncoding(false, false);
        break;
    case 1201:  // UTF-16, big endian
        enc = new UnicodeEncoding(true, false);
        break;
    case 12000: // UTF-32, little endian
        enc = new UTF32Encoding(false, false);
        break;
    case 12001: // UTF-32, big endian
        enc = new UTF32Encoding(true, false);
        break;
    default:
        // pass through the original enc unchanged
        break;
}
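The same mapping could be packaged as a reusable extension method. The sketch below is one possible shape, not a framework API: `EncodingExtensions` and `WithoutBom` are hypothetical names. It uses `GetPreamble()` to skip encodings that emit no BOM in the first place. Note that the replacement encodings are created with default settings, so any custom encoder or decoder fallback on the original is not carried over.

```csharp
using System;
using System.Text;

static class EncodingExtensions
{
    // Hypothetical helper: returns an equivalent encoding that emits no BOM.
    // Encodings other than UTF-8/16/32 are returned unchanged.
    public static Encoding WithoutBom(this Encoding enc)
    {
        if (enc == null) throw new ArgumentNullException(nameof(enc));

        // An empty preamble means no BOM is written; nothing to do.
        if (enc.GetPreamble().Length == 0) return enc;

        switch (enc.CodePage)
        {
            case 65001: return new UTF8Encoding(false);           // UTF-8
            case 1200:  return new UnicodeEncoding(false, false); // UTF-16 LE
            case 1201:  return new UnicodeEncoding(true, false);  // UTF-16 BE
            case 12000: return new UTF32Encoding(false, false);   // UTF-32 LE
            case 12001: return new UTF32Encoding(true, false);    // UTF-32 BE
            default:    return enc; // pass through unchanged
        }
    }
}
```

Usage would then be `new StreamWriter(ms, enc.WithoutBom())`, which leaves the caller's original encoding object untouched.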