I need to create a System.String from a file in some unknown ASCII-compatible single-byte encoding, so that I can replace some numbers in the text with a regex. However, Encoding.ASCII is 7-bit and UTF-8 is multi-byte, so neither will round-trip back to the same byte sequence.
Is there an encoding in .NET Core that can round-trip any byte sequence?
UPD: The Windows-1256 character set looks promising, but it is Windows-only.
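ISO-8859-1 maps every byte value 0x00..0xFF directly to the Unicode code point with the same value (U+0000..U+00FF, the Basic Latin and Latin-1 Supplement blocks) and back again, so it round-trips. It is also one of the encodings .NET Core supports by default.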
// C#
using System.Linq;
using System.Text;

var enc = Encoding.GetEncoding(28591); // ISO-8859-1 (code page 28591)
var b = Enumerable.Range(0, 0xFF + 1).Select(x => (byte)x).ToArray(); // every byte value 0x00..0xFF
bool roundTrips = enc.GetBytes(enc.GetString(b)).SequenceEqual(b); // true
Moreover, each char has the same numeric value as the byte it was decoded from:
// F#
open System
open System.Text

let bytes = [| Byte.MinValue .. Byte.MaxValue |]
let chars = Encoding.Latin1.GetChars bytes // Encoding.Latin1 is available since .NET 5
Array.map byte chars = bytes
// val it: bool = true
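To tie this back to the question, here is a minimal C# sketch of the whole workflow: decode the bytes as ISO-8859-1, run the regex, and encode back. The class name `Latin1RoundTrip`, the helper `ReplaceNumbers`, and the sample bytes are hypothetical and only for illustration. It assumes the replacement text contains only characters up to U+00FF; anything above that cannot be encoded as ISO-8859-1.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class Latin1RoundTrip
{
    // Hypothetical helper: decodes the bytes as ISO-8859-1, rewrites every
    // run of ASCII digits with the given function, and encodes the result
    // back to bytes. Bytes outside the matches come back unchanged.
    public static byte[] ReplaceNumbers(byte[] input, Func<string, string> replace)
    {
        Encoding latin1 = Encoding.GetEncoding(28591); // ISO-8859-1
        string text = latin1.GetString(input);
        // [0-9] keeps the match restricted to ASCII digits.
        string edited = Regex.Replace(text, "[0-9]+", m => replace(m.Value));
        return latin1.GetBytes(edited);
    }

    static void Main()
    {
        // 'A', 0xE9 (meaning depends on the unknown encoding), "12", 0xFF
        byte[] input = { 0x41, 0xE9, 0x31, 0x32, 0xFF };
        byte[] output = ReplaceNumbers(input, s => new string('9', s.Length));
        Console.WriteLine(BitConverter.ToString(output)); // 41-E9-39-39-FF
    }
}
```

The non-ASCII bytes 0xE9 and 0xFF pass through untouched because the string is only a carrier for the byte values; the actual encoding of the file never needs to be known.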