Please look at the code below. This program simply saves a 33-character string made up entirely of '!' characters, followed by a single byte with the value 33.
using System;
using System.IO;
using System.Text;

namespace Test
{
    internal class Program
    {
        static void Main(string[] args)
        {
            string filepath = args[0];
            using (var stream = File.Open(filepath, FileMode.Create))
            {
                using (var writer = new BinaryWriter(stream, Encoding.UTF8, false))
                {
                    writer.Write(new string('!', 33)); // writes the string
                    writer.Write((byte)33);            // writes one raw byte
                }
            }
            using (var stream = File.Open(filepath, FileMode.Open))
            {
                using (var reader = new BinaryReader(stream, Encoding.UTF8, false))
                {
                    Console.WriteLine(reader.ReadString());
                    Console.WriteLine(reader.ReadByte());
                }
            }
            Console.ReadKey();
        }
    }
}
And here is the binary representation of it (the hex dump in the original image shows 35 bytes, every one of them 0x21):
Apparently, the first 0x21 is the length of the string - but how on earth does C# know?
More than a year after my original question, I finally found the answer: when BinaryWriter writes a string, it first writes the length of that string.
using static System.Text.Encoding;

var test = new string('!', 33); // the same 33-character string the program above writes
Console.WriteLine(UTF8.GetBytes(test).Length); // Prints 33 - the string itself is only 33 bytes in UTF-8.
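To confirm where the extra byte comes from, here is a minimal sketch of my own (not part of the original question) that serializes the same string to a MemoryStream and inspects the result:

using System;
using System.IO;
using System.Text;

var test = new string('!', 33);
using var memory = new MemoryStream();
using (var writer = new BinaryWriter(memory, Encoding.UTF8, leaveOpen: true))
{
    writer.Write(test); // Write(string) prefixes the payload with its encoded byte length
}
byte[] bytes = memory.ToArray();
Console.WriteLine(bytes.Length); // 34 = 1 length-prefix byte + 33 string bytes
Console.WriteLine(bytes[0]);     // 33 (0x21) - the length prefix, coincidentally the same value as '!'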
In the original question, the first 0x21 is the length of the string - it just happens to coincide with the ASCII code for '!', which is also 33 (0x21).
Notice that in the original image, the hex values are a total of 35 bytes long! This is because the file contains:

- the length prefix written by Write(string) (x1 byte, value = 33)
- the string content itself (x33 bytes, each one '!' = 33)
- the Write((byte)33) output (x1 byte, value = 33)
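To double-check that 35-byte total, here is a small sketch of my own that dumps the raw bytes of whatever file the program above produced ("test.bin" below is just a placeholder path):

using System;
using System.IO;

// "test.bin" is a placeholder - substitute the path the program above was given
byte[] raw = File.ReadAllBytes("test.bin");
Console.WriteLine(raw.Length);                 // 35
Console.WriteLine(BitConverter.ToString(raw)); // 21-21-21-...-21, thirty-five 0x21 bytes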
Notice what happens when the length of the string no longer fits in a single byte:
using System;
using System.IO;
using static System.Text.Encoding;

var longString = new string('!', 256);
Console.WriteLine(UTF8.GetBytes(longString).Length); // 256 - the string encodes to 256 bytes

var memory = new MemoryStream();
var writer = new BinaryWriter(memory);
writer.Write(longString);
Console.WriteLine(memory.ToArray().Length); // 258 - the serialized bytes are 2 longer than the string!
The first three bytes from above are 128 2 33. The length 256 no longer fits in one byte, so it is written as a 7-bit encoded integer, low-order group first: 128 (0x80) carries the low 7 bits (all zero) with the high bit set to say "another byte follows", 2 contributes 2 << 7 = 256, and the third byte 33 (0x21) is already the first '!' of the string itself.
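The prefix uses the same 7-bit scheme documented for BinaryWriter's Write7BitEncodedInt. The helper below is my own hand-rolled decoder for illustration (not a BCL API), just to show how 128 2 turns back into 256:

using System;

// Each byte stores 7 bits of the value, least significant group first;
// the high bit (0x80) means "another byte follows".
static int Decode7BitLength(byte[] data, ref int pos)
{
    int value = 0;
    int shift = 0;
    byte b;
    do
    {
        b = data[pos++];
        value |= (b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) != 0);
    return value;
}

byte[] prefix = { 0x80, 0x02, 0x21 }; // the first three serialized bytes from above
int pos = 0;
Console.WriteLine(Decode7BitLength(prefix, ref pos)); // 256 - the string length
Console.WriteLine((char)prefix[pos]);                 // '!' - the first character of the string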
See this for a comprehensive overview: How BinaryWriter.Write() write string