I am parsing a large file and would like to monitor progress by showing how many bytes have been read. The actual code is massive, but this part is how I count:
StreamReader sr = new StreamReader(FilePath);
while ((line = sr.ReadLine()) != null )
{
//do parsing jobs
byteCnt += Convert.ToUInt64( line.Length * sizeof(char) );
}
Console.WriteLine(String.Format("{0:n0}", byteCnt) + " Bytes");
The file is 16.9 GB (18,186,477,492 bytes), but my program counts 34,816,805,164 bytes.
How could this happen? And how can I make this number more reasonable?
Thanks
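sizeof(char) is 2 in C# because a char is a 16-bit UTF-16 code unit, regardless of how the file on disk is encoded. If your file is not UTF-16, multiplying the line length by sizeof(char) will not be an accurate measure; for an ASCII file it roughly doubles the count. You can instead use, for example: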
System.Text.ASCIIEncoding.ASCII.GetByteCount(line);
// or another example:
Encoding.UTF8.GetByteCount(line);
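Either call returns the number of bytes the line occupies in that encoding, so pick the one that matches your file's actual encoding. Note that ReadLine strips the line terminator, so the total will still fall somewhat short of the file size unless you add those bytes back.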
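As a rough sketch of how the counting loop might look with this change, the following assumes the file is UTF-8 with "\n" line endings; both are assumptions, so adjust the encoding and the terminator size to your data. The class name and the command-line argument are hypothetical. It also prints the underlying stream position as an alternative progress measure; because StreamReader reads ahead in buffered chunks, that position can run slightly ahead of the line being parsed, which is usually fine for a progress display.

```csharp
using System;
using System.IO;
using System.Text;

class ByteProgress
{
    static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.WriteLine("usage: ByteProgress <file>");
            return;
        }

        string filePath = args[0];           // hypothetical: path passed on the command line
        Encoding encoding = Encoding.UTF8;   // assumption: the file is UTF-8
        const int newlineBytes = 1;          // assumption: "\n" endings; use 2 for "\r\n"
        long byteCnt = 0;

        using (var sr = new StreamReader(filePath, encoding))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // do parsing jobs

                // Bytes this line occupies on disk in the chosen encoding,
                // plus the terminator that ReadLine removed.
                byteCnt += encoding.GetByteCount(line) + newlineBytes;
            }

            // Alternative measure: how far the underlying stream has been read.
            Console.WriteLine("{0:n0} bytes (stream position)", sr.BaseStream.Position);
        }

        Console.WriteLine("{0:n0} bytes (counted per line)", byteCnt);
    }
}
```

The per-line count is still an approximation: a byte order mark at the start of the file is not counted, and a final line without a terminator is overcounted by the terminator size. If you only need a percentage for progress, comparing the stream position to the file length avoids guessing the encoding altogether.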