c# io byte streamreader

Count the actual bytes that have been read


I am parsing a large file, and I would like to monitor the progress by showing how many bytes have been read. The actual code is massive, but this part is how I count:

    StreamReader sr = new StreamReader(FilePath);
    while ((line = sr.ReadLine()) != null)
    {
        // do parsing jobs

        byteCnt += Convert.ToUInt64(line.Length * sizeof(char));
    }

    Console.WriteLine(String.Format("{0:n0}", byteCnt) + "  Bytes");

The file is 16.9 GB (18,186,477,492 bytes),

but my program counts 34,816,805,164 bytes.

How could this happen, and how can I get a more accurate number?

Thanks


Solution

  • sizeof(char) is 2 in C# because strings are held in memory as UTF-16, where each char takes two bytes. If your file is not UTF-16 encoded, line.Length * sizeof(char) will not match the number of bytes on disk. You can instead use, e.g.,

    System.Text.Encoding.ASCII.GetByteCount(line);
    // or another example:
    Encoding.UTF8.GetByteCount(line);
    

    to get the size in bytes. Pick the Encoding that matches the actual encoding of your file.
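As a rough sketch of the suggestion above, the loop from the question could count each line's size in the file's encoding instead of its in-memory size. The class and method names (`ByteProgress`, `CountLineBytes`) are hypothetical, and the encoding is assumed to be known in advance. Note that `ReadLine()` strips line terminators, so this count still comes out somewhat below the file size. That seems consistent with the numbers in the question: half of 34,816,805,164 is 17,408,402,582 characters, which is 778,074,910 less than the 18,186,477,492 bytes on disk, a gap that line endings in a mostly single-byte file could plausibly explain.

```csharp
using System;
using System.IO;
using System.Text;

public static class ByteProgress
{
    // Hypothetical helper: sums the encoded size of every line in the stream.
    // Assumption: 'encoding' matches the real encoding of the data.
    // Line terminators (and any byte order mark) are not included,
    // because ReadLine() does not return them.
    public static long CountLineBytes(Stream input, Encoding encoding)
    {
        long byteCount = 0;
        using (var reader = new StreamReader(input, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // do parsing jobs

                // Size of the line as stored in the file, not as held in memory.
                byteCount += encoding.GetByteCount(line);
            }
        }
        return byteCount;
    }
}
```

It could be called as `ByteProgress.CountLineBytes(File.OpenRead(FilePath), Encoding.UTF8)`. If the number is only needed for a progress display, reading `sr.BaseStream.Position` inside the loop may be enough: it reports how far the underlying file has been read, although it runs ahead of the lines already parsed because StreamReader reads in buffered chunks.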