Reliable approach to track new text in a text file


I'm trying to write a program that tracks changes in a list of text files (append-only changes). Working with the FileStream class, I get an ArgumentException saying "offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of source collection". I was also surprised to see that I cannot use a long for the offset. How do I read giant files, then? PS: There will never be a large amount of new text.

public partial class mainForm : Form
{
    FileSummary initialSnap;
    public mainForm()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        if (openFileDialog1.ShowDialog() == DialogResult.OK)
        {
            filePath_textBox.Text = openFileDialog1.FileName;
        }
    }

    private void checkButton_Click(object sender, EventArgs e)
    {
        if (initialSnap == null)
        {
            initialSnap = new FileSummary(filePath_textBox.Text);
            return;
        }

        FileSummary newSnap = new FileSummary(initialSnap.FullName);
        var dateBefore = initialSnap.LastWriteTime;
        var dateAfter = newSnap.LastWriteTime;
        if (dateBefore == dateAfter) return;

        var deltaLength = newSnap.Length - initialSnap.Length;
        var prevLength = (int)initialSnap.Length;
        using (FileStream fstream = File.OpenRead(initialSnap.FullName))
        {
            byte[] array = new byte[deltaLength];
            fstream.Read(array, prevLength, array.Length);
            string addedText = System.Text.Encoding.Default.GetString(array);
        }
    }
}

internal class FileSummary
{
    public FileSummary(string fileFullPath)
    {
        FullName = fileFullPath;
        FileInfo fi = new FileInfo(fileFullPath);
        LastWriteTime = fi.LastWriteTime;
        Length = fi.Length;
    }

    public string FullName { get; internal set; }
    public DateTime LastWriteTime { get; internal set; }
    public long Length { get; private set; }
}

Solution

  • The call to fstream.Read() in your code is incorrect. The second argument of Read is the offset into the target byte array at which the file contents will be placed; it is not the starting position in the file. Because prevLength plus array.Length exceeds the size of the array, the call throws ArgumentException.

    This line caused the error:

     fstream.Read(array, prevLength, array.Length);
    

    It must be changed to:

     fstream.Seek(prevLength , SeekOrigin.Begin);
     fstream.Read(array, 0, array.Length);
    

    So, to get only the portion that was newly written to the file, first seek to the previous end position, then read the delta into your byte array. Refer to the Read() documentation. That is also why Read() takes an int offset rather than a long: the offset applies to the array buffer, not to the file. The file position passed to Seek() is a long, so large files are not a problem.
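
The seek-then-read steps above can be wrapped in a small helper. This is only a minimal sketch, not the asker's code: the names AppendedTextReader and ReadAppendedText are hypothetical, and opening with FileShare.ReadWrite assumes another process may still be appending to the file. The loop is there because Read() may return fewer bytes than requested.

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendedTextReader
{
    // Hypothetical helper (not from the original post): returns the text that was
    // appended to the file at `path` since it was `previousLength` bytes long.
    public static string ReadAppendedText(string path, long previousLength, Encoding encoding)
    {
        // Assumption: another process may still be writing, so allow shared access.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long delta = stream.Length - previousLength;
            if (delta <= 0) return string.Empty; // nothing new, or the file was truncated

            // The file position is a long, so the old end can be far beyond 2 GB.
            stream.Seek(previousLength, SeekOrigin.Begin);

            // Assumption: the appended portion is small enough to fit in one array.
            var buffer = new byte[delta];
            int total = 0;
            while (total < buffer.Length)
            {
                // The second argument is the offset into `buffer`, not into the file.
                int read = stream.Read(buffer, total, buffer.Length - total);
                if (read == 0) break; // reached end of file early
                total += read;
            }
            return encoding.GetString(buffer, 0, total);
        }
    }
}
```

In checkButton_Click this could be called as AppendedTextReader.ReadAppendedText(initialSnap.FullName, initialSnap.Length, Encoding.Default), after which initialSnap would be replaced by newSnap so the next check starts from the new length.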