c# visual-studio compare stringcomparer

Compare two text files and show what is the SAME


I need help comparing two files. I can print the differences between the two, but I can't figure out how to print what is the SAME between them when they don't match. Can anyone assist me? Thanks in advance!

    private void button2_Click(object sender, EventArgs e)
    {
        try
        {
            //Reads files line by line
            var fileOld = File.ReadLines(Old);
            var fileNew = File.ReadLines(New);

            //Ignores cases in files
            IEnumerable<String> OldTxt = fileOld.Except(fileNew, StringComparer.OrdinalIgnoreCase);
            IEnumerable<String> NewTxt = fileNew.Except(fileOld, StringComparer.OrdinalIgnoreCase);

            FileCompare compare = new FileCompare();

            bool areSame = OldTxt.SequenceEqual(NewTxt);

            if (areSame == true)
            {
                MessageBox.Show("They match!");
            }
            else if(areSame == false)
            {
                // Find the set difference between the two files.  
                // Print to Not Equal.txt
                var difFromNew = (from file in OldTxt
                                  select file).Except(NewTxt);

                using (var file = new StreamWriter(NotEqual))
                {
                    foreach (var notEq in difFromNew)
                    {
                        file.WriteLine(notEq.ToString() + "\n", true);
                    }
                }

                MessageBox.Show("They are not the same! Please look at 'Not Equal.txt' to see the difference. Look at 'Equal.txt' to see what is the same at this time.");
            }

        }
        catch
        {
            MessageBox.Show("'NewConvert.txt' or 'OldConvert.txt' files does not exist.");
        }

    }

Solution

  • First, I think there may be a bug in your code: 'fileOld' contains the entire contents of the old file, while 'OldTxt' contains only the old-file lines that are not present in the new file. I come back to this, along with some other cleanup ideas, after the answer itself.

    I think you're looking for Intersect, which returns the items that two IEnumerables have in common:

    var commonItems = fileOld.Intersect(fileNew);
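
    As a small illustration of how Intersect behaves, here is a sketch on in-memory lines rather than files. The class and method names (IntersectDemo, CommonLines) are made up for this example, and it assumes you want the same case-insensitive comparison your question uses:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Hypothetical helper: returns the lines present in both inputs, ignoring case.
    // Intersect yields distinct values, in the order they appear in the first sequence.
    public static List<string> CommonLines(IEnumerable<string> oldLines, IEnumerable<string> newLines)
    {
        return oldLines.Intersect(newLines, StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        var oldLines = new[] { "Apple", "banana", "cherry" };
        var newLines = new[] { "BANANA", "cherry", "date" };

        // Prints "banana, cherry": the casing comes from the first (old) sequence.
        Console.WriteLine(string.Join(", ", CommonLines(oldLines, newLines)));
    }
}
```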
    

    Alternatively, since you already have a list of the differences captured in difFromNew, you could use Except again:

    var difFromNew = fileOld.Except(fileNew); // Note I fixed what I think is a bug here
    var commonItems = fileOld.Except(difFromNew);
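
    To show that the double Except really does give the common lines, here is a minimal sketch on in-memory data. ExceptTwiceDemo and CommonViaExcept are hypothetical names, and the sketch assumes you pass the same comparer to both calls (otherwise the two steps can disagree about what counts as a match):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptTwiceDemo
{
    // Hypothetical helper: finds the common lines by removing the old-only lines from the old lines.
    public static List<string> CommonViaExcept(IEnumerable<string> oldLines, IEnumerable<string> newLines)
    {
        var comparer = StringComparer.OrdinalIgnoreCase;

        // Lines that exist only in the old input.
        var onlyInOld = oldLines.Except(newLines, comparer).ToList();

        // Whatever is left after removing those is shared with the new input.
        return oldLines.Except(onlyInOld, comparer).ToList();
    }

    public static void Main()
    {
        var oldLines = new[] { "a", "B", "c" };
        var newLines = new[] { "b", "C", "d" };

        // Prints "B, c", the same result Intersect would give with this comparer.
        Console.WriteLine(string.Join(", ", CommonViaExcept(oldLines, newLines)));
    }
}
```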
    

    Some potential bugs:

    1. SequenceEqual requires not only that the items are the same, but also that they are in the same order. If you care about order, that is appropriate. The problem is that it makes showing the differences difficult, because the other methods you're using to compare the lists (Except and Intersect) ignore order and only check whether an item exists. So if fileOld contains the items cat | dog and fileNew contains dog | cat, SequenceEqual reports them as not equal, yet there are no differences to show (fileOld.Except(fileNew) will contain 0 items).
    2. To build your difFromNew list, you take OldTxt, which is the text unique to the old file, and do an Except against NewTxt, which is the text unique to the new file. There can be no overlap between these two, so the Except removes nothing: difFromNew contains exactly what OldTxt already contains.
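
    The cat | dog case from point 1 can be reproduced with a few lines. This is only a sketch on in-memory arrays, and OrderDemo with its two methods are names invented for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderDemo
{
    // Order-sensitive comparison: true only if both inputs have the same items in the same order.
    public static bool SameOrder(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.SequenceEqual(second);
    }

    // Order-insensitive difference: how many distinct items of 'first' are missing from 'second'.
    public static int CountOnlyInFirst(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Except(second).Count();
    }

    public static void Main()
    {
        var oldLines = new[] { "cat", "dog" };
        var newLines = new[] { "dog", "cat" };

        // Prints "SequenceEqual: False, only in old: 0"
        Console.WriteLine(
            $"SequenceEqual: {SameOrder(oldLines, newLines)}, only in old: {CountOnlyInFirst(oldLines, newLines)}");
    }
}
```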

    Here's one way you could go about getting the lists you're looking for:

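    var oldFilePath = "c:\\oldFile.txt";
    var newFilePath = "c:\\newFile.txt";
    // Placeholder output paths (used further down); point these wherever you want the reports written
    var notEqualFilePath = "c:\\Not Equal.txt";
    var equalFilePath = "c:\\Equal.txt";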
    
    var oldFileTxt = File.ReadLines(oldFilePath);
    var newFileTxt = File.ReadLines(newFilePath);
    
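    // The files match (ignoring case, order, and duplicates) only if neither one
    // contains a line that the other lacks, so check in both directions
    var filesAreSame = !oldFileTxt.Except(newFileTxt, StringComparer.OrdinalIgnoreCase).Any()
        && !newFileTxt.Except(oldFileTxt, StringComparer.OrdinalIgnoreCase).Any();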
    
    var commonItems = oldFileTxt.Intersect(newFileTxt, StringComparer.OrdinalIgnoreCase);
    var uniqueOldItems = oldFileTxt.Except(commonItems, StringComparer.OrdinalIgnoreCase);
    var uniqueNewItems = newFileTxt.Except(commonItems, StringComparer.OrdinalIgnoreCase);
    

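    Here's how the rest of the handler might look with these changes. The three lists are computed inside the else branch here, so if you combine this with the snippet above, keep only one set of declarations: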

    if (filesAreSame)
    {
        MessageBox.Show("They match!");
    }
    else
    {
        var commonItems = oldFileTxt.Intersect(newFileTxt, StringComparer.OrdinalIgnoreCase);
        var uniqueOldItems = oldFileTxt.Except(commonItems, StringComparer.OrdinalIgnoreCase);
        var uniqueNewItems = newFileTxt.Except(commonItems, StringComparer.OrdinalIgnoreCase);
    
        var notEqualsFileText = new StringBuilder();
        if (uniqueOldItems.Any())
        {
            notEqualsFileText.AppendLine(
                $"Entries in {oldFilePath} that are not in {newFilePath}:");
            notEqualsFileText.AppendLine(string.Join(Environment.NewLine, uniqueOldItems));
        }
        if (uniqueNewItems.Any())
        {
            notEqualsFileText.AppendLine(
                $"Entries in {newFilePath} that are not in {oldFilePath}:");
            notEqualsFileText.AppendLine(string.Join(Environment.NewLine, uniqueNewItems));
        }
    
        File.WriteAllText(notEqualFilePath, notEqualsFileText.ToString());
    
        var equalsFileText = new StringBuilder();
        if (commonItems.Any())
        {
            equalsFileText.AppendLine(
                $"Entries that are common in both {newFilePath} and {oldFilePath}:");
            equalsFileText.AppendLine(string.Join(Environment.NewLine, commonItems));
        }
        else
        {
            equalsFileText.AppendLine(
                $"There are no common entries in both {newFilePath} and {oldFilePath}.");
        }
    
        File.WriteAllText(equalFilePath, equalsFileText.ToString());
    
        MessageBox.Show("The files are not the same! Please look at 'Not Equal.txt' to see the difference. Look at 'Equal.txt' to see what is the same at this time.");
    }
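
    One more thing to be aware of: File.ReadLines is lazy, so each Except, Intersect, or Any call above reads the file again. If that matters to you, a sketch along these lines reads each input once and returns all three lists together. LineDiff and Compare are hypothetical names, not something from your code or the framework:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LineDiff
{
    // Hypothetical helper: returns the common lines plus the lines unique to each input,
    // compared case-insensitively and without regard to order or duplicates.
    public static (List<string> Common, List<string> OnlyOld, List<string> OnlyNew) Compare(
        IEnumerable<string> oldLines, IEnumerable<string> newLines)
    {
        var comparer = StringComparer.OrdinalIgnoreCase;

        // Materialize once so each source is enumerated (and each file read) a single time.
        var oldList = oldLines.ToList();
        var newList = newLines.ToList();

        var common = oldList.Intersect(newList, comparer).ToList();
        var onlyOld = oldList.Except(newList, comparer).ToList();
        var onlyNew = newList.Except(oldList, comparer).ToList();
        return (common, onlyOld, onlyNew);
    }

    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: LineDiff <oldFile> <newFile>");
            return;
        }

        var result = Compare(File.ReadLines(args[0]), File.ReadLines(args[1]));

        // The files match only when neither side has lines of its own.
        bool filesAreSame = result.OnlyOld.Count == 0 && result.OnlyNew.Count == 0;
        Console.WriteLine(filesAreSame
            ? "They match!"
            : $"Common: {result.Common.Count}, only old: {result.OnlyOld.Count}, only new: {result.OnlyNew.Count}");
    }
}
```

    From there, writing 'Equal.txt' and 'Not Equal.txt' is just a matter of passing result.Common, result.OnlyOld, and result.OnlyNew to File.WriteAllLines or a StringBuilder as shown above.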