
Compare two files and display which line is different


Hello, I have been working on this code.

I have two files, standard.txt and new.txt.

standard.txt has:

    ABC123
    ABC003
    ABC004

new.txt has:

    ABC123
    ABC004

I was able to display the differences between the files, but I would like to display which line each difference is on. If someone could take a look and perhaps show me what I am doing wrong, that would be very helpful. My code is:

def open_file_and_return_list(file_path):
    list = []
    with open(file_path, 'r') as f:
        line = f.readline()
        while line:
            list.append(line)
            line = f.readline()
    return list

def clean_new_line(list):
    for i in range(len(list)):
        if "\n" in list[i]:
            list[i] = list[i].replace("\n", "")
    return list


if __name__ == "__main__":
    list1 = open_file_and_return_list(r"C:\Users\a\a\b\file_compare\new.txt")
    list2 = open_file_and_return_list(r"C:\Users\a\a\b\file_compare\standard.txt")
    list1 = clean_new_line(list1)
    list2 = clean_new_line(list2)
    diff = []
    for obj in list1:
        if obj not in list2:
            diff.append(obj)
    for obj in list2:
        if obj not in list1:
            diff.append(obj)

    print(diff)

    diff_file = input("\nINFO: Select what to name the difference(s) : ")
    with open(diff_file, 'w') as file_out:
        for line in diff:
            file_out.write("** WARNING: Difference found in New Config:\n " + line + "\n")
            print("WARNING: Difference in file: " + line)

For example, the files I am comparing are two config files, so the differences might appear on different lines. I do not want to number each difference 1, 2, 3; instead I want to report the line it occurs on, for example: Difference found on Line 105: *****difference***

Maybe I need to do something like this?

for i, lines2 in enumerate(hosts1):
    if lines2 != lines1[i]:
        print("line", i, "in hosts1 is different\n")
        print(lines2)
    else:
        print("same")

and use enumerate?


Solution

  • enumerate and zip are your friends here. To obtain the differences I would do something like:

    # Pad the shorter list with empty strings so that zip covers every line
    maxl = max(len(list1), len(list2))
    list1 += [''] * (maxl - len(list1))
    list2 += [''] * (maxl - len(list2))

    # Number lines from 1 so the output matches editor line numbers
    for iline, (l1, l2) in enumerate(zip(list1, list2), start=1):
        if l1 != l2:
            print(iline, l1, l2)
    

    Also, (1) you should never use list as a variable name, since it shadows the built-in list class in Python, and (2) there is a one-liner for reading all the lines of a file:

    lines = open('path_to_file').read().splitlines()
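
Putting these pieces together, a minimal sketch might look like the following. The function names read_lines, positional_diff and aligned_diff are hypothetical, and the sample data is taken from the question. itertools.zip_longest stands in for the manual padding, and difflib.unified_diff is a standard-library alternative that is not part of the approach described here; it is included because a position-by-position comparison flags every line after an inserted or deleted one.

```python
from difflib import unified_diff
from itertools import zip_longest
from pathlib import Path


def read_lines(path):
    """Return the lines of a text file without trailing newlines."""
    return Path(path).read_text().splitlines()


def positional_diff(old_lines, new_lines):
    """Compare two lists of lines position by position.

    Returns (line_number, old_line, new_line) tuples, numbered from 1.
    A line missing from the shorter list is reported as None.
    """
    pairs = zip_longest(old_lines, new_lines)  # pads the shorter list with None
    return [
        (number, old, new)
        for number, (old, new) in enumerate(pairs, start=1)
        if old != new
    ]


def aligned_diff(old_lines, new_lines):
    """Return unified-diff lines, which account for inserted or deleted lines."""
    return list(
        unified_diff(old_lines, new_lines, "standard.txt", "new.txt", lineterm="")
    )


if __name__ == "__main__":
    # Sample data from the question, inlined so the sketch runs without files.
    standard = ["ABC123", "ABC003", "ABC004"]
    new = ["ABC123", "ABC004"]

    for number, old, new_line in positional_diff(standard, new):
        print(f"Difference found on line {number}: {old!r} != {new_line!r}")

    print("\n".join(aligned_diff(standard, new)))
```

With the sample data, the positional comparison reports lines 2 and 3, because removing ABC003 shifts every later line up by one. The difflib version reports only the removed ABC003 line.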