Mapping line numbers across two diff files using emacs/python/winmerge

Consider the following two files that are slightly different:

`foo` (old version):

<Line 1> a
<Line 2> b
<Line 3> c
<Line 4> d

`foo` (new version):

<Line 1> a
<Line 2> e
<Line 3> b
<Line 4> c
<Line 5> f
<Line 6> d

As you can see, the lines e and f were added in the new file.

I have a set of line numbers corresponding to the older file…say, 1, 3, and 4 (corresponding to letters a, c, and d).

Is there a way to map between these two files, so that I can get the line numbers of the corresponding lines in the newer file?

E.g., the result would be:

Old file line numbers (1,3,4) ===> New File line numbers (1,4,6)

Unfortunately, I only have Emacs (with a working ediff), Python, and WinMerge at my disposal.

Solution

What you need is a multi-pattern string search: the patterns are the lines from the old version of foo, and the text to search is the new version of foo. The Rabin-Karp algorithm is one approach to this sort of task: it compares hashes first and confirms a candidate match with a real comparison. I've adapted that idea to your problem, matching whole lines by content:

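def linematcher(haystack, needles, lineNumbers):
    # Collect the requested lines from the old file, remembering each one's number.
    with open(needles) as f:
        wanted = [(n, line.strip()) for n, line in enumerate(f, 1) if n in lineNumbers]
    oldNumbers = [n for n, _ in wanted]
    needles = [line for _, line in wanted]

    # Hash the needles once; compare hashes first, then confirm with a real comparison.
    hsubs = set(hash(s) for s in needles)
    with open(haystack) as f:
        for n, lineWithNewline in enumerate(f, 1):
            line = lineWithNewline.strip()
            hs = hash(line)
            if hs in hsubs and line in needles:
                print("{0} ===> {1}".format(oldNumbers[needles.index(line)], n))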

Assuming your two files are called old_foo.txt and new_foo.txt, you would call the function like this:

linematcher('new_foo.txt', 'old_foo.txt', [1, 3, 4])

When I tried it on your data, it printed:

1 ===> 1
3 ===> 4
4 ===> 6
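
One caveat: matching by content alone gets ambiguous if the same line text appears more than once in either file. If that matters for your data, a diff-style alignment may be safer. Here is a minimal sketch using `difflib.SequenceMatcher` from the standard library. The function name `map_line_numbers` and the choice to return `None` for lines with no counterpart are my own assumptions, not anything from your setup:

```python
import difflib


def map_line_numbers(old_lines, new_lines, line_numbers):
    """Map 1-based line numbers in old_lines to 1-based numbers in new_lines.

    Lines with no matching counterpart in the new file map to None.
    (Hypothetical helper, for illustration only.)
    """
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    mapping = {}
    # Each matching block says: old[a:a+size] == new[b:b+size].
    for old_start, new_start, size in matcher.get_matching_blocks():
        for offset in range(size):
            mapping[old_start + offset + 1] = new_start + offset + 1
    return {n: mapping.get(n) for n in line_numbers}


old = ["a", "b", "c", "d"]
new = ["a", "e", "b", "c", "f", "d"]
print(map_line_numbers(old, new, [1, 3, 4]))  # {1: 1, 3: 4, 4: 6}
```

To use it on real files, read each file into a list of lines first (for example with `f.read().splitlines()`). Because the alignment keeps lines in order, a repeated line is matched by its position in the diff rather than by its first occurrence.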
