Compare images line by line using vips

Background: I have images I need to compare for differences. The images are large (on the order of 1400x9000 px), machine-generated and highly constrained (screenshots of a particular piece of linear UI), and are expected to be nearly identical, with differences being one of the following three possibilities:

Image 1 has a section image 2 is missing
Image 1 is missing a section image 2 has
Both images have the given section, but its contents differ

I'm trying to build a tool that highlights the differences for a human reviewer, essentially an image version of line-oriented diff. To that end, I'm trying to scan the images line by line and compare them to decide whether the lines are identical.

My ultimate goal is an actual diff-like output, where it can detect that sections are missing/added/different and sync the images up as soon as possible for the remaining identical content. For the first cut, though, I'm going with a simpler approach where the two images are overlaid (alpha blended) and the lines which differ are highlighted with a particular colour (i.e. alpha-blended with a third line of solid colour).

At first I tried using the Python Imaging Library, but that was several orders of magnitude too slow, so I decided to try vips, which should be much faster. However, I have absolutely no idea how to express what I'm after using vips operations. The pseudocode for the simpler version would be essentially:

out = []
# image1 and image2 are expected, but not guaranteed to have the same height
# they are likely to have different heights if different
# most lines are entirely white pixels
for line1, line2 in zip(image1, image2):
    if line1 == line2:
        out.append(line1)
    else:
        # ALL_RED is a line composed of solid red pixels
        out.append(line1.blend(line2, 0.5).blend(ALL_RED, 0.5))

I'm using pyvips in my project, but I'm also interested in code using plain vips or any other bindings, since the operations are shared and easily translated across dialects.

Edit: adding sample images as requested

Edit 2: full size images with missing/added/changed sections:

Solution

How about just using diff? It's pretty quick. All you need to do is turn your PNGs into text a scanline at a time, then parse the diff output.

For example:

#!/usr/bin/env python3

import sys
import os
import re
import pyvips

# calculate a checksum for each scanline and write to name_out    
def scanline_checksum(name_in, name_out):
    a = pyvips.Image.new_from_file(name_in, access="sequential")
    # unfold colour channels to make a wider 1-band image
    a = a.bandunfold()
    # xyz makes an index image, where the value of each pixel is its coordinate
    b = pyvips.Image.xyz(a.width, a.height)
    # make a pow gradient image ... each pixel is some power of the x coordinate
    b = b[0] ** 0.5
    # now multiply and sum to make a checksum for each scanline
    # "project" returns sum of columns, sum of rows
    sum_of_columns, sum_of_rows = (a * b).project()
    sum_of_rows.write_to_file(name_out)

# write one checksum per scanline for each input image
scanline_checksum(sys.argv[1], "1.csv")
scanline_checksum(sys.argv[2], "2.csv")

os.system("diff 1.csv 2.csv > diff.csv")

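# diff prints one command line per hunk; this only picks out ranged
# "change" commands such as 264,272c264,272
with open("diff.csv", "r") as diff_output:
    for line in diff_output:
        match = re.match(r"(\d+),(\d+)c(\d+),(\d+)", line)
        if not match:
            continue
        print(line, end="")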
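As an aside on why the script multiplies by a gradient before summing: a plain sum of a scanline can't tell two lines apart if they hold the same values in a different order. Here's a tiny pure-Python sketch of the same weighting (value at column x times x ** 0.5). The function name `weighted_checksum` and the sample rows are made up for illustration and are not part of vips.

```python
import math

def weighted_checksum(row):
    # same weighting as the script: the value at column x is scaled by x ** 0.5
    return sum(value * math.sqrt(x) for x, value in enumerate(row))

# two hypothetical scanlines with the same values in a different order
row_a = [255, 255, 0, 0, 255]
row_b = [255, 0, 255, 0, 255]

plain_a, plain_b = sum(row_a), sum(row_b)
check_a, check_b = weighted_checksum(row_a), weighted_checksum(row_b)

print(plain_a == plain_b)   # plain sums cannot tell the rows apart
print(check_a == check_b)   # weighted sums can
```

One thing this makes visible: with this weighting, column 0 has weight 0 ** 0.5 = 0, so a difference confined to the very first value of a line would not change the checksum. If that matters for your images, adding 1 to the x coordinate before taking the power should avoid it. The script itself does all of this with vips operations over the whole image at once.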

For your two test images I see:

$ time ./diff.py 1.png 2.png 
264,272c264,272
351,359c351,359
real    0m0.346s
user    0m0.445s
sys     0m0.033s

That's on this elderly laptop. All you need to do is use those "change" commands to mark up your images.
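Since your images can also have added or missing sections, you'd probably want the "a" (add) and "d" (delete) commands too, plus single-line forms like 7c7, which the regex in the script skips. Here's a rough sketch of a parser for diff's normal output format. The names `DIFF_CMD` and `parse_diff_commands` and the sample lines are my own, for illustration only.

```python
import re

# normal-format diff command lines look like "8a12,15", "5,7c8,10" or "3d2"
DIFF_CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def parse_diff_commands(lines):
    """Yield (op, (first1, last1), (first2, last2)) for each command line.

    Numbers are 1-based and inclusive, as printed by diff. For "a" the first
    range is the line in file 1 after which lines were added; for "d" the
    second range is the line in file 2 after which lines went missing.
    """
    for line in lines:
        m = DIFF_CMD.match(line.strip())
        if not m:
            continue  # skip the "< ...", "> ..." and "---" body lines
        start1, end1, op, start2, end2 = m.groups()
        range1 = (int(start1), int(end1 or start1))
        range2 = (int(start2), int(end2 or start2))
        yield op, range1, range2

# made-up diff output covering all three command types
sample = [
    "264,272c264,272", "< 1234.5", "---", "> 1236.5",
    "400a401,420", "500,510d519", "7c7",
]
commands = list(parse_diff_commands(sample))
for op, range1, range2 in commands:
    print(op, range1, range2)
```

The CSV has one checksum per line, so diff line n should correspond to scanline y = n - 1 in the image. From there, "c" ranges are the lines to tint, while "a" and "d" tell you where one image needs to skip ahead to get back in sync with the other.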