Python difflib: highlighting differences inline?


When comparing similar lines, I want to highlight the differences on the same line:

a) lorem ipsum dolor sit amet
b) lorem foo ipsum dolor amet

lorem <ins>foo</ins> ipsum dolor <del>sit</del> amet

While difflib.HtmlDiff appears to do this sort of inline highlighting, it produces very verbose markup.

Unfortunately, I have not been able to find another class/method which does not operate on a line-by-line basis.

Am I missing anything? Any pointers would be appreciated!


Solution

  • For your simple example:

    import difflib

    def show_diff(seqm):
        """Unify operations between two compared strings.

        seqm is a difflib.SequenceMatcher instance whose a and b are strings.
        """
        output = []
        for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
            if opcode == 'equal':
                output.append(seqm.a[a0:a1])
            elif opcode == 'insert':
                output.append("<ins>" + seqm.b[b0:b1] + "</ins>")
            elif opcode == 'delete':
                output.append("<del>" + seqm.a[a0:a1] + "</del>")
            elif opcode == 'replace':
                raise NotImplementedError("what to do with 'replace' opcode?")
            else:
                raise RuntimeError("unexpected opcode")
        return ''.join(output)
    
    >>> sm = difflib.SequenceMatcher(None, "lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet")
    >>> show_diff(sm)
    'lorem<ins> foo</ins> ipsum dolor <del>sit </del>amet'
    

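    This works on strings and compares them character by character, which is why the neighbouring spaces end up inside the tags. You still need to decide how to handle "replace" opcodes, which SequenceMatcher emits when a run in a is substituted by a different run in b; as written, the function raises NotImplementedError for them.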
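
One possible way to get closer to the markup asked for in the question is to compare lists of words rather than characters, and to treat a "replace" as a deletion followed by an insertion. The following is only a sketch under those assumptions: show_word_diff is a hypothetical helper name, not part of difflib, and splitting on whitespace means runs of spaces in the input are collapsed to single spaces in the output.

```python
import difflib

def show_word_diff(a, b):
    """Word-level inline diff of two strings (illustrative sketch)."""
    # Compare sequences of words instead of sequences of characters.
    a_words, b_words = a.split(), b.split()
    seqm = difflib.SequenceMatcher(None, a_words, b_words)
    output = []
    for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
        if opcode == 'equal':
            output.extend(a_words[a0:a1])
        # Assumption: 'replace' is rendered as a delete followed by an insert.
        if opcode in ('delete', 'replace'):
            output.append("<del>" + ' '.join(a_words[a0:a1]) + "</del>")
        if opcode in ('insert', 'replace'):
            output.append("<ins>" + ' '.join(b_words[b0:b1]) + "</ins>")
    return ' '.join(output)

print(show_word_diff("lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet"))
# lorem <ins>foo</ins> ipsum dolor <del>sit</del> amet
```

Because the tags now wrap whole words, the whitespace stays outside them. If the exact original spacing matters, a tokenizer that keeps the whitespace as separate tokens would be needed instead of str.split().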