
Simple diff/patch script for sorted unique file


How could I write a simple diff and patch script for applying additions and deletions to a list of lines in a file?

This could be an original file (it is sorted and each line is unique):

a
b
d

A simple patch file could look like this (or something similarly simple):

+ c
+ e
- b

The resulting file should look like this (or be in any other order, since sort could be applied anyway):

a
c
d
e

The normal patch formats cannot be used, since they include context, which may change in this case.


Solution

  • Bash alternatives that read input files only once:

    To generate a patch, you can run:

    comm -3 a.txt b.txt | sed 's/^\t/+ /;t;s/^/- /'
    

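    Because comm separates the output columns for the two files with a tab, that tab can be used to detect whether a line should be added or removed: lines found only in b.txt are prefixed with a tab and become + lines, while lines found only in a.txt have no prefix and become - lines.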

    To apply a patch, you can run:

    { <patch.txt tee >(grep '^+ ' | cut -c3- >&5) |
    grep '^- ' | cut -c3- | comm -13 - a.txt; } 5> >(cat)
    

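    The tee splits the input, that is the patch file, into two streams. In the first stream the + lines are selected, stripped of their prefix, and written to file descriptor 5. File descriptor 5 is opened onto >(cat), so those lines simply end up on stdout. In the second stream the - lines are selected, stripped of their prefix, and passed to comm -13 together with a.txt, which prints only the lines of a.txt that are not marked for removal. Because the output should be line buffered, it should work. The two streams are written concurrently, so the result is not guaranteed to be sorted; pipe it through sort if the order matters.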
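As a rough round-trip sketch of the approach above, the following script builds a patch from the question's sample data and applies it again. It is a simpler variant, not the tee-based one-liner: it reads the patch file twice, sorts the deletions before handing them to comm, and sorts the final output. The function names make_patch and apply_patch and the temporary file names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Use one collation order for sort and comm so they agree on what "sorted" means.
LC_ALL=C
export LC_ALL

# make_patch OLD NEW (hypothetical helper)
# Prints "+ line" for lines only in NEW and "- line" for lines only in OLD.
# Both files must be sorted. Lines that themselves start with a tab are not handled.
make_patch() {
    tab=$(printf '\t')
    # comm -3 prefixes lines unique to NEW with a tab; lines unique to OLD have no prefix.
    comm -3 "$1" "$2" | sed -e "s/^${tab}/+ /" -e t -e 's/^/- /'
}

# apply_patch ORIGINAL PATCH (hypothetical helper)
# Prints ORIGINAL with the "- " lines removed and the "+ " lines added, sorted.
apply_patch() {
    {
        # Lines of ORIGINAL that are not listed as deletions.
        grep '^- ' "$2" | cut -c3- | sort | comm -13 - "$1"
        # Lines listed as additions.
        grep '^+ ' "$2" | cut -c3-
    } | sort -u
}

# Sample data from the question.
workdir=$(mktemp -d)
printf '%s\n' a b d > "$workdir/a.txt"
printf '%s\n' a c d e > "$workdir/b.txt"

patch=$(make_patch "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$patch" > "$workdir/patch.txt"

result=$(apply_patch "$workdir/a.txt" "$workdir/patch.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With the sample files, the generated patch contains the lines "- b", "+ c" and "+ e", and applying it prints a, c, d, e. Sorting the deletions inside apply_patch is a design choice that lets a hand-written patch list its lines in any order, at the cost of reading the patch file twice.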