
Git - Get only the new words from a commit compared to master commit/previous commit


How can I get only the new text added in a Git commit (compared to the previous commit or to master)? For example, let's say I have a text file from a previous commit with the following content:

file 1:
-----------
hello my first name is john

The file is then edited and pushed, so that it reads:

file 1:
-----------
hello my last name is doe

I want to get only the words that differ. In this example that would be last and doe, written to stdout or to a text file.

What is the simplest way to do it?


Solution

  • Ask git to compare words instead of lines with git diff HASH --word-diff:

    $ git diff HEAD^ --word-diff
    diff --git a/file.txt b/file.txt
    index 244f97f..ad2517e 100644
    --- a/file.txt
    +++ b/file.txt
    @@ -1 +1 @@
    hello my [-first-]{+last+} name is [-john-]{+doe+}
    

    Feed the output to grep/sed/awk to extract the words enclosed in {+ and +}. I did this with grep -oP, where -P enables Perl-style regexes (a GNU grep option) and -o prints only the matched parts instead of the full line:

    $ git diff HEAD^ --word-diff |grep -oP '(?<=(\{\+)).+?(?=(\+\}))'
    last
    doe
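
Where grep -P is not available, the same extraction can be sketched with plain grep -o and sed. The function name added_words_portable is hypothetical, and the sketch assumes the added text contains no } character:

```shell
# Hypothetical helper: reads default --word-diff output on stdin and prints
# the text inside each {+ ... +} marker, without relying on grep -P.
# Assumption: the added text itself contains no "}" character.
added_words_portable() {
  # grep -o prints each {+...+} marker on its own line;
  # sed then strips the leading {+ and the trailing +}.
  grep -o '{+[^}]*+}' | sed -e 's/^{+//' -e 's/+}$//'
}

# Intended use inside a repository:
#   git diff HEAD^ --word-diff | added_words_portable
```

Redirecting the pipeline with > new_words.txt would write the words to a text file instead of stdout.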
    

    Regex breakdown:

    • (?<=(\{\+)) positive look-behind for {+: these symbols are required for a match but are not included in it
    • .+? lazy match of any characters. Without the ?, the match is greedy and includes everything between the very first {+ and the very last +}: last+} name is [-john-]{+doe
    • (?=(\+\})) positive look-ahead for +}, analogous to the look-behind
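
The difference between the lazy and the greedy quantifier can be checked on the sample line alone, without a repository. This is a small sketch that assumes GNU grep built with -P support. The variable names are only for illustration, and the redundant capture groups are dropped from the pattern:

```shell
# Sample line in the default --word-diff format, copied from the output above.
line='hello my [-first-]{+last+} name is [-john-]{+doe+}'

# Lazy quantifier: each match stops at the first +} after its {+.
lazy=$(printf '%s\n' "$line" | grep -oP '(?<=\{\+).+?(?=\+\})')

# Greedy quantifier: the match runs on to the last +} on the line.
greedy=$(printf '%s\n' "$line" | grep -oP '(?<=\{\+).+(?=\+\})')

printf 'lazy:\n%s\n' "$lazy"
printf 'greedy:\n%s\n' "$greedy"
```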

    Note: if the symbols +} are part of your changes, the match stops at the first +}, so that +} and any text following it are missing from the output.
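
One way around the limitation in the note is git's --word-diff=porcelain mode. It prints each added, removed, or unchanged run on its own line, prefixed with +, - or a space, so no {+ +} markers need to be parsed. The sketch below is an alternative to the grep approach, not part of the original answer, and the function name added_words is hypothetical:

```shell
# Hypothetical helper: reads `git diff --word-diff=porcelain` output on stdin
# and prints each added run of text on its own line.
added_words() {
  awk '
    /^diff --git /   { in_hunk = 0 }            # file header starts: skip until next hunk
    /^@@/            { in_hunk = 1; next }      # hunk header: word lines follow
    in_hunk && /^\+/ { print substr($0, 2) }    # added run, leading + stripped
  '
}

# Intended use inside a repository:
#   git diff HEAD^ --word-diff=porcelain | added_words > new_words.txt
```

Tracking the hunk state keeps the +++ b/file.txt header line out of the result, which a plain grep '^+' would include.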