Is there a way to compare a file over all branches?


I know I can use git diff to check for the difference of a certain file across two specified branches.
Can it be done for all branches in a single command?
Would it also include the deleted branches?


Solution

  • Can it be done for all branches in a single command?

    No.

    Would it also include the deleted branches?

    No.

    It's important to realize that in one sense, Git doesn't really have branches. (In several other senses, though, it does.) What Git has are branch names. They're not nearly as important as people might think at first, though they are certainly important.

    What Git is really about is commits. It's not about files, although commits contain files; and it's not about branch names, although branch names identify specific commits. Git is all about commits. When you run git diff br1 br2, you are telling Git to run a diff on two specific commits.

    Every commit stores a full snapshot of all the files that Git knows about—or rather, knew about, at the time you, or whoever, made that commit. That's the main data for each commit. Each commit also stores some metadata, or information about the commit itself, such as who—name and email address—made the commit, when (date and time stamp), and why (their log message for the commit).

    Commits are numbered, but the numbers are random-looking hash IDs, rather than simple counting numbers. The hash IDs are actually entirely non-random, as they are cryptographic checksums of the data-and-metadata inside the commit. Git finds the commits by their hash IDs: there is a big object database inside the repository, with objects being numbered by these hash IDs. (Commits are one of four object types within this object database.)
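    The checksum claim can be checked directly. The following is a minimal sketch, assuming a throwaway repository created just for the demonstration (the file name, author identity, and commit message are made up): it asks the object database for the types of a few objects, then recomputes the commit's hash ID from the commit's own bytes.

```shell
# Scratch repository so the sketch does not touch any real project.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch explicitly
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

commit_id=$(git rev-parse HEAD)

# Three of the four object types: the commit, its snapshot, one stored file.
commit_type=$(git cat-file -t "$commit_id")
tree_type=$(git cat-file -t "$commit_id^{tree}")
blob_type=$(git cat-file -t "$commit_id:file.txt")
echo "$commit_type $tree_type $blob_type"

# Show the commit itself: a tree line (the snapshot) plus the metadata.
git cat-file -p "$commit_id"

# Recompute the hash ID from the commit's raw content; nothing is written.
recomputed=$(git cat-file commit "$commit_id" | git hash-object -t commit --stdin)
[ "$recomputed" = "$commit_id" ] && echo "hash ID is a checksum of the commit"
```

    Changing any byte of the commit (the message, the date, the snapshot it refers to) would yield a different hash ID, which is why an existing commit cannot be altered in place.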

    Because the hash IDs are checksums, no part of any commit can ever be changed. You mostly add new commits to a repository. (Commits can be forgotten, but only under particular conditions.)

    Each commit stores, as part of its metadata, the hash ID—or sometimes hash IDs, plural—of its immediate predecessor commits. In this way, Git can start with the last commit (of some branch) and work backwards, one commit at a time:

    ... <-F <-G <-H
    

    Here H stands in for the hash ID of the last commit of some chain of commits. Inside commit H, which Git can read out of the object database, there is the hash ID of earlier commit G. That lets Git find G in the object database; inside that object there is the hash ID of earlier commit F. That lets Git find F, which has another hash ID, and so on. This is the history, in the Git repository.
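    The backwards walk can be imitated by hand. This is only a sketch, assuming a scratch repository with three commits whose messages are F, G and H, and assuming a linear chain (a merge commit would list several parents, which this loop does not handle). In practice git log does this walk for you.

```shell
# Scratch repository with a three-commit chain F <- G <- H.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
for msg in F G H; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# Start at the last commit and follow the stored parent hash IDs backwards.
commit=$(git rev-parse HEAD)
history=""
while [ -n "$commit" ]; do
    subject=$(git log -1 --format=%s "$commit")
    echo "$commit $subject"
    history="$history$subject"
    # %P is the parent hash ID stored in this commit's metadata; it is
    # empty for the very first commit, which ends the loop.
    commit=$(git log -1 --format=%P "$commit")
done
echo "visited: $history"
```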

    But this leaves Git with a problem. How will it quickly and easily find the hash ID of the last commit? For instance, in the above, where will Git find hash ID H?

    A branch name solves this problem. Each name holds one (1) hash ID, which is, by definition, the hash ID of the last commit in the chain. So if master holds the hash ID of H, we have:

    ...--F--G--H   <-- master
    

    If there is another branch name, that other branch name holds the hash ID of some commit—maybe H, maybe G or F, or maybe some commit after H. Perhaps develop holds the hash ID of some commit I, whose parent is G:

    ...--F--G--H   <-- master
             \
              I   <-- develop
    

    So now the name master and the name develop each pick out one specific commit.

    You can run git diff and give it raw hash IDs; or you can run it and give it branch names. When you give it branch names, Git just looks up the names and finds the hash IDs, then runs the diff on the two hash IDs.
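    To see that a name is just a lookup, here is a small sketch, assuming a scratch repository shaped like the drawing above (commit G, then H on master and I on develop; the file contents are invented). It resolves both names with git rev-parse and runs the diff both ways.

```shell
# Scratch repository: G is shared, H is on master, I is on develop.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
echo "base" > file.txt
git add file.txt
git commit -q -m "G"
git branch develop                           # develop points at G for now
echo "master change" > file.txt
git commit -q -a -m "H"
git checkout -q develop
echo "develop change" > file.txt
git commit -q -a -m "I"

# Each branch name resolves to exactly one commit hash ID.
master_id=$(git rev-parse master)
develop_id=$(git rev-parse develop)
echo "master  -> $master_id"
echo "develop -> $develop_id"

# Diffing by name and diffing by raw hash ID give the same result.
by_name=$(git diff master develop)
by_hash=$(git diff "$master_id" "$develop_id")
[ "$by_name" = "$by_hash" ] && echo "same diff either way"
```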

    So:

    Is there a way to compare a file over all branches?

    Yes: enumerate all the interesting commits, and then run git diff in whatever way you like. For instance:

    git for-each-ref --format='%(refname:short)' refs/heads
    

    will print (to standard output) each branch name, in the short form (master, develop, etc., rather than as their full names, refs/heads/master, refs/heads/develop, etc.).
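    As an illustration, assuming a scratch repository with three made-up branch names (master, develop and feature/x), the command lists them sorted by name, which is the default order for for-each-ref:

```shell
# Scratch repository with three branches that all point at one commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git branch develop
git branch feature/x

# One short branch name per line: develop, feature/x, master.
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "$branches"
```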

    To compare the snapshot of a particular file in commit C1 to that in some commit C2, you would use:

    git diff C1 C2 -- path/to/file
    

    The pathspec argument after the -- limits the diff to just that one file. (The -- itself is optional here; it's generally a good idea to use it out of habit, to avoid ambiguity when you get into more complicated git diff usage.)
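    The effect of the pathspec is easy to see with --name-only, which lists the changed paths instead of printing the full diff. A sketch, assuming a scratch repository in which develop changes two invented files, src/a.txt and src/b.txt:

```shell
# Scratch repository: develop changes two files relative to master.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
mkdir src
echo "a" > src/a.txt
echo "b" > src/b.txt
git add src
git commit -q -m "base"
git checkout -q -b develop
echo "a changed" > src/a.txt
echo "b changed" > src/b.txt
git commit -q -a -m "change both files"

# Without a pathspec, every changed file is reported.
all_changed=$(git diff --name-only master develop)
echo "$all_changed"

# With a pathspec after --, only that one file is considered.
only_a=$(git diff --name-only master develop -- src/a.txt)
echo "$only_a"
```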

    If you want to compare the snapshot in (say) the commit identified by the name master to the snapshot in the commit identified by the name develop:

    git diff master develop -- path/to/file
    

    will do the job. So if you want to compare, one at a time, the commit in master to the commit in each branch:

    git for-each-ref ... |
        while read -r branch; do
            # compare master's tip with this branch's tip, for one file only
            git diff master "$branch" -- path/to/file
        done
    

    would do the trick, for instance. Fill in the for-each-ref as seen above.

    Note that for-each-ref will print master too, so one of the runs is git diff master master -- path/to/file, which literally compares the tip commit of master to itself. The file will match, which means git diff will print nothing, but this is slightly wasteful. If you don't like the wastefulness, add code to test whether $branch is master and, if so, to skip the git diff step (but note that this test itself adds a bit of compute work, which is slightly wasteful for every other name: TANSTAAFL, or There Ain't No Such Thing As A Free Lunch).
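    Putting the pieces together, one possible version of the whole loop is sketched below. It assumes a scratch repository with three invented branches (master, develop, and same, where same has the identical file), skips master itself, and uses git diff --quiet, which prints nothing and exits with status 1 when there is a difference, to list only the branches whose copy of the file differs. Drop --quiet and the surrounding if to see the actual diffs instead.

```shell
# Scratch repository: develop changes the file, "same" leaves it alone.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false
echo "base" > file.txt
git add file.txt
git commit -q -m "base"
git branch same
git checkout -q -b develop
echo "develop change" > file.txt
git commit -q -a -m "change on develop"
git checkout -q master

file=file.txt                                # the path to compare everywhere

# Collect the names of branches whose version of $file differs from master's.
differing=$(
    git for-each-ref --format='%(refname:short)' refs/heads |
    while read -r branch; do
        # Comparing master with itself would never show anything.
        [ "$branch" = master ] && continue
        if ! git diff --quiet master "$branch" -- "$file"; then
            echo "$branch"
        fi
    done
)
echo "branches where $file differs from master: $differing"
```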