When reading the documentation of git, I seemed to find a contradiction.
In this official tutorial of git, git log -p
is said to show the history of commits together with complete diff info. However, in the documentation of git-log, the -p
option is said to produce a patch file instead of printing output directly. Also, the description "they do not produce the output described above" is confusing, since "described above" is very vague, at least to me.
Other than the section given above, I found only one other place mentioning the -p
option, which matches the description in the tutorial instead of the patch part. Also, when I run git log -p
on my computer, it shows commit history together with diff info, and I don't see any patch files generated. So are the two parts of the documentation contradictory? Or did I misunderstand the process of "generating patch files"? Thank you!
There is a lot of Git documentation that is ... suboptimal, shall we say.
It's important to realize that each Git commit saves a snapshot, rather than changes, as this explains the behavior of Git in several trickier cases. Various Git commands—including both git diff
and git log
—can then extract two snapshots and compare them. The result of comparing an older snapshot to a newer snapshot—or a "left side" to a "right side", because you can reverse it and compare the newer to the older instead—is a diff or patch.
The default method of preparing such a diff / patch is to produce a series of instructions that, if obeyed to the letter, will transform each left-side file into the corresponding right-side file. These instructions have the general form of: Expect this particular context to be viewable in both the left and right side files, and then delete any -
lines from the left-side file and add any +
lines from the right-side file. If the left-side file is from the (singular) parent of some commit, and the right-side file is from the commit itself, that tells you what someone changed in that file.
No doubt you have seen this output, and it probably even makes some sense.
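To make the left-side / right-side idea concrete, here is a minimal sketch in a throwaway repository. The file name, contents, and commit messages are all made up for illustration; only the git commands themselves are real.

```shell
set -eu
# Throwaway repository; file.txt and the messages below are hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "first snapshot"

printf 'one\n2\nthree\n' > file.txt
git commit -q -am "second snapshot"

# Left side: the parent (HEAD~1). Right side: the commit itself (HEAD).
patch=$(git diff HEAD~1 HEAD)
printf '%s\n' "$patch"

# Swapping the two sides reverses the instructions.
reversed=$(git diff HEAD HEAD~1)
printf '%s\n' "$reversed"
```

In the first diff the old line appears with a leading - and the new one with a leading +; in the second, the roles are swapped.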
The documentation you're reading, however, is compiled automatically from multiple input fragments, and the git log
description to which you linked was written to be read after this other description of the default output of git diff-tree
, which includes this particular text:
in-place edit  :100644 100644 bcd1234 0123456 M file0
copy-edit      :100644 100644 abcd123 1234567 C68 file1 file2
rename-edit    :100644 100644 abcd123 1234567 R86 file1 file3
create         :000000 100644 0000000 1234567 A file4
delete         :100644 000000 1234567 0000000 D file5
unmerged       :000000 000000 0000000 0000000 U file6
Of course, git log -p
does not produce that output at all—so the git log
documentation doesn't include this section. But git log -p
does produce the same output as git diff-tree -p
. When the later section of git diff-tree -p
documentation uses the phrase "do[es] not produce the output described above", it's talking about the :100644 ...
stuff.
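If you want to see the two kinds of output side by side, something like the following should do it. This is only a sketch: the repository, the file name file0, and the commit messages are invented for the demonstration.

```shell
set -eu
# Throwaway repository; file0 and the messages are hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one > file0
git add file0
git commit -q -m "add file0"
echo two >> file0
git commit -q -am "edit file0"

# Default (raw) output: the commit ID, then one ":mode mode hash hash status" line per file.
raw=$(git diff-tree HEAD)
printf '%s\n' "$raw"

# With -p, a patch is produced instead of the raw lines.
patch=$(git diff-tree -p HEAD)
printf '%s\n' "$patch"

# git log -p prints the usual log header followed by the same kind of patch.
log=$(git log -p -1 HEAD)
printf '%s\n' "$log"
```

Only the first command prints a line starting with :100644; the other two print a patch.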
Going back to the claim that git log -p
show[s] the history of commits together with complete diff info
—well, this too is false. The problem here is that complete information is too complex for git log -p
. Specifically, the trouble is merge commits, which are defined as any commit that has two or more parent commits.
Every commit saves a snapshot of all of your files. But every commit also records some set of parent, or predecessor, commit hash IDs. Most commits have exactly one parent. In this particular—and very common—case, git log
can run git diff
with the (singular) parent on the left side, and the commit (also singular) on the right side. That way you see what changed between parent and child: what the author of that commit changed in that commit.
But there are commits that have two (or more) parents. These commits are called merge commits; the git merge
command tends to build them. (We cannot say that it always builds them, because—as is very common with Git commands—git merge
can actually do one of several different tasks, depending on the situation and some command-line arguments.) Given this kind of merge commit, git log
does not just pick one parent, and then show you the diff between that one parent's snapshot and the commit's snapshot. It does not pick the two parents and diff them—that's usually not very sensible and would not tell you anything about the merge result—and it does not even attempt to compare all three commits at the same time, at least not by default.
Instead, what git log
does with a two-parent (or more-than-two-parent) merge commit is to show you the log message, and then not bother to show a diff at all. This is actually the most practical thing to do in most cases, which is why it's what git log
does. But that immediately tells us that we're definitely not getting a full picture!
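You can check this behavior yourself with a small experiment. The sketch below assumes a default Git configuration (in particular, nothing that changes how merges are diffed), and the branch names, file names, and messages are all made up.

```shell
set -eu
# Throwaway repository; main, side, and the file names are hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name for this sketch
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"
git checkout -q -b side
echo side > side.txt; git add side.txt; git commit -q -m "side work"
git checkout -q main
echo main > main.txt; git add main.txt; git commit -q -m "main work"
git merge -q -m "merge side" side       # histories diverged, so this makes a real merge commit

# The merge commit records two parents.
git cat-file -p HEAD | grep '^parent '

# git log -p on the merge: header and message only, no diff.
merge_log=$(git log -p -1 HEAD)
printf '%s\n' "$merge_log"

# git log -p on its first parent, an ordinary one-parent commit: header plus diff.
plain_log=$(git log -p -1 HEAD^1)
printf '%s\n' "$plain_log"
```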
Note that with a nice simple linear chain of commits:
A <-B <-C ... <-F <-G <-H <--master
what git log
does is start with the last commit (it has some hash ID, but here I'm just going to call it H
) and show you its authorship and log message, and then extract two snapshots, one from parent G
and one from H
itself, and diff them. Then it moves on—or backwards—to commit G
. Now it shows you G
's author and log message, and then extracts the snapshots for F
(parent of G
) and G
and diffs them. This repeats, with Git moving backwards, commit-by-commit, from child commit to parent. It's just at merges where git log
doesn't bother diff-ing at all.
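The walk just described can be imitated by hand. This is only a rough sketch of the idea, not how git log is actually implemented; the three commits, named F, G, and H to match the drawing, and the file notes.txt are invented.

```shell
set -eu
# Throwaway repository with a three-commit linear chain F <- G <- H.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for name in F G H; do
  echo "$name" >> notes.txt
  git add notes.txt
  git commit -q -m "$name"
done

# Walk backwards from the last commit, child to parent.
order=""
for c in $(git rev-list HEAD); do
  order="$order $(git log -1 --format=%s "$c")"
  # Show the author and log message...
  git --no-pager log -1 --format='%h %an: %s' "$c"
  # ...then diff parent (left side) against the commit (right side), if a parent exists.
  if git rev-parse -q --verify "$c^" >/dev/null; then
    git --no-pager diff "$c^" "$c"
  fi
done
echo "visited:$order"
```

The loop visits H first, then G, then F, which is the same order git log uses for this chain.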
The git show
command is very similar to git log
: it mostly does what git log
does, but only for one commit. That is, if you give git show
the hash ID of commit G
, it will show you G
's author information, and its log message, and a diff from F
to G
, but then just stop there—it won't move on to show F
too. But if you point git show
to a merge commit, it will show a diff, at least sometimes. What it shows is a combined diff, which is described just a bit further on in these manual pages. It's important to note that combined diff still leaves stuff out, on purpose. In particular, pay close attention to the (separate) section of the documentation that mentions that:
combined diff lists only files which were modified from all parents.
This, again, is actually intended to be helpful. Sometimes, it is helpful. The documentation is not very clear about it all, though. In this case it's not clear why git log
shows nothing and git show
produces a combined diff.
What's going on here is that git log
, and git show
, and various other commands can do this sort of special combined-diff thing. But by default, git log
doesn't bother. You can give git log
a -c
or --cc
flag—note that the first one is "one dash, one c" and the second is "two dashes, two c's"—to make git log
produce combined diffs for merges. The git show
command defaults to --cc
behavior.
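Here is one way to compare the flags, sketched in a throwaway repository. The merge is made to conflict on purpose, and is resolved with content that differs from both parents, so that the combined diff has something to show. All names are invented, and -p is spelled out next to -c and --cc only for clarity.

```shell
set -eu
# Throwaway repository; main, side, and f.txt are hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name for this sketch
git config user.name "Example"
git config user.email "example@example.com"

echo base > f.txt; git add f.txt; git commit -q -m "base"
git checkout -q -b side
echo side > f.txt; git commit -q -am "side edit"
git checkout -q main
echo main > f.txt; git commit -q -am "main edit"

# Both sides changed the same line, so the merge stops with a conflict.
git merge side >/dev/null 2>&1 || true
echo resolved > f.txt                   # resolve by hand, differing from both parents
git add f.txt
git commit -q -m "merge side"           # concludes the merge

default_log=$(git log -p -1 HEAD)       # no diff for the merge
c_log=$(git log -c -p -1 HEAD)          # combined diff
cc_log=$(git log --cc -p -1 HEAD)       # dense combined diff
shown=$(git show HEAD)                  # git show uses the --cc form by default

printf '%s\n' "$cc_log"
```

The combined-diff header reads diff --combined with -c and diff --cc with --cc, which makes it easy to tell which form you are looking at.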
Last, note that you can, instead, give git log
and git show
a -m
flag. In this case, the commands will treat merges even more specially: for a merge commit C with two parents P1 and P2, the two commands will, in effect, run:
git diff P1 C
git diff P2 C
after showing you the usual header information (author and log message).
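As a rough check of the -m behavior, the sketch below builds a non-conflicting merge and compares git log -m -p against the two explicit git diff commands. The repository layout and names are made up for the example.

```shell
set -eu
# Throwaway repository; main, side, and the file names are hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # pin the initial branch name for this sketch
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"
git checkout -q -b side
echo side > side.txt; git add side.txt; git commit -q -m "side work"
git checkout -q main
echo main > main.txt; git add main.txt; git commit -q -m "main work"
git merge -q -m "merge side" side

# With -m, the merge is shown once per parent, each time with an ordinary diff.
m_log=$(git log -m -p -1 HEAD)
printf '%s\n' "$m_log"

# The same two comparisons, run by hand: P1 vs C, then P2 vs C.
vs_p1=$(git diff --name-only HEAD^1 HEAD)
vs_p2=$(git diff --name-only HEAD^2 HEAD)
echo "changed relative to P1: $vs_p1"
echo "changed relative to P2: $vs_p2"
```

Relative to the first parent, the merge brings in the file from side; relative to the second parent, it brings in the file from main.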
In all cases, unless you use --graph
, git log
does not give you enough information to reproduce the actual commit graph—which is crucial for understanding git merge
. But that's for another day...