
How do I commit only some of the files?


As is said in

git - How do I commit only some files? - Stack Overflow

we can use

 git commit [--only] a b c -m "only part of files"

However, in the following example:

$ mkdir t
$ cd t
$ git init
Initialized empty Git repository in /mnt/c/test/git-test/t/.git/
$ touch a b
$ git add .
$ git commit a -m a
[master (root-commit) c7939f9] a
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a
$ git commit b -m b
[master cf4514a] b
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 b
$ git status
On branch master
nothing to commit, working tree clean
$ ls
a  b

I tried to put only the file b into the second commit, but failed: both a and b are in the working tree and the working tree is clean, which implies that both files are committed.

So how do I truly commit only some of the files?

Even running git add on a single file doesn't work:

$ mkdir t
$ cd t
$ git init
Initialized empty Git repository in /mnt/c/test/git-test/t/.git/
$ touch a b
$ git add a
$ git commit --only a -m "a"
[master (root-commit) 04383c9] a
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a
$ git rm --cached -r .
rm 'a'
$ git add b
$ git commit --only b -m "b"
[master d518916] b
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 b
$ git checkout -f head~
Note: switching to 'head~'.

...
HEAD is now at 04383c9 a
$ ls
a
$ git checkout -f master
Previous HEAD position was 04383c9 a
Switched to branch 'master'
$ ls
a  b

File a is still in the second commit.

Background: Say I have a folder with a great many files, and I want to commit file set A in the first commit (i.e., the first commit contains only file set A), set B in the second commit, and so on. Why do I want this? Just out of curiosity.


Solution

  • To add to Mark Adelsberger's answer and address your comment here:

    "I used to take commit as a snapshot but not diff content. So I expect the commit command just takes a snapshot of the index and stores it."

    This is correct. However, when you use git commit --only, the way Git achieves this is complicated. (It's also not well documented.)

    I normally talk about "the" index / staging-area / cache. Git does have one particular distinguished index, "the" index, although it is actually per-work-tree: if you run git worktree add, you not only get a new work-tree, but also a new index (and new HEAD, and other work-tree-specific refs such as those for git bisect). But Git is capable of working with additional temporary index files, and this is what git commit --only and git commit --include do.

    Let's look at your setup again:

    $ mkdir t
    $ cd t
    $ git init
    Initialized empty Git repository in /mnt/c/test/git-test/t/.git/
    $ touch a b
    $ git add .
    

    At this point, "the" index (the main one in .git/index) contains two files. Here they are:

    $ git ls-files --stage
    100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       a
    100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       b
    

    Now, however, you run git commit a -m a, creating the initial commit (a root commit, with no parents). This command—git commit --only a, more or less—works by:

    1. creating a new temporary index, .git/index<digits>;
    2. initializing that index from the current commit;[1]
    3. running the equivalent of GIT_INDEX_FILE=.git/index<digits> git add a;
    4. running the equivalent of cp .git/index .git/index.<moredigits> to create a second temporary index;
    5. running the equivalent of GIT_INDEX_FILE=.git/index.<moredigits> git add a;[2]
    6. building a commit from the first temporary index, in the way git commit normally builds a commit from the main index;[3] and
    7. finishing off the commit by renaming the second temporary index to .git/index, so that it becomes the primary index.

    What this does is:

    • Create and use a temporary index for the commit that contains the HEAD commit plus the --only files. The main index is undisturbed in case the new commit fails (though in your case it succeeds).
    • Create and set up a second temporary index to be used if the commit succeeds.
    • Attempt the commit using the first temporary index.

    If the commit succeeds, the first temporary index is discarded and the second temporary index becomes the main index (via a rename operation so that it's all atomic). If the commit fails, both temporary index files are removed.
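As a rough illustration of the first temporary index, here is a sketch that imitates git commit --only b by hand with plumbing commands. It is not what Git runs internally: the temporary index name index.demo, the throwaway repository, and the placeholder identity are all assumptions made for the demo, and the second temporary index (the one that later becomes the main index) is left out.

```shell
#!/bin/sh
# Sketch: imitate what `git commit --only b` records, using a throwaway index.
set -e

# Placeholder identity so the script runs without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
touch a b
git add .
git commit -q -m a a               # first commit: contains only a

# Temporary index seeded from HEAD, then the --only file is added to it.
tmp_index="$repo/.git/index.demo"  # hypothetical name, not the one Git uses
GIT_INDEX_FILE="$tmp_index" git read-tree HEAD
GIT_INDEX_FILE="$tmp_index" git add b

# Build the commit from the temporary index rather than from .git/index.
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
commit=$(git commit-tree "$tree" -p HEAD -m b)
git update-ref HEAD "$commit"
rm -f "$tmp_index"

# The new commit's snapshot holds a (from HEAD) and b (the --only file).
files=$(git ls-tree --name-only HEAD | xargs)
echo "files in HEAD: $files"
```

Because the temporary index starts out as a copy of HEAD, file a is carried into the new snapshot even though only b was named.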

    This means that after a successful git commit --only, the main index is updated as if you had run git add on the --only files. After a failed one—the commit can fail due to pre-commit hooks, or you erasing the commit message, for instance—all is as if you had never run git commit --only at all.

    (In your case, since you didn't modify the file a before running git commit --only a, you can't tell some of these cases apart.)
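To make those cases distinguishable, here is a small sketch in which both files are modified before the partial commit. The file contents (v1, v2), the throwaway repository, and the placeholder identity are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: observe the main index after a successful `git commit --only a`.
set -e

# Placeholder identity so the script runs without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
echo v1 > a
echo v1 > b
git add a b
git commit -q -m initial            # ordinary commit: a=v1, b=v1

echo v2 > a                         # modified, deliberately NOT staged
echo v2 > b
git add b                           # modified and staged in the main index

git commit -q --only a -m "only a"

committed_a=$(git show HEAD:a)      # content of a in the new commit
committed_b=$(git show HEAD:b)      # content of b in the new commit
staged_a=$(git show :a)             # content of a in the main index
staged_b=$(git show :b)             # content of b in the main index
echo "commit: a=$committed_a b=$committed_b"
echo "index:  a=$staged_a b=$staged_b"
```

The commit picks up the work-tree version of a and leaves b as it was in HEAD, while the main index ends up holding the new a (as if git add a had been run) and still holds the staged but uncommitted change to b.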

    When you went on to run git commit --only b, these steps repeated but with file b instead of file a.
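For contrast, here is a sketch of one way to get a second commit whose snapshot holds only b: drop a from the main index and run a plain git commit with no path arguments, so that Git snapshots the index exactly as it stands. This is an illustration under the same assumptions as your setup (empty files a and b, a throwaway repository, a placeholder identity), not the only way to do it.

```shell
#!/bin/sh
# Sketch: commits whose snapshots contain only the files currently in the index.
set -e

# Placeholder identity so the script runs without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
touch a b

git add a
git commit -q -m a             # index holds only a, so the snapshot is only a

git rm -q --cached a           # remove a from the index; the work-tree file stays
git add b
git commit -q -m b             # no path arguments: snapshot the index as-is

first=$(git ls-tree --name-only HEAD~ | xargs)
second=$(git ls-tree --name-only HEAD | xargs)
echo "first commit:  $first"
echo "second commit: $second"
```

Note that the second commit records a as deleted relative to the first, and that a is left behind in the work-tree as an untracked file.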


    [1] There is no current commit, as you haven't created any yet, so this is treated as a special case: Git creates this as an empty index.

    [2] This git add winds up having no effect, since the file named a is still empty. Had you modified the file named a in your work-tree at this point, though, it would have updated the second temporary index.

    [3] Since Git is not using the file .git/index to build this new commit, any pre-commit hook that assumes that the index is named .git/index will do the wrong thing. Note that with added work-trees, the main index for that added work-tree has a different name as well (.git/worktrees/<name>/index, if I remember correctly offhand).
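To see what a hook is actually handed, here is a sketch of a pre-commit hook that logs the GIT_INDEX_FILE environment variable Git sets when it runs the hook. The log file, the throwaway repository, and the placeholder identity are assumptions for the demo, and the exact path printed may differ between Git versions, so check it on your own installation.

```shell
#!/bin/sh
# Sketch: log which index file a pre-commit hook is told to use.
set -e

# Placeholder identity so the script runs without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
log=$(mktemp)                  # hypothetical log location for the demo
cd "$repo"
git init -q
touch a b
git add .

# Install a hook that appends the value of GIT_INDEX_FILE to the log.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<EOF
#!/bin/sh
printf '%s\n' "\${GIT_INDEX_FILE:-unset}" >> "$log"
EOF
chmod +x .git/hooks/pre-commit

git commit -q -m a a           # partial commit, as in the question

seen=$(tail -n 1 "$log")
echo "index file seen by hook: $seen"
```

A hook that inspects staged content through Git commands such as git diff --cached or git ls-files --stage picks up this variable automatically, which avoids hard-coding .git/index.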