
How do I identify forced pushes in Git?


While looking through my team's commit log, I found that code changes from the same patch appear twice, in different commits.

Further proof: the master branch has commit A in its history (as seen in "git log"). If I check out commit A as HEAD, I can find commit B in its history. However, if I check out master as HEAD, I cannot find commit B in its history.

I believe a non-fast-forward push happened at some point. Is there an easy way to identify such a push using Git commands or other tools?


Solution

  • If you're using GitHub (and possibly other hosts), a forced push on a pull request is visible in the PR history, though I'm not sure how long it is retained.

    In the GitHub GraphQL API (v4), forced pushes should be available on a PullRequest's timeline as BaseRefForcePushedEvent and/or HeadRefForcePushedEvent items.


    You may be able to detect a forced push if you have access to the bare remote repository, but it's complicated. I'll illustrate below.

    Normally, pushes build on top of existing commits. Suppose your local repo is a few commits ahead of the remote, like so...

    origin
    
    A - B - C [master]
    
    local
    
    A - B - C [origin/master]
             \
              D - E [master]
    

    If you git push, D and E will be sent up, and the local and remote will have the same history for master.

    origin
    
    A - B - C - D - E [master]
    
    local
    
    A - B - C - D - E [master][origin/master]
    

    But if local does some rebasing, say rewriting B, the two histories have "diverged".

    origin
    
    A - B - C - D - E [master]
    
    local
    
    A - B - C - D - E [origin/master]
     \
      B1 - C1 - D1 - E1 [master]
    

    Now if you try to push, it will be rejected: git push will not merge for you; it will only fast-forward. This is typically solved with git push --force, which simply replaces the existing history.

    origin
    
    A - B - C - D - E
     \
      B1 - C1 - D1 - E1 [master]
    
    local
    
    A - B - C - D - E
     \
      B1 - C1 - D1 - E1 [origin/master][master]
    

    Note that the old commits are still there, both in your local repo and on origin. They will remain there until garbage collection prunes them, which normally takes weeks.
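
Whether a branch update was a fast-forward comes down to an ancestry check: the old tip must still be in the history of the new tip. The following is a minimal sketch of that check using git merge-base --is-ancestor. The helper name classify_update, the demo identity, and the throwaway repository with commits A, B, C and B1 are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: decide whether moving a branch from OLD to NEW was a fast-forward.
# classify_update is a hypothetical helper name, not a Git command.
classify_update() {
    # A fast-forward keeps OLD in the history of NEW.
    if git merge-base --is-ancestor "$1" "$2"; then
        echo fast-forward
    else
        echo non-fast-forward
    fi
}

# Throwaway repository with made-up commits A, B, C and a rewritten B1.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
old_tip=$(git rev-parse HEAD)
git commit -q --allow-empty -m C
ff_tip=$(git rev-parse HEAD)

# Rewrite history: drop B and C, then commit B1 on top of A.
git reset -q --hard HEAD~2
git commit -q --allow-empty -m B1
rewritten_tip=$(git rev-parse HEAD)

ff_result=$(classify_update "$old_tip" "$ff_tip")
forced_result=$(classify_update "$old_tip" "$rewritten_tip")
echo "B -> C:  $ff_result"
echo "B -> B1: $forced_result"
```

If you know (or can recover) the old and new tips of a branch, the same check tells you whether the update between them had to be forced.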

    On local, you can see the old commits with git reflog. The remote repository may have a reflog, and if it does, you're in luck. But it may not, since bare repositories do not keep one by default. Either way, the old commits are still there.
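
If the remote does keep a reflog, every push to a branch is recorded there, and a forced push shows up as a step whose old value is not an ancestor of the new one. Here is a rough sketch of scanning for such steps. It assumes core.logAllRefUpdates was enabled on the bare repository before the push happened; scan_reflog is a hypothetical helper name, and the repositories are throwaway demos.

```shell
#!/bin/sh
# Sketch: flag non-fast-forward transitions recorded in a ref's reflog.
# scan_reflog is a hypothetical helper; run it inside the repository to inspect.
scan_reflog() {
    newer=
    # Entries are listed newest first; each line is the value the ref moved to.
    git reflog show --format=%H "$1" | while read -r older; do
        if [ -n "$newer" ] && ! git merge-base --is-ancestor "$older" "$newer"; then
            echo "forced: $older -> $newer"
        fi
        newer=$older
    done
}

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
git init -q --bare "$root/origin.git"
# Bare repositories keep no reflog unless this is enabled beforehand.
git --git-dir="$root/origin.git" config core.logAllRefUpdates true

git init -q "$root/work"
cd "$root/work" || exit 1
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git push -q "$root/origin.git" master
git commit -q --amend --allow-empty -m B1   # rewrite B
git push -q --force "$root/origin.git" master

# Inspect the server-side reflog from inside the bare repository.
cd "$root/origin.git" || exit 1
report=$(scan_reflog refs/heads/master)
forced_count=$(printf '%s\n' "$report" | grep -c '^forced:')
echo "$report"
```

The reflog only covers updates made after it was switched on, so this does not help with a push that happened earlier.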

    You can see them in the objects directory. Commit 53715b45364b160ce6e36d934be903459aebe254 is in the file objects/53/715b45364b160ce6e36d934be903459aebe254... though it may be packed for efficiency.

    So the commit is likely still there, but how can you find evidence of rebasing? One trick is to run garbage collection and see what's left. git gc will only pack referenced commits, and it will keep unreferenced objects younger than two weeks as loose files (this can be configured with the various "prune" config options). Any commits still sitting loose in objects afterwards are likely to be the old, unreferenced commits.

    $ find objects -type f
    objects/3b/d1f0e29744a1f32b08d5650e62e2e62afb177c
    objects/5f/941d2600a72d0126689ee1d51225cb7d3c0a05
    objects/33/f273a8ea4378453f297dc9a69e81a374534361
    objects/02/5f398bbc061f1fa0b61f8454f5d134f43a8a4c
    objects/d7/3afbb06ed2245a0c82ab147ea28eabeb41196e
    objects/b4/036885119a0c0a726d4a07c39880f5d4e5d688
    objects/7d/311b920e2b6ba8ac14198c1a30fe41af8e34fe
    objects/42/e2ac33f56ca2b7f51c87db4baeb66980525c40
    objects/19/6803e246079172e91973f583672290988e89c9
    objects/81/7b794bceb50878e569b18a336e52181c3bf45c
    objects/86/e041dad66a19b9518b83b78865015f62662f75
    objects/9a/095aeaea32de8059face7ff2ccea63a2337dbb
    objects/09/1dddfa09138dc26cbf0dec4603a4aa95001d1c
    objects/53/715b45364b160ce6e36d934be903459aebe254
    objects/0a/f001a378ba6af70af76c10466b90c8f96e8403
    objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    objects/84/d0a65b5b8b4d9376f08e3c4518057b16ee6082
    objects/25/7cc5642cb1a054f08cc83f2d943e56fd3ebe99
    

    The old commits are 53715b45364b160ce6e36d934be903459aebe254, b4036885119a0c0a726d4a07c39880f5d4e5d688, and 025f398bbc061f1fa0b61f8454f5d134f43a8a4c. After running git gc...

    $ git gc
    Enumerating objects: 15, done.
    Counting objects: 100% (15/15), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (5/5), done.
    Writing objects: 100% (15/15), done.
    Building bitmaps: 100% (5/5), done.
    Total 15 (delta 0), reused 0 (delta 0)
    Computing commit graph generation numbers: 100% (5/5), done.
    
    $ find objects -type f
    objects/02/5f398bbc061f1fa0b61f8454f5d134f43a8a4c
    objects/b4/036885119a0c0a726d4a07c39880f5d4e5d688
    objects/pack/pack-57c98c590b41d20da1b6892ce4a18ab735272583.pack
    objects/pack/pack-57c98c590b41d20da1b6892ce4a18ab735272583.bitmap
    objects/pack/pack-57c98c590b41d20da1b6892ce4a18ab735272583.idx
    objects/info/commit-graph
    objects/info/packs
    objects/53/715b45364b160ce6e36d934be903459aebe254
    

    The unreferenced commits were left unpacked.

    I don't know how foolproof this technique is, but it makes it possible to find evidence of a recent force push.
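
Instead of reading the objects directory by hand, you can ask Git to list commits that no branch or tag reaches. Below is a minimal sketch using git fsck --unreachable --no-reflogs on a throwaway bare repository that has just received a forced push. The repository layout and demo identity are assumptions for illustration, and an unreachable commit is only a hint: deleted branches and abandoned work produce them too.

```shell
#!/bin/sh
# Sketch: list unreachable commits in a bare repository after a forced push.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
git init -q --bare "$root/origin.git"

git init -q "$root/work"
cd "$root/work" || exit 1
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
old_tip=$(git rev-parse HEAD)
git push -q "$root/origin.git" master
git commit -q --amend --allow-empty -m B1   # rewrite B
git push -q --force "$root/origin.git" master

cd "$root/origin.git" || exit 1
# --no-reflogs ignores reflog entries, so commits kept alive only by a
# reflog are reported as well.
orphans=$(git fsck --unreachable --no-reflogs 2>/dev/null |
    awk '$1 == "unreachable" && $2 == "commit" { print $3 }')
echo "replaced tip:        $old_tip"
echo "unreachable commits: $orphans"
```

Each reported hash can then be inspected with git show or git log to see what the branch looked like before it was rewritten.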


    The best option to solve this problem is to not have it in the first place.

    If you disallow direct commits to master, requiring all commits to be done via PR, this ensures you have a record of everything that goes into master, that all additions to master are done with clean merges, and that nobody (except admins) can change master's history. In concert with automated checks and tests this ensures master is always in good shape and the team has a solid base to work from.
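
If you host the bare repository yourself rather than on a service with branch protection, Git has a server-side setting that refuses history rewrites outright. A small sketch, assuming a self-hosted bare repository (the paths and commits here are throwaway demos): with receive.denyNonFastForwards enabled, even git push --force is rejected.

```shell
#!/bin/sh
# Sketch: make a bare repository refuse non-fast-forward pushes.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
git init -q --bare "$root/origin.git"
# Server-side setting: reject any update that is not a fast-forward.
git --git-dir="$root/origin.git" config receive.denyNonFastForwards true

git init -q "$root/work"
cd "$root/work" || exit 1
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
old_tip=$(git rev-parse HEAD)
git push -q "$root/origin.git" master

# Rewrite B and try to force it through.
git commit -q --amend --allow-empty -m B1
if git push -q --force "$root/origin.git" master 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
remote_tip=$(git --git-dir="$root/origin.git" rev-parse refs/heads/master)
echo "forced push was $push_result; remote master is still $remote_tip"
```

The setting applies to every branch in that repository, so it suits shared mainline repositories better than ones where people routinely rebase their own topic branches.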

    In addition, a forced push after a rebase is normal, but a plain git push --force remains dangerous: it overwrites whatever is on the remote, including commits someone else pushed in the meantime. Instead, use git push --force-with-lease to push after a rebase. It checks that the remote branch is still where your remote-tracking branch says it is and refuses the push otherwise, which makes rebasing safer. I have it aliased to git repush.
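
To see the lease in action, here is a small sketch with two throwaway clones. The names alice and bob, the commits, and the repository-local alias are assumptions for illustration; for everyday use you would more likely define the alias with git config --global.

```shell
#!/bin/sh
# Sketch: --force-with-lease refuses to overwrite commits you have not seen.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
git init -q --bare "$root/origin.git"
git --git-dir="$root/origin.git" symbolic-ref HEAD refs/heads/master

# Alice seeds the shared repository with A and B.
git clone -q "$root/origin.git" "$root/alice" 2>/dev/null
cd "$root/alice" || exit 1
git symbolic-ref HEAD refs/heads/master
git config alias.repush 'push --force-with-lease'   # repository-local alias
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git push -q origin master

# Bob pushes C while Alice is not looking.
git clone -q "$root/origin.git" "$root/bob"
cd "$root/bob" || exit 1
git commit -q --allow-empty -m C
git push -q origin master
bob_tip=$(git rev-parse HEAD)

# Alice rewrites B without fetching; her origin/master still points at B.
cd "$root/alice" || exit 1
git commit -q --amend --allow-empty -m B1
if git repush -q origin master 2>/dev/null; then
    lease_result=accepted
else
    lease_result=rejected
fi
remote_tip=$(git --git-dir="$root/origin.git" rev-parse refs/heads/master)
echo "repush was $lease_result; remote master is still at Bob's commit $remote_tip"
```

A plain git push --force in the same situation would have silently discarded Bob's commit C. Note that the lease compares against your remote-tracking branch, so running git fetch right before the push updates the lease and removes that protection.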