git, visual-studio-2017, tfs-2015, git-commit, git-history

Edit another developer's Git commit message on TFS


I've been asked whether I can modify a set of checked-in commit messages in the Git history of a project. They were written by another developer and contain inappropriate wording.

Can someone show me how to achieve this (or confirm that it is not possible)? I'm an administrator on TFS, so access rights are not an issue. Our TFS is on-premises, using SQL Express, with Git as the source control. The TFS version is 15.117.26714.0.

We use Visual Studio 2017 on our local machines with the Team Explorer plugin.


Solution

  • It can be done, but it will likely be a hassle. Depending on the nature of the audit rule, you might consider whether git notes offer a better solution. (If you are just missing information, and if your auditors will go along with it, adding notes will let you put the additional info in with minimal disruption. If you need to remove information - if someone was putting passwords, or organizationally-unacceptable language, or whatever into the messages - then that won't help.)
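
    If notes turn out to be acceptable, the workflow could look roughly like the sketch below. It runs in a throwaway repository; the identity, file name, and ticket reference are made up for illustration.

```shell
#!/bin/sh
# Sketch: annotate an existing commit with git notes in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Fix login bug"
before=$(git rev-parse HEAD)

# Attach a note; the ticket reference is a made-up example.
git notes add -m "Audit reference: TICKET-123" HEAD
after=$(git rev-parse HEAD)
note=$(git notes show HEAD)

# The note is displayed under the original message; the commit ID is unchanged.
git log -1 --notes
echo "commit ID before: $before"
echo "commit ID after:  $after"

# Notes are stored under refs/notes/commits and are not pushed by default.
# Sharing them would look like this (assumes a remote named origin):
#   git push origin refs/notes/commits
```

    Because the commit itself is untouched, nobody's clone is disturbed; the original message is also still there, which is why this does not help when text has to be removed.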

    You might note that some other source control systems let you edit commit messages more freely. I wouldn't want to choose a source control tool for my team on that basis, but it is a consideration if this is a frequent requirement for your team. Better yet would be to figure out whether anything needs to be adjusted so that this does not become a frequent requirement going forward.

    (To some extent that may be a training issue about what is required and/or prohibited in commit messages. You might be able to set up some kind of guard rail using hooks: a pre-receive hook being the most important if your hosting environment will let you configure one, and a client-side commit-msg hook being a nice convenience for the devs to make sure any mistakes "fail fast". Note that it is the commit-msg hook, not pre-commit, that gets to see the message.)
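
    As a rough illustration of the client-side guard rail, here is a sketch of a commit-msg hook (the hook Git calls with the path of the message file as its first argument). The prohibited terms are placeholders, and the whole thing runs in a throwaway repository; a real rule set would depend on your team's policy.

```shell
#!/bin/sh
# Sketch: install a commit-msg hook in a throwaway repository and try it out.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# Git passes the path of the message file as $1; a non-zero exit aborts
# the commit. "badword" and "password" are placeholder terms.
mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
if grep -q -i -E 'badword|password' "$1"; then
    echo "Commit message contains a prohibited term; please reword it." >&2
    exit 1
fi
EOF
chmod +x .git/hooks/commit-msg

echo hello > file.txt
git add file.txt

# This commit is rejected by the hook ...
if git commit -q -m "temp password is hunter2" 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi
echo "first commit rejected: $rejected"

# ... and this one goes through.
git commit -q -m "Add greeting file"
accepted_subject=$(git log -1 --format=%s)
echo "accepted commit: $accepted_subject"
```

    Client-side hooks are not copied by a clone and can be bypassed, so this only catches honest mistakes early; a server-side check is what actually enforces the rule.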

    The problem is that the commit message is an integral part of a commit. When you modify the message of a commit (let's call it P), you're really creating a new commit (P'). Also, if you have a commit C and you want to change its parent from P to P', then you have to replace it with a new commit C'.

    The difference between modifying commits vs. replacing them with new commits doesn't sound like much, but in practice it means that you're removing the original commits from the histories of published branches, and that's where it becomes a problem.
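
    You can see the "new commit" effect for yourself in a scratch repository. The sketch below (placeholder identity and file names) amends a message and compares commit IDs and tree IDs before and after.

```shell
#!/bin/sh
# Sketch: changing a commit message yields a different commit object.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "Initial commit"
echo two >> file.txt
git commit -q -a -m "Message with a problem"

old=$(git rev-parse HEAD)
old_tree=$(git rev-parse "HEAD^{tree}")

git commit -q --amend -m "Cleaned-up message"

new=$(git rev-parse HEAD)
new_tree=$(git rev-parse "HEAD^{tree}")

echo "old commit: $old"
echo "new commit: $new"

# The content (tree) is identical; only the commit object differs.
# The old commit object still exists until it is garbage-collected.
old_type=$(git cat-file -t "$old")
echo "old object is still a: $old_type"
```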

    To do the clean-up, start with any clone. Suppose we have the following, where commits not named 'x' have bad commit messages.

    x -- x -- x -- A <--(master)(origin/master)
     \
      x -- x - x -- B -- x <--(branch1)(origin/branch1)
       \      /
        x -- C -- x <--(branch2)(origin/branch2)
    

    Fixing A isn't too hard.

    git checkout master
    git commit --amend
    

    will present an editor and you can rewrite the commit message. (Or you can include git commit options such as -m to specify the commit message.) Then you have

                A' <--(master)
               /
    x -- x -- x -- A <--(origin/master)
     \
      x -- x - x -- B -- x <--(branch1)(origin/branch1)
       \      /
        x -- C -- x <--(branch2)(origin/branch2)
    

    Note that A still exists and origin/master still points to it. This is important. It means that when you want to update origin you'll have to say

    git push -f
    

    and this will "break" everyone else's clone. See the git rebase docs (https://git-scm.com/docs/git-rebase) under "Recovering from Upstream Rebase" for more information about the problem and typical clean-up procedure; but also note that if this turns out to rewrite a large part of the repo, you might instead want to follow a cut-over procedure like this:

    • Everyone pushes all of their code; it need not be fully merged but must be in origin, because
    • Everyone discards their clones
    • You do your clean-up work
    • Everyone makes a new clone
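
    If you want to rehearse the amend-and-force-push step before touching the real server, a sketch like the following works against a local bare repository standing in for origin. All paths and names here are throwaway assumptions.

```shell
#!/bin/sh
# Sketch: a rewritten commit needs a forced push; a plain push is refused.
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"     # stand-in for the TFS remote
git clone -q "$work/origin.git" "$work/clone" 2>/dev/null
cd "$work/clone"
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "Message with a problem"
git push -q origin HEAD:refs/heads/master

# Rewrite the message locally; origin still has the old commit.
git commit -q --amend -m "Cleaned-up message"

# A plain push is refused because it is not a fast-forward ...
if git push -q origin HEAD:refs/heads/master 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# ... so the update has to be forced.
git push -q -f origin HEAD:refs/heads/master
remote_subject=$(git --git-dir="$work/origin.git" log -1 --format=%s master)
echo "origin now has: $remote_subject"
```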

    Anyway, you've got one commit cleaned up. On to B. This time it's not a branch tip, so amend isn't the thing to use. Instead do a rebase.

    git rebase -i branch1~2 branch1
    

    You'll get a "todo" list that shows each of the last couple of commits on branch1. Find the entry for B and change the start of the line from pick to reword. Then let the rebase proceed, and when it reaches B it will give you an editor for the commit message. In the end you have

                A' <--(master)
               /
    x -- x -- x -- A <--(origin/master)
     \
      x --- x --- x -- B -- x <--(origin/branch1)
       \         / \
        \       /   B' -- x' <--(branch1)
         \     /
          x - C -- x <--(branch2)(origin/branch2)
    

    Not too much worse than A; you rewrote an x commit due to reparenting, but in the end it's still just one more ref to force-push.
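
    If you would like to try the reword flow without an interactive editor, the sketch below scripts both editing steps in a scratch repository. The two helper scripts are hypothetical stand-ins for what you would do by hand: one flips the first pick to reword, the other supplies the new message.

```shell
#!/bin/sh
# Sketch: reword a non-tip commit with a scripted interactive rebase.
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/branch1  # name the initial branch branch1

for msg in "Earlier work" "B: message with a problem" "Later work"; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done
old_tip=$(git rev-parse branch1)

# Stand-in for editing the todo list: the first entry is B (the oldest
# commit being replayed), so change its "pick" to "reword".
cat > "$work/todo-editor.sh" <<'EOF'
#!/bin/sh
sed '1s/^pick/reword/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF

# Stand-in for typing the replacement message in an editor.
cat > "$work/message-editor.sh" <<'EOF'
#!/bin/sh
echo "B: cleaned-up message" > "$1"
EOF
chmod +x "$work/todo-editor.sh" "$work/message-editor.sh"

GIT_SEQUENCE_EDITOR="$work/todo-editor.sh" GIT_EDITOR="$work/message-editor.sh" \
    git rebase -i branch1~2 branch1

git log --format='%h %s' branch1
new_tip=$(git rev-parse branch1)
```

    The tip commit keeps its message but gets a new ID, which is the reparenting effect described above.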

    Now what about C? Well, C is reachable from multiple refs, and for one of those refs there's a merge in between. Those factors make rebase much harder to use correctly. In this case you might want to use git filter-branch. (If you have any cases that make you use filter-branch, then you might consider just doing a single filter-branch to do all of your rewriting, instead of messing with individual rebase or amend operations.)

    The trouble with this is how to write the msg-filter. You could write a script that checks the commit ID, for each known "bad" commit outputs the corresponding new commit message, and for everything else just cats back the original message. Or you could just fire up an editor for every commit, if there aren't too many commits for that to be practical. Or something in between (fire up an editor for every commit whose ID is on a certain list). There are too many approaches, and too many reasons that would influence which approach to use, to fully detail in this answer.
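
    For the first approach (a script keyed on commit ID), a minimal sketch might look like this. filter-branch feeds the original message to the filter on stdin and exposes the original commit ID as GIT_COMMIT; whatever the filter prints becomes the new message. The repository, the script location, and the replacement text are all assumptions for the demo.

```shell
#!/bin/sh
# Sketch: a msg-filter that replaces the message of one known commit.
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

for n in 1 2 3; do
    echo "$n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

# Pretend the middle commit carries the unacceptable message.
bad=$(git rev-parse HEAD~1)

# Hypothetical filter script: new text for the listed commit ID,
# original message passed through unchanged for everything else.
cat > "$work/my-filter-script" <<EOF
#!/bin/sh
case "\$GIT_COMMIT" in
    $bad) echo "Commit 2 (message cleaned up)" ;;
    *)    cat ;;
esac
EOF
chmod +x "$work/my-filter-script"

# FILTER_BRANCH_SQUELCH_WARNING skips the advisory pause in newer Git versions.
FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch --msg-filter "$work/my-filter-script" -- --all

git log --format='%h %s'
subjects=$(git log --format=%s | tr '\n' '|')
```

    With more than a handful of commits you would add one case branch per commit ID, or have the script look the ID up in a file of replacements. filter-branch keeps the pre-rewrite refs under refs/original/, which is handy for checking the result before you force-push.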

    Assuming you sort out a filter script, you would do something like

    git filter-branch --msg-filter my-filter-script -- --all
    

    The full diagram for this result gets messy, but basically for the local branches C is replaced with C', everything that can reach C is similarly replaced, and each branch from which C is reachable needs to be force pushed.