Use the same commit abbreviation length for git blame as other commands


The git blame command shows commit hashes abbreviated to a length that is one character longer than other commands. For example:

$ git log --oneline       
9fb6f706 (HEAD -> master) second commit
0af747a8 first commit
$ git blame foo   
9fb6f7064 (gilles 2020-11-15 12:28:09 +0100 1) revised
^0af747a8 (gilles 2020-11-15 12:27:41 +0100 2) world

I frequently copy-paste an abbreviated hash from the blame output and search for it in logs or in the set of commits in an interactive rebase. But because the abbreviation is one character longer in the git blame output, I have to remember to delete the last character, otherwise the search can't find anything.

For scripting I'd use unabbreviated hashes and porcelain formats. But for interactive use, I want to use abbreviated hashes.

Setting the core.abbrev option doesn't help: git blame adds one to that. Setting core.abbrev and calling blame --abbrev with a value that's one less works, but is not a good solution because I lose the benefit of git's heuristics to determine a good length for short commit ids, and I have to pass this option explicitly or use a different command name as an alias.

How can I make a plain git blame use the same length for abbreviated commit ids as other git commands?


Solution

  • I ended up writing a custom Git command for this. Save the code below in an executable file called git-blame-1 somewhere on your $PATH. You can then run git blame-1 instead of git blame, and the abbreviated commit IDs will have the same length as with other commands such as git log and git rebase.

    #!/usr/bin/env bash
    
    set -o pipefail
    
    # The blame command adds one hex digit to abbreviated commit IDs to ensure that
    # boundary commits are unambiguous (boundary commits have leading ^ and then
    # just enough hex digits to be unambiguous, while other commits have an
    # extra hex digit so that they're the same length as boundary commits).
    # This alias removes the extra digit, so the abbreviated IDs are the same
    # length as other commands such as log and rebase -i. With low probability,
    # this can make the abbreviation of boundary commit IDs ambiguous.
    
    # Implementation note: we use a sed filter to detect lines starting with an
    # optional escape sequence for coloring, then an optional ^, then hex digits
    # and a space. The sed filter here removes the last hex digit.
    
    # Note that $sed_filter does not contain any single quote, so it can be
    # included in a shell snippet as "...'$sed_filter'...".
    esc=$(printf '\033')  # literal ESC character, which starts a color escape sequence
    sed_filter="s/^\(\(${esc}\[[0-9;]*m\)*\^*[0-9a-f]*\)[0-9a-f] /\1 /"
    
    if [ -t 1 ]; then
      GIT_PAGER="sed '$sed_filter' | ${GIT_PAGER:-${PAGER:-less}}" git blame "$@"
    else
      git blame "$@" | sed "$sed_filter"
    fi
    

    The script takes care of preserving paging when running on a terminal. This has to be a separate script rather than a Git alias: an alias that runs a shell command (one starting with !) is executed from the top-level directory of the repository, so file paths given relative to a subdirectory would not resolve correctly.

    I also define an alias to save typing. In the [alias] section of my .gitconfig:

    b1 = blame-1     # blame with standard abbreviated commit IDs
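
To see what the filter does without needing a repository, the sketch below feeds it the two blame lines from the question through printf. It is only an illustration under a few assumptions: strip_digit is a hypothetical helper name, and the pattern is a simplified variant of the one in the script that leaves out the color escape-sequence prefix, so it applies to uncolored output only.

```shell
#!/bin/sh
# Hypothetical helper (not part of the script above): remove the last hex
# digit of the commit ID at the start of each blame line.
# Simplified pattern: optional ^, hex digits, then a space; no color handling.
strip_digit() {
  sed 's/^\(\^*[0-9a-f]*\)[0-9a-f] /\1 /'
}

# Sample input copied from the blame output in the question.
printf '%s\n' \
  '9fb6f7064 (gilles 2020-11-15 12:28:09 +0100 1) revised' \
  '^0af747a8 (gilles 2020-11-15 12:27:41 +0100 2) world' |
  strip_digit
```

The first line comes out starting with 9fb6f706, matching the git log output. The boundary line comes out as ^0af747a, one digit shorter than in the log, which is the small ambiguity risk that the comment in the script mentions.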