
Git difftool ridiculously slow in Cygwin/MinGW


I noticed that git difftool is very slow. A delay of about 1 to 2 seconds appears between successive diff invocations.

To benchmark it I have written a custom difftool command:

#!/bin/sh
echo $0 $1 $2

I then configured Git to use this tool in my ~/.gitconfig:

[diff]
    tool = mydiff
[difftool "mydiff"]
    prompt = false
    cmd = "~/mydiff \"$LOCAL\" \"$REMOTE\""

I tested it on the Git sources:

$ git clone https://github.com/git/git.git
$ cd git
$ git rev-parse HEAD
1bc8feaa7cc752fe3b902ccf83ae9332e40921db
$ git diff HEAD~10 --stat --name-only | wc -l
23

When I time git difftool against 259b5e6d33, the result is ridiculously slow:

$ time git difftool 259b5
mydiff /dev/null Documentation/RelNotes/2.6.3.txt
...
mydiff /tmp/mY2T6l_upload-pack.c upload-pack.c

real    0m10.381s
user    0m1.997s
sys     0m6.667s

A simpler script that does the same job runs much faster:

$ time git diff --name-only --stat 259b5 | xargs -n1 -I{} sh -c 'git show 259b5:{} > {}.tmp && ~/mydiff {} {}.tmp'
mydiff Documentation/RelNotes/2.6.3.txt Documentation/RelNotes/2.6.3.txt.tmp
mydiff upload-pack.c upload-pack.c.tmp

real    0m1.149s
user    0m0.472s
sys     0m0.821s

What did I miss?

Here are the results I got (real time, in seconds):

| Cygwin | Debian | Ubuntu | Method   |
| ------ | ------ | ------ | -------- |
| 10.381 |  2.620 | 0.580  | difftool |
|  1.149 |  0.567 | 0.210  | custom   |

For the Cygwin results, I measured 2.8s spent in git-difftool and 7.5s spent in git-difftool--helper. The latter is only 98 lines long, so I don't understand why it is that slow.


Solution

  • Using some of the techniques found on the msysgit GitHub, I have narrowed this down a bit.

    For each file in the diff, git-difftool--helper re-runs the following internal commands:

    12:44:46.941239 git.c:351               trace: built-in: git 'config' 'diff.tool'
    12:44:47.359239 git.c:351               trace: built-in: git 'config' 'difftool.bc.cmd'
    12:44:47.933239 git.c:351               trace: built-in: git 'config' '--bool' 'mergetool.prompt'
    12:44:48.797239 git.c:351               trace: built-in: git 'config' '--bool' 'difftool.prompt'
    12:44:49.696239 git.c:351               trace: built-in: git 'config' 'difftool.bc.cmd'
    12:44:50.135239 git.c:351               trace: built-in: git 'config' 'difftool.bc.path'
    12:44:50.422239 git.c:351               trace: built-in: git 'config' 'mergetool.bc.path'
    12:44:51.060239 git.c:351               trace: built-in: git 'config' 'difftool.bc.cmd'
    12:44:51.452239 git.c:351               trace: built-in: git 'config' 'difftool.bc.cmd'
    

    Notice that, in this particular case, it took roughly 4.5 seconds to execute these. This is a pretty consistent pattern throughout my log.

    Note too that some of these are duplicates: git config difftool.bc.cmd is called 4 times!
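
    To see how often each lookup repeats over a whole run, the trace can be tallied. This is a minimal sketch, assuming the log has the format shown above and was captured with Git's GIT_TRACE environment variable; the helper name count_config_calls and the file name trace.log are made up for illustration.

    ```shell
    #!/bin/sh
    # Tally how often each `git config` lookup appears in a GIT_TRACE log.
    # count_config_calls is a hypothetical helper name; it reads the log on stdin.
    count_config_calls() {
        # Keep only the built-in "git 'config' ..." trace lines, strip the
        # timestamp and source location, then count identical invocations,
        # most frequent first.
        grep "trace: built-in: git 'config'" |
            sed "s/.*trace: built-in: //" |
            sort | uniq -c | sort -rn
    }

    # Example (assumes stderr of the traced run was saved to trace.log):
    #   GIT_TRACE=1 git difftool 259b5 2>trace.log
    #   count_config_calls <trace.log
    ```

    Run on the nine lines quoted above, it lists difftool.bc.cmd four times and each of the other five lookups once.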

    Now, possible remedies:

    • I cut the execution time for these commands in half by moving all of the diff-related sections to the top of my .gitconfig file. Seriously. It's still noticeable, but now on the order of 2 seconds instead of 4.5.
    • Make sure that your Git folder under Program Files and your user profile (where .gitconfig lives) are both excluded from real-time virus scanning.
    • Fundamentally, Git needs to be more efficient at parsing and retrieving configuration values. Ideally, it would cache these instead of re-requesting (and reparsing...) them from config on every loop iteration. Perhaps they could even be cached for the entire command execution.
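
The caching idea in the last remedy can be illustrated in plain shell. This is only a sketch of the technique, not how git-difftool--helper is actually written: it assumes the whole configuration can be read once with git config --list and that values contain no newlines. The names load_config_cache, lookup_config, config_cache and config_value are invented for the example.

```shell
#!/bin/sh
# Sketch of "read the config once, then look values up from memory".
# load_config_cache and lookup_config are hypothetical names, not part of Git.

# Run git a single time and keep its "key=value" lines in a shell variable.
load_config_cache() {
    config_cache=$(git config --list)
}

# Look up key $1 in the cached dump without starting another process.
# The result is left in $config_value; the exit status is 0 if the key exists.
# When a key occurs several times, the last entry wins.
lookup_config() {
    config_value=
    _found=1
    while IFS= read -r _line; do
        case $_line in
            "$1="*)
                config_value=${_line#"$1="}
                _found=0
                ;;
        esac
    done <<EOF
$config_cache
EOF
    return "$_found"
}

# Example:
#   load_config_cache
#   lookup_config diff.tool && echo "tool is $config_value"
```

The lookup stores its result in a variable rather than printing it, so callers do not need a command substitution, which would itself create a new process, the expensive operation on Cygwin/MinGW. Keys should be written the way git config --list prints them.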