GitPython: git.diff(commit_a, commit_b) always returns empty string


When I try the following code using GitPython:

repo.head.commit.diff('HEAD~1')[0].diff

It always returns an empty string, even though I've changed the file many times and tried this across different commits.

I've also tried the following code, which lists all the files changed between the first and the last commits.

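changed_files = []

for x in commits_list[0].diff(commits_list[-1]):
    # a_blob is None when the file does not exist on the "a" side (e.g. it was added)
    if x.a_blob is not None and x.a_blob.path not in changed_files:
        changed_files.append(x.a_blob.path)

    # b_blob is None when the file does not exist on the "b" side (e.g. it was deleted)
    if x.b_blob is not None and x.b_blob.path not in changed_files:
        changed_files.append(x.b_blob.path)

print(changed_files)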

Solution

  • From the docs (pydoc git.Commit).

     |  Methods inherited from git.diff.Diffable:
     |  
     |  diff(self, other=<class 'git.diff.Index'>, paths=None, create_patch=False, **kwargs)
     [...]
     |      :param create_patch:
     |              If True, the returned Diff contains a detailed patch that if applied
     |              makes the self to other. Patches are somewhat costly as blobs have to be read
     |              and diffed.
    

    create_patch defaults to False, so if we replicate your code, we get an empty diff attribute:

    >>> import git
    >>> r = git.Repo('.')
    >>> c1 = r.head.commit
    >>> c2 = r.commit('HEAD~1')
    >>> print c1.diff(c2)[0].diff
    

    But if we set create_patch to True:

    >>> print c1.diff(c2, create_patch=True)[0].diff
    
    --- a/nova/compute/manager.py
    +++ b/nova/compute/manager.py
    @@ -4593,19 +4593,11 @@ class ComputeManager(manager.Manager):
                     LOG.debug("Updating volume usage cache with totals",
                               instance=instance)
    [...]
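
As a minimal sketch of the same idea in reusable form, the helper below gathers the patch text for every changed file between two commits. The function names `collect_patches` and `head_vs_parent` are made up for illustration and are not part of GitPython. The sketch assumes a GitPython version where each `Diff` object exposes `a_path` and `b_path`, and where `diff` may come back as bytes (as it does on Python 3), so it decodes defensively.

```python
def collect_patches(commit_a, commit_b):
    """Return {path: patch_text} for each file that differs between two commits.

    Hypothetical helper: it only relies on Commit.diff(..., create_patch=True)
    and on the a_path / b_path / diff attributes of the returned Diff objects.
    """
    patches = {}
    # Without create_patch=True the .diff attribute stays empty.
    for d in commit_a.diff(commit_b, create_patch=True):
        # b_path is missing for deleted files, a_path for added ones.
        path = d.b_path or d.a_path
        raw = d.diff
        # Assumption: newer GitPython returns bytes here; older versions returned str.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8", errors="replace")
        patches[path] = raw
    return patches


def head_vs_parent(repo_path="."):
    """Patches that turn HEAD into HEAD~1, mirroring the session above."""
    import git  # GitPython, imported lazily so collect_patches has no hard dependency

    repo = git.Repo(repo_path)
    return collect_patches(repo.head.commit, repo.commit("HEAD~1"))
```

Calling `head_vs_parent('.')` inside a repository with at least two commits should give a dictionary keyed by file path, with each value holding the same kind of patch text printed in the session above.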