Tags: python, git, python-3.7, gitpython

Reading the progress of a Git operation with GitPython gets stuck


I'm trying to read the progress of time-consuming Git operations using GitPython. I tried the sample solution from the official documentation, and I also tried passing in a method with exactly the same signature as the update method below. Every time I call fetch(), push(), or pull() with the parameter progress=<anything>, the program gets stuck and the update method is never called. If I call those operations without the progress parameter, they work flawlessly.

  • I use asserts to make sure my repo objects are available and in the expected state
  • ProgressPrinter() does not evaluate to None
  • I tried calling the functions both from the main thread and from a separate thread
  • I looked at the implementation (line 350) of RemoteProgress and at the implementation (line 815) of push(), and I do not see a reason why execution would not continue

  • I found out that when I first assign my ProgressPrinter instance to a variable and then pass that variable, the program no longer gets stuck. However, the update() method still is not called and no progress is printed

# Not stuck anymore, yet no progress
pp = ProgressPrinter()
fetch_info = origin.fetch(progress=pp)

Core of my implementation:

from git import RemoteProgress

class ProgressPrinter(RemoteProgress):
    def update(self,
               op_code,
               cur_count,
               max_count=None,
               message=''):
        print("Is this even called?")

And later on:

origin = repo.remotes.origin
assert origin.exists()
fetch_info = origin.fetch(progress=ProgressPrinter())

Any recommendations on how to investigate this problem further? I've been debugging this for a day now and feel like I am missing something.


Solution

    1. GitPython getting stuck when processing the progress information

    Over the last couple of months, localization changes have been made to Git. Downgrading Git from 2.21.0 to 2.20.4 solved the problem for me for now. It is not an elegant solution, but the GitPython developers are aware of the changes.

    Check the issue to see whether it has been resolved: (GitHub Issue #871)
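
To see which Git version a script is actually running against, something like the following may help. It is only a sketch: the helper names (`parse_git_version`, `installed_git_version`, `may_be_affected`) are hypothetical, and treating 2.21.0 as the first affected release is an assumption based solely on the downgrade described above, not a verified boundary.

```python
import re
import subprocess


def parse_git_version(output):
    """Extract (major, minor, patch) from the output of `git --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognised version string: {output!r}")
    return tuple(int(part) for part in match.groups())


def installed_git_version():
    """Run `git --version` and return the parsed version tuple."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return parse_git_version(result.stdout)


# Assumption: the answer above only reports that 2.21.0 hung and 2.20.4 did not.
FIRST_AFFECTED = (2, 21, 0)


def may_be_affected(version):
    """Return True if the version is at or above the assumed threshold."""
    return version >= FIRST_AFFECTED
```

Calling `may_be_affected(installed_git_version())` before a fetch, push, or pull gives a rough hint whether the hang described here could apply.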

    2. RemoteProgress's update() method not being called

    If your Git process finishes quickly, this method won't be called at all. To simulate a longer-running process, I suggest cloning a large repository from GitHub or GitLab and preparing it as follows:

    $ git reset --hard @~100
    $ git remote remove origin
    $ git reflog expire --expire=now --all && git gc --prune=now --aggressive
    $ git remote add origin <url>
    

    I proposed changing the update() method so that it is also called once when the operation terminates with = [up to date]: PullRequest
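
To tell a hang apart from an operation that simply finished without reporting progress, one option is a small callable that records every invocation. This is only a sketch: `CallTracker` and `summary()` are hypothetical names, and it assumes a GitPython version that accepts a plain callable for `progress` (the question mentions passing a method with the same signature as `update()`).

```python
class CallTracker:
    """Callable taking the same parameters as RemoteProgress.update()."""

    def __init__(self):
        # One tuple per progress callback received.
        self.calls = []

    def __call__(self, op_code, cur_count, max_count=None, message=''):
        self.calls.append((op_code, cur_count, max_count, message))
        print(f"op_code={op_code} {cur_count}/{max_count} {message}")

    def summary(self):
        """Describe whether any progress callback arrived."""
        if not self.calls:
            return "no progress reported (the operation may have finished too quickly)"
        return f"{len(self.calls)} progress update(s) received"


# Intended usage (requires GitPython and an existing `repo` object):
#   tracker = CallTracker()
#   fetch_info = repo.remotes.origin.fetch(progress=tracker)
#   print(tracker.summary())
```

If `summary()` reports no updates after the call returns, the operation completed but was too short to emit progress, which is a different situation from the call never returning.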