Tags: python-3.x, git, gitpython

How to get the latest commit hash on remote using gitpython?


Is there a way to get the most recent commit on a remote repository using gitpython? I do not want to perform operations like a pull or merge on my local branch, and I do not want to depend on my local master branch to get this information. All I have is a valid remote repo, whose URL I get from repo.remotes.origin.url.

With just the repo URL, can I get the most recent commit on that repository?


Solution

  • GitPython's Repo object works on a local repository, so going through it requires a local clone. Git is a distributed system, designed for users to operate on their local repos. Two approaches follow:

    Using gitpython - requires local repo

    You can do a shallow clone (for speed), get the latest commit SHA from the clone (for example with git rev-parse), then delete the local clone.

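    import shutil
    from pathlib import Path
    
    import git
    
    repo_url = 'https://github.com/path/to/your/repo.git'
    local_repo_dir = Path('/path/to/your/repo')
    
    # Remove any stale clone first. Path.unlink() only removes files,
    # so a directory tree has to be deleted with shutil.rmtree().
    shutil.rmtree(local_repo_dir, ignore_errors=True)
    # Shallow clone: fetch only the tip commit of the remote's default branch
    repo = git.Repo.clone_from(repo_url, local_repo_dir, depth=1)
    # HEAD is the tip of the default branch, whether it is named master or not
    sha = repo.head.commit.hexsha
    # Release GitPython's handles, then delete the temporary clone
    repo.close()
    shutil.rmtree(local_repo_dir)
    print(sha)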
    

    Using python subprocess - does not require local repo

    This simpler solution uses git ls-remote, which does not require a local clone. The following uses subprocess to get the SHA-1 of the remote's HEAD (the tip of its default branch). Each output line has the form <sha><TAB><ref> and HEAD is listed first, so the SHA is extracted by splitting at the first tab.

    import subprocess
    
    repo_url = 'https://github.com/path/to/your/repo.git'
    # Ask only for HEAD; check=True raises CalledProcessError if git fails
    result = subprocess.run(
        ["git", "ls-remote", repo_url, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    # Each output line is "<sha>\t<ref>"; keep the part before the first tab
    sha = result.stdout.split('\t', 1)[0]
    print(sha)
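
If you need the tip of a specific branch rather than HEAD, the ls-remote output can be parsed into a mapping of ref name to SHA. The following is a minimal sketch, not a definitive implementation: the helper names parse_ls_remote and latest_sha are hypothetical, and it assumes branches live under refs/heads/ on the remote, which is the usual Git layout.

```python
import subprocess
from typing import Dict, Optional


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map each ref name to its SHA, given raw `git ls-remote` output."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each line is "<sha>\t<ref>"
        sha, _, ref = line.partition("\t")
        refs[ref.strip()] = sha.strip()
    return refs


def latest_sha(repo_url: str, branch: Optional[str] = None) -> str:
    """Return the tip SHA of `branch`, or of the remote HEAD if branch is None."""
    ref = "HEAD" if branch is None else f"refs/heads/{branch}"
    result = subprocess.run(
        ["git", "ls-remote", repo_url, ref],
        capture_output=True, text=True, check=True,
    )
    refs = parse_ls_remote(result.stdout)
    if ref not in refs:
        raise LookupError(f"{ref} not found on {repo_url}")
    return refs[ref]
```

Calling latest_sha(repo_url) returns the SHA of the remote HEAD, and latest_sha(repo_url, 'master') returns the tip of that branch. If you would rather stay inside GitPython, its command wrapper forwards calls to the git executable, so git.cmd.Git().ls_remote(repo_url) should return the same raw text for parse_ls_remote; treat that as an assumption to verify against your GitPython version.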