python git clone gitpython

GitPython sparse checkout and check for deviations


After searching GitPython's documentation without success, I thought I'd ask here.

I'm working in Python 3.10 and would like to clone a single folder from a repository: the yml subfolder. I do not need the entire repo.

https://github.com/LOLBAS-Project/LOLBAS/tree/master/yml

After the initial clone, I'd like to check whether the subfolder has been updated and, if so, pull the changes into the local yml folder.

As of now, I have a function that clones the entirety of the repo into a local directory.

import git

def repoCheck():
    try:
        # Clones the entire repository into ./LOLBAS
        git.Repo.clone_from('https://github.com/LOLBAS-Project/LOLBAS', 'LOLBAS')
    except git.GitCommandError as exception:
        print(exception)

This leaves me with (example):

C:\Users\ExampleUser\Documents\Lolbas

Lolbas/
├─ Logos/
├─ yml/
│  ├─ a.yml
│  ├─ b.yml
│  ├─ x.yml
├─ Archive-Old-Version/
│  ├─ x.yml
│  ├─ b.yml
├─ .gitignore
├─ package.json
├─ README.md

But I'd simply like a subfolder extract:

Lolbas/
├─ yml/
│  ├─ a.yml
│  ├─ b.yml
│  ├─ x.yml

Is it possible to initially clone just this subfolder and then pull to check whether this specific subfolder is up to date?

Thank you for any help and guidance with this. I don't have much of a solution as I'm not overly familiar with Git and couldn't find much information in the GitPython docs.


Solution

  • This is how I was able to pull a specific directory from a git repo:

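    from git import Repo

    repo = Repo.init("path/to/local/repo")

    # Reuse the "origin" remote if it exists; otherwise create it.
    # (Indexing repo.remotes[0] fails on a freshly initialised repo with no remotes.)
    if "origin" in [remote.name for remote in repo.remotes]:
        origin = repo.remote("origin")
    else:
        origin = repo.create_remote("origin", "https://github.com/LOLBAS-Project/LOLBAS")

    origin.fetch()
    # Check out only the yml directory from the fetched master branch
    repo.git.checkout("origin/master", "--", "yml")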
    

    To pull new updates, I suggest removing the yml directory entirely and then running the above code again. There are probably better ways of doing this, but I find this to be the most straightforward.