
How to check out a single file to inspect it using go-git?


I want to clone a specific repository, fetch all tags, and iterate through them. For each tag I want to check out a specific file (package.json) in the root directory. If the file is not present, the loop should continue; otherwise it should hand the file over so I can inspect it.

I started with the following code (my first Go application):

package main

import (
    "fmt"
    "github.com/go-git/go-billy/v5"
    "github.com/go-git/go-billy/v5/memfs"
    "github.com/go-git/go-git/v5"
    "github.com/go-git/go-git/v5/plumbing"
    "github.com/go-git/go-git/v5/plumbing/transport/http"
    "github.com/go-git/go-git/v5/storage/memory"
    "os"
)

func main() {
    authentication := &http.BasicAuth{
        Username: "me",
        Password: "my-key",
    }
    repositoryUrl := "my-repo-url"
    inMemoryStorage := memory.NewStorage()
    inMemoryFilesystem := memfs.New()

    repository, err := cloneRepository(repositoryUrl, authentication, inMemoryStorage, inMemoryFilesystem)

    if err != nil {
        handleError(err)
    }

    tagsIterator, err := repository.Tags()

    if err != nil {
        handleError(err)
    }

    err = tagsIterator.ForEach(func(tag *plumbing.Reference) error {
        fmt.Println(tag.Name().Short()) // for debugging purposes

        // checkout package.json file ( at root ) via tag

        return nil
    })

    if err != nil {
        handleError(err)
    }
}

func cloneRepository(repositoryUrl string, authentication *http.BasicAuth, inMemoryStorage *memory.Storage, inMemoryFilesystem billy.Filesystem) (*git.Repository, error) {
    return git.Clone(inMemoryStorage, inMemoryFilesystem, &git.CloneOptions{
        URL:  repositoryUrl,
        Auth: authentication,
    })
}

func handleError(err error) {
    fmt.Println(err)
    os.Exit(1)
}

Does anyone know how to check out the file for a given tag inside the loop?


Solution

  • You don't need to "check out" anything if all you want is the file content; you can extract that directly from the repository. But first, a caveat: I am not an experienced Go programmer and have never worked with go-git before, so there may be a better way of doing this.

    Starting with a tag, you can:

    1. Get the commit to which the tag points
    2. Get the tree to which the commit points
    3. Iterate through the tree looking for package.json
    4. If you find it, extract the corresponding blob. Now you have your content!

    The above steps might look something like this:

    func getFileFromRef(repository *git.Repository, ref *plumbing.Hash, filename string) (bool, []byte, error) {
        // Get the commit object corresponding to ref
        commit, err := repository.CommitObject(*ref)
        if err != nil {
            return false, nil, err
        }
    
        // Get the tree object from the commit
        tree, err := repository.TreeObject(commit.TreeHash)
        if err != nil {
            return false, nil, err
        }
    
        // Iterate through tree entries
        for _, entry := range tree.Entries {
            // If we find the target file...
            if entry.Name == filename {
                // Get the blob object from the repository
                blob, err := repository.BlobObject(entry.Hash)
                if err != nil {
                    return false, nil, err
                }
    
                // Ask for a Reader and make sure it gets closed
                reader, err := blob.Reader()
                if err != nil {
                    return false, nil, err
                }
                defer reader.Close()
    
                // Read the whole blob. A single Read call is not
                // guaranteed to fill a buffer, so use io.ReadAll
                // (this needs "io" in the import list).
                data, err := io.ReadAll(reader)
                if err != nil {
                    return false, nil, err
                }
    
                // Double check that we read as many bytes as
                // we expected
                if int64(len(data)) != blob.Size {
                    return true, nil, fmt.Errorf("read %d bytes, expected %d", len(data), blob.Size)
                }
                return true, data, nil
            }
        }
    
        return false, nil, nil
    }
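
    A note on the read step: the io.Reader contract allows a single Read call to return fewer bytes than the buffer can hold, so reading a blob into a pre-sized slice with one call can come up short. The following is a minimal standard-library sketch of that behavior, not go-git code; readOnce, readFully and slowReader are hypothetical names introduced here for illustration, and slowReader simulates a source that hands out one byte per call.

```go
package main

import (
	"io"
	"strings"
	"testing/iotest"
)

// readOnce mimics the "allocate a slice, call Read once" pattern and
// returns only the bytes that this single call produced.
func readOnce(r io.Reader, size int) ([]byte, error) {
	buf := make([]byte, size)
	n, err := r.Read(buf)
	if err != nil && err != io.EOF {
		return nil, err
	}
	return buf[:n], nil
}

// readFully keeps reading until EOF, so short reads do not matter.
func readFully(r io.Reader) ([]byte, error) {
	return io.ReadAll(r)
}

// slowReader wraps s in a reader that yields one byte per Read call,
// standing in for any source that delivers data in small pieces.
func slowReader(s string) io.Reader {
	return iotest.OneByteReader(strings.NewReader(s))
}
```

    With a source like slowReader, readOnce returns a single byte even though the buffer has room for more, while readFully returns the complete content.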
    

    The above function will, given a commit reference, look for filename in the top level of the repository (as written it does not traverse subdirectories). You would need to modify the tagsIterator.ForEach loop in your main function to do something like this:

        err = tagsIterator.ForEach(func(tag *plumbing.Reference) error {
            // Get the commit to which the tag refers. We need this to
            // resolve annotated tags.
            ref, err := repository.ResolveRevision(plumbing.Revision(tag.Hash().String()))
            if err != nil {
                // Returning the error stops the iteration and hands
                // the error back to the caller of ForEach.
                return err
            }
    
            found, content, err := getFileFromRef(repository, ref, "package.json")
            if err != nil {
                return err
            }
            if found {
                fmt.Printf("found \"package.json\" in tag %s\n", tag.Name().Short())
                fmt.Println(string(content))
            }
    
            return nil
        })
    

    I don't know if this is what you were looking for, but it was interesting learning about the go-git package.