Search code examples
rpackrat

Using R with git and packrat


I have been using git for a while but just recently started using packrat. I would like my repository to be self contained but at the same time I do not want to include CRAN packages as they are available. It seems once R is opened in a project with packrat it will try to use packages from project library; if they are not available then it will try to install from src in the project library; if they are not available it will look at libraries installed in that computer. If a library is not available in the computer; would it look at CRAN next?

What files should I include in my git repo as a minimum (e.g., packrat.lock)?


Solution

  • You can choose to set an external CRAN-like repository with the source tarballs of the packages and their versions that you'd like available for your project. The default behaviour, however, is to look to CRAN next, as you've identified in your question. Check out the packrat.lock file, you will see that for each package you use in packrat, there is an option called source: CRAN (if you've downloaded the file from CRAN, that is).

    When you have a locally stored package source file, the contents of the lockout for said package change to the following:

    Package: FooPackage Source: source Version: 0.4-4 Hash: 44foo4036fb68e9foo9027048d28 SourcePath: /Users/MyName/Documents/code/myrepo/RNetica

    I'm a bit unclear on your final question: What files should I include in my git repo as a minimum (e.g., packrat.lock)? But I'm going to take this as a) combination of what files should be present for packrat to run, and b) which of those files should be committed to the git-repo. To answer the first question, I illustrate with initialising packrat on an existing R project.

    When you run packrat::init(), two important things happen (among others): 1. All the packrat scaffolding, including source tarballs etc are created under: PackageName/packrat/. 2. packrat/lib*/ is added to your .gitignore file.

    So from this, we can see that anything under packrat/lib*/ doesn't need to be committed to your git-repo. This leaves the following 3 files to be committed:

    1. packrat/init.R
    2. packrat/packrat.lock
    3. packrat/packrat.opts

    packrat.lock is needed for collaborating with others through a version control system; it helps keep your private libraries in sync. packrat.opts allows you to specify different project specific options for packrat. The file is automatically generated using get_opts and set_opts. Committing this file to the git-repo will ensure that any options you specify are maintained for all collaborators. A final file to be committed to the repo is .Rprofile. This file tells R to use the private package library (when R is started from the project directory).

    Depending on your needs, you can choose to commit the source tar balls to the repository, or not. If you don't want them available in your git-repo, you simply add packrat/src/ to the .gitignore. But, this will mean that anyone accessing the git-repo will not have access to the package source code, and the files will be downloaded from CRAN, or from wherever the source line dictates within the packrat.lock file.

    From your question, it sounds like committing the packrat/src/ folder contents to your repo might be what you need.