R "Packages" vs Linux packages for R


I'm a little confused about what is going on when I install packages within R.

I start R (from the terminal, via "R"), and then at the R prompt, I do this:

> install.packages("devtools")

R goes off, and downloads quite a few things, launches a C/C++ compiler against dozens of source files, appears to build stuff successfully, but eventually fails with a message like this:

ERROR: dependencies ‘usethis’, ‘pkgdown’, ‘rcmdcheck’, ‘rversions’, ‘urlchecker’ are not available for package ‘devtools’

This is quite confusing, because I expect R to handle the dependencies.

Then I wonder, have I failed to install R correctly? Looking at the list of R packages available (for openSUSE Tumbleweed via zypper), I can see that there are about 50 packages with names like "R-base-devel", "R-graphics", "R-tools", etc. These are somehow distinct from R's internal conception of a package. It's totally unclear how these Linux packages are related to the R internal packages, and whether I am perhaps missing one of them.

I installed R via zypper by way of "sudo zypper install R-recommended-packages", which of course pulled in quite a few dependencies, so I guess I have a valid environment. But, clearly, I am missing something.

How does this all hang together? How do I know which Linux package for R contains which R package?

I wasn't expecting this to be quite so confusing and complicated.


Solution

  • All R packages are available in source format. The source can be plain R, or it can include code in other languages such as C, C++, or Fortran. R code does not need to be compiled before it runs, but code in those other languages often does. So when a package containing C or C++ code is installed from source, that code has to be compiled into binaries, which means the machine doing the installation needs the appropriate toolchain (compiler, linker, etc.) and libraries, including dependencies such as Boost. To avoid this, and to speed up installation, R packages are also made available in binary form for some platforms (Windows and macOS being two notable cases).

    Linux distributions run on multiple architectures and aren't consistent in how they make their libraries available, so it is not practical to distribute binary R packages for all of them (except perhaps for some well-known and widely used distributions; this is true of a lot of software, not just R packages). So in most cases, R packages must be installed from source and compiled upon installation. To ease things, some Linux distributions bundle sets of R packages in their own package system, which is why you find things like R-graphics and R-recommended-packages in your OS's repository. And you are absolutely right in your understanding that these OS packages are not the same thing as R packages.

    R packages can be found on centralized repositories, such as CRAN (the official repository for R packages) or Bioconductor, or as standalone files. Nowadays you can also just clone a git repository and build the package from there. These R repositories are not to be confused with the repository for OS packages.

    When you issue the command install.packages("foo") in R, it will search R repositories for package foo. If it's available in binary form for your platform and OS, it will install that. Otherwise, it will download the source code, compile it if necessary, and install it. By default, R will search the CRAN repository indicated by options('repos'), but you can add others.
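
    To see which repositories install.packages() will consult, you can inspect and set the repos option. This is a minimal sketch: the mirror URL is CRAN's cloud redirector, chosen here as an assumption, and any CRAN mirror would do in its place.

    ```r
    # Repositories that install.packages() searches, as a named character vector
    getOption("repos")

    # Set a CRAN mirror for this session (assumed mirror; pick any you prefer)
    options(repos = c(CRAN = "https://cloud.r-project.org"))

    # On Linux this is normally "source", so packages are compiled on installation
    getOption("pkgType")

    # With the repository set, this downloads and builds devtools
    # together with its R-level dependencies (needs network access):
    # install.packages("devtools")
    ```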

    Now, if the package you are compiling has external dependencies (such as libxml or libcurl), you'll need to make them available on your system, as @MichaelChirico noted in his comments. So you'll need to find out the requirements of those packages and install them via your OS's package management system (e.g. zypper). You will need the development version of each library (on Linux these are often the -dev or -devel packages). This is another advantage of the OS distributing its own sets of pre-built R packages, such as R-recommended-packages: their dependencies are automatically installed by the OS's package management system.
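
    One way to look up those external requirements is the SystemRequirements field that package authors put in their DESCRIPTION file. The sketch below reads it from CRAN's package database. The helper name system_requirements is hypothetical, and the packages in the usage comment are illustrative guesses rather than a diagnosis of this particular failure.

    ```r
    # Hypothetical helper: return the SystemRequirements field for the given
    # packages. By default the metadata comes from CRAN (needs network access);
    # any data frame with the same two columns can be passed as db instead.
    system_requirements <- function(pkgs, db = tools::CRAN_package_db()) {
      db[db$Package %in% pkgs, c("Package", "SystemRequirements")]
    }

    # Usage (illustrative package names, needs network access):
    # system_requirements(c("curl", "xml2", "openssl"))
    ```

    The field is free text, so mapping it to OS package names is still a manual step. On openSUSE the matching development packages are typically named along the lines of libcurl-devel, libxml2-devel, and libopenssl-devel, but treat those names as assumptions and confirm them with zypper search.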

    The errors you are seeing are probably caused by some of the R packages being installed (perhaps dependencies of the R packages you want) failing to compile, and therefore not being installed. You'll need to find out their external requirements and install those first.
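
    To narrow down which package actually failed, it can help to check which of the dependencies named in the error message are still absent and then install them one at a time. A minimal sketch, with not_installed being a hypothetical helper name:

    ```r
    # Hypothetical helper: return the subset of pkgs that is not present
    # in any library on .libPaths()
    not_installed <- function(pkgs) {
      setdiff(pkgs, rownames(installed.packages()))
    }

    # Dependencies named in the error message from the question
    deps <- c("usethis", "pkgdown", "rcmdcheck", "rversions", "urlchecker")
    still_missing <- not_installed(deps)
    print(still_missing)

    # Installing one package at a time keeps the first real compiler or
    # configure error close to the end of the output (needs network access):
    # install.packages(still_missing[1])
    ```

    The output of that single installation usually names the missing header or library, which points to the -devel package to install with zypper before retrying.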