I'm building a pipeline with Snakemake. One rule involves an R script that reads a CSV file using readr. I get this error when I run the pipeline with --use-singularity and --use-conda:
Error: Unknown TZ UTC
In addition: Warning message:
In OlsonNames() : no Olson database found
Execution halted
Google suggests readr is crashing because tzdata is missing, but I can't figure out how to install the tzdata package and make readr see it. I am running the entire pipeline in a Mambaforge container to ensure reproducibility. Snakemake recommends Mambaforge over a Miniconda container because it's faster, but I think my error is specific to Mambaforge, since switching to Miniconda makes the error go away.
Here's a workflow to reproduce the error:
#Snakefile
singularity: "docker://condaforge/mambaforge"

rule targets:
    input:
        "out.txt"

rule readr:
    input:
        "input.csv"
    output:
        "out.txt"
    conda:
        "env.yml"
    script:
        "test.R"
#env.yml
name: env
channels:
- default
- bioconda
- conda-forge
dependencies:
- r-readr
- tzdata
#test.R
library(readr)
fp <- snakemake@input[[1]]
df <- read_csv(fp)
print(df)
write(df$x, "out.txt")
I run the workflow with snakemake --use-conda --use-singularity. How do I run R scripts when the Snakemake workflow is running from a Mambaforge singularity container?
Looking through the R call stack leading to the error, I see that it checks a number of default locations for the zoneinfo folder that tzdata includes, but also checks for a TZDIR environment variable.
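To see which of these applies inside the container, a small diagnostic along the following lines may help. This is only a sketch: tz_report is a hypothetical helper name, the $CONDA_PREFIX/share/zoneinfo path is the layout assumed by the workaround in this answer, and /usr/share/zoneinfo is used as a stand-in for the usual system location on Linux rather than the full list R consults.

```r
# Hypothetical helper: report where R might find the time zone database.
# Assumes the Conda tzdata package installs under <prefix>/share/zoneinfo.
tz_report <- function(prefix = Sys.getenv("CONDA_PREFIX")) {
  conda_zoneinfo <- file.path(prefix, "share", "zoneinfo")
  list(
    TZDIR = Sys.getenv("TZDIR", unset = NA),           # NA if the variable is unset
    conda_zoneinfo = conda_zoneinfo,                    # candidate Conda location
    conda_zoneinfo_exists = dir.exists(conda_zoneinfo),
    system_zoneinfo_exists = dir.exists("/usr/share/zoneinfo")  # usual Linux location
  )
}

# Print a compact summary of the report
str(tz_report())
```

If TZDIR is unset and only the Conda location exists, that would be consistent with R failing to find the database on its own.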
I believe a proper solution would be for the Conda tzdata package to set this variable to point to its zoneinfo folder. This will require a PR to the conda-forge package (see repo issue). In the meantime, either of the following workarounds should do.
Continuing to use the tzdata package from Conda, one could set the environment variable at the start of the R script.
#!/usr/bin/env Rscript
## the following assumes active Conda environment with `tzdata` installed
Sys.setenv("TZDIR"=paste0(Sys.getenv("CONDA_PREFIX"), "/share/zoneinfo"))
I would consider this a temporary workaround.
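A slightly more defensive variant of the same idea is sketched below. It only sets TZDIR when the variable is not already set and the Conda zoneinfo folder actually exists, so the script behaves the same outside the container. The function name set_tzdir_from_conda is hypothetical, and the share/zoneinfo path is the same assumption as above.

```r
# Hypothetical helper: point TZDIR at the Conda tzdata files, but only when
# TZDIR is unset and the assumed <prefix>/share/zoneinfo folder exists.
set_tzdir_from_conda <- function(prefix = Sys.getenv("CONDA_PREFIX")) {
  zoneinfo <- file.path(prefix, "share", "zoneinfo")
  tzdir_unset <- !nzchar(Sys.getenv("TZDIR"))
  if (tzdir_unset && nzchar(prefix) && dir.exists(zoneinfo)) {
    Sys.setenv(TZDIR = zoneinfo)
  }
  # Return the effective value (empty string if nothing was set)
  invisible(Sys.getenv("TZDIR"))
}

# Call this at the top of the script, before reading any data
set_tzdir_from_conda()
```

In test.R this would go before the read_csv call, leaving the rest of the script unchanged.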
Otherwise, make a new Docker image that includes a system-level tzdata installation. This appears to be a common issue, so following other examples (and keeping things clean), it'd go something like:
Dockerfile
FROM --platform=linux/amd64 condaforge/mambaforge:latest
## include tzdata
RUN apt-get update > /dev/null \
    && DEBIAN_FRONTEND="noninteractive" apt-get install --no-install-recommends -y tzdata > /dev/null \
    && apt-get clean
Upload this to Docker Hub and use it instead of the Mambaforge image as the image for Snakemake. This is probably a more reliable long-term solution, but perhaps not everyone wants to create a Docker Hub account.