I am encountering an error when a Snakemake (6.0.0) workflow launches two jobs simultaneously on the same node, both of which use the same conda environment. A minimal example is below.
A few observations:
- The problem only occurs on the cluster. I am running snakemake from within an interactive slurm job with >1 CPU; I am using Miniconda, provided as a module by my cluster sysadmins.
- The problem does not occur when jobs run one at a time (snakemake --use-conda -j1). It only occurs with -j2 or higher (not exceeding the number of cores available in the slurm allocation). The first job seems to run just fine; it's always the second job that croaks.
- The conda environment itself seems fine (conda activate /long_path_to_cluster_project_folder/testing/conda_test/.snakemake/conda/c4751dca works, I can run R from within it, etc.).
- When I run snakemake --use-conda -j2, the only error I get is "(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)" below the shell commands run. If I add --verbose, a lengthy traceback is printed in blue and red, which I've included below. The relevant bit seems to be:
    File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 505, in prefix_path
      return self.info["conda_prefix"]
    AttributeError: 'Conda' object has no attribute 'info'
- I tried --max-jobs-per-second=0.5 to throttle the jobs so they wouldn't start at the same time, but that appeared to have no effect (jobs start at the same time, same error as before). I am not running snakemake with --cluster or --profile or anything; there are no extra slurm jobs being created, just processes spawned on the same compute node.

I'm pretty new to both snakemake and HPC, but this seems to be somewhere between a system- or configuration-specific problem (since it only occurs on the cluster) and a minor snakemake bug (since snakemake seems to be attributing the problem to my shell script rather than to something to do with conda). I'm interested in suggestions for how to troubleshoot further or to work around the problem.
Thanks!
Minimal example:
├── input.txt
├── results
└── workflow
    ├── Snakefile
    └── envs
        └── env1.yaml
workflow/Snakefile:
rule all:
    input:
        'results/output1.txt',
        'results/output2.txt',
        'results/output3.txt',
        'results/output4.txt'

rule rule1:
    input: 'input.txt'
    output:
        'results/output{n}.txt'
    conda: 'envs/env1.yaml'
    shell:"""
        sleep 5s
        touch {output}
    """
workflow/envs/env1.yaml:
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - r-ggplot2
$ snakemake --use-conda -j2 -p --verbose
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 2
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1       all
    4       rule1
    5
<< snip >>
[Fri Mar 5 21:01:33 2021]
Error in rule rule1:
    jobid: 2
    output: results/output2.txt
    conda-env: /long_path_to_cluster_project_folder/testing/conda_test/.snakemake/conda/c4751dca
    shell:
        sleep 5s
        touch results/output2.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Full Traceback (most recent call last):
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2326, in run_wrapper
    run(
  File "/long_path_to_cluster_project_folder/testing/conda_test/workflow/Snakefile", line 33, in __rule_rule1
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/shell.py", line 141, in __new__
    cmd = Conda(container_img).shellcmd(conda_env, cmd)
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 512, in shellcmd
    activate = os.path.join(self.bin_path(), "activate")
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 508, in bin_path
    return os.path.join(self.prefix_path(), "bin")
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 505, in prefix_path
    return self.info["conda_prefix"]
AttributeError: 'Conda' object has no attribute 'info'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 568, in _callback
    raise ex
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
    run_func(*args)
  File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2357, in run_wrapper
    raise RuleException(
snakemake.exceptions.RuleException: AttributeError in line 13 of /long_path_to_cluster_project_folder/testing/conda_test/workflow/Snakefile:
'Conda' object has no attribute 'info'
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2326, in run_wrapper
File "/long_path_to_cluster_project_folder/testing/conda_test/workflow/Snakefile", line 13, in __rule_rule1
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 512, in shellcmd
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 508, in bin_path
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 505, in prefix_path
RuleException:
AttributeError in line 13 of /long_path_to_cluster_project_folder/testing/conda_test/workflow/Snakefile:
'Conda' object has no attribute 'info'
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2326, in run_wrapper
File "/long_path_to_cluster_project_folder/testing/conda_test/workflow/Snakefile", line 13, in __rule_rule1
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 512, in shellcmd
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 508, in bin_path
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/deployment/conda.py", line 505, in prefix_path
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 568, in _callback
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/concurrent/futures/thread.py", line 52, in run
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
File "/long_path_to_cluster_project_folder/conda_envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2357, in run_wrapper
Try updating to at least Snakemake v6.0.2. The problem appears to be a bug in the v6.0.0 release that is patched in v6.0.2 (Release Notes). You're right on the money about it being a race condition (see commit).
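
For intuition only, here is a minimal sketch of one way a race condition can produce an AttributeError like the one in the traceback: an object is made visible to other threads before its attribute has been set. This is not Snakemake's code and not a description of the actual fix; the names Tool, get_unsafe, get_safe and second_sees_info, as well as the prefix path, are hypothetical and exist only for this illustration.

```python
import threading

_cache = {}
_cache_lock = threading.Lock()


class Tool:
    """Hypothetical stand-in: `info` only exists once setup() has finished."""

    def setup(self, pause=None):
        if pause is not None:
            pause()  # simulates a slow step, e.g. querying an external program
        self.info = {"conda_prefix": "/hypothetical/prefix"}


def get_unsafe(key, pause=None):
    # Check-then-act without a lock: the instance is cached BEFORE setup()
    # runs, so another thread can fetch it while `info` is still missing.
    if key not in _cache:
        _cache[key] = Tool()
        _cache[key].setup(pause)
    return _cache[key]


def get_safe(key, pause=None):
    # The lock covers creation and setup, so callers either wait or receive
    # a fully initialized instance.
    with _cache_lock:
        if key not in _cache:
            tool = Tool()
            tool.setup(pause)
            _cache[key] = tool
        return _cache[key]


def second_sees_info(getter):
    """Return whether a second thread sees `info` while the first is in setup."""
    _cache.clear()
    in_setup, release = threading.Event(), threading.Event()
    seen = []

    def pause():
        in_setup.set()   # tell the main thread that setup has started
        release.wait()   # stay inside setup until the main thread lets go

    first = threading.Thread(target=getter, args=("env", pause))
    second = threading.Thread(
        target=lambda: seen.append(hasattr(getter("env"), "info"))
    )
    first.start()
    in_setup.wait()
    second.start()
    # The unsafe getter returns almost immediately; the safe getter is still
    # blocked on the lock when this timeout expires.
    second.join(timeout=0.5)
    release.set()
    first.join()
    second.join()
    return seen[0]


if __name__ == "__main__":
    print(second_sees_info(get_unsafe))  # False: half-initialized object
    print(second_sees_info(get_safe))    # True: lock hides the partial state
```

With a single job at a time (-j1) there is never a second thread to observe the half-initialized object, which would be consistent with the error appearing only at -j2 or higher.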