
How can I re-evaluate the params of a Snakemake rule on every run?


I have a toy Snakefile:

rule DUMMY:
    output: "foo.txt"
    resources:
        mem_mb=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1)
    params:
        max_mem=lambda wildcards, resources: resources.mem_mb * 1024
    shell:
        """
        echo {resources.mem_mb} {params.max_mem}
        ulimit -v {params.max_mem}
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch {output}
        """

It works as I expect with Snakemake v6.6.1:

$ snakemake -v
6.6.1
$ snakemake foo.txt -j 1 --restart-times 3
Building DAG of jobs...
[...]
1024 1048576
[...]
Trying to restart job 0.
Select jobs to execute...
[...]
2048 2097152
[...]
4096 4194304
[...]
8192 8388608
[Tue May 17 11:56:42 2022]
Finished job 0.
1 of 1 steps (100%) done
[...]

but it fails with v7.3.8:

$ snakemake -v
7.3.8
$ snakemake foo.txt -j 1 --restart-times 3
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
DUMMY        1              1              1
total        1              1              1

Select jobs to execute...

[Tue May 17 12:02:34 2022]
rule DUMMY:
    output: foo.txt
    jobid: 0
    resources: tmpdir=/tmp, mem_mb=1024

1024 1048576
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "[...]/lib/python3.7/site-packages/numpy/core/numeric.py", line 204, in ones
    a = empty(shape, dtype, order)
numpy.core._exceptions.MemoryError: Unable to allocate 7.45 GiB for an array with shape (1000000000,) and data type float64
[Tue May 17 12:02:35 2022]
Error in rule DUMMY:
    jobid: 0
    output: foo.txt
    shell:
        
        echo 1024 1048576
        ulimit -v 1048576
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch foo.txt
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Trying to restart job 0.
[...]
2048 1048576
[...]
4096 1048576
[...]
Trying to restart job 0.
Select jobs to execute...

[Tue May 17 12:02:35 2022]
rule DUMMY:
    output: foo.txt
    jobid: 0
    resources: tmpdir=/tmp, mem_mb=8192

8192 1048576
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "[...]/lib/python3.7/site-packages/numpy/core/numeric.py", line 204, in ones
    a = empty(shape, dtype, order)
numpy.core._exceptions.MemoryError: Unable to allocate 7.45 GiB for an array with shape (1000000000,) and data type float64
[Tue May 17 12:02:35 2022]
Error in rule DUMMY:
    jobid: 0
    output: foo.txt
    shell:
        
        echo 8192 1048576
        ulimit -v 1048576
        python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch foo.txt
      
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-05-17T120234.647036.snakemake.log

resources.mem_mb is updated on every attempt, while params.max_mem stays stuck at 1048576, the value computed for the first attempt. How can I force params.max_mem to update as well? Is this a bug or a feature?


Solution

  • By default, the resources directive treats any resource that is not constrained on the command line as unlimited. Unlike params, resource callables accept attempt and are re-evaluated on every restart (as the changing mem_mb in your log shows), so one solution is to declare max_mem as an arbitrary resource and refer to it inside the shell command:

    rule DUMMY:
        output: "foo.txt"
        resources:
            mem_mb=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1),
            max_mem=lambda wildcards, attempt: 1024 * 2 ** (attempt - 1) * 1024,
        shell:
            """
            ulimit -v {resources.max_mem}
            echo {resources.max_mem} $(ulimit -v)
            python3 -c 'import numpy; x=numpy.ones(1_000_000_000)' && touch {output}
            """