slurm, snakemake

Snakemake WorkflowError: Failed to group jobs together


BACKGROUND: I have to adapt my Snakemake pipeline from a single-node usage to a cluster with resource management. With a SLURM-specific Snakemake profile, my rules are successfully submitted as SLURM jobs, so I continued to add the Snakemake directive resources to every non-local rule to optimize queue scheduling. These settings were adopted and my pipeline finished as intended.

EXAMPLE:

rule ruleA:
    group: "group_1_init"
    resources:
        cpus=1,
        time="00:04:00"

rule ruleB:
    group: "group_1_init"
    resources:
        cpus=1,
        time="00:05:00"

ruleA and ruleB are meant to be submitted as a single job to a compute node.

PROBLEM: My pipeline has many small, single-CPU jobs that I binned with the Snakemake rule directive group. When the rules in a group declare different time values, Snakemake fails with this error:

WorkflowError:
Failed to group jobs together. Resource time is a string but not all group jobs require the same value. Observed: 00:05:00 != 00:04:00.

I guess there should be only one resource setting per group, but I could not find any online resources on the logic behind it.

QUESTION: How do I define varying resource requirements in group jobs? For example, should time reflect the computing time of the whole group of jobs, or does Snakemake sum up the rule times within a group and pass the total to the SLURM job submission? For cpus, in turn, I would expect the maximum cpus among all rules in the group.


Solution

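  • As the error message states, a string resource must have the same value in every job of a group; only numeric resources can differ between grouped jobs. Without parameter conversion by an additional script, the resource time therefore has to be an integer and is interpreted as minutes. This is the correction: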

    rule ruleA:
        group: "group_1_init"
        resources:
            cpus=1,
            time=4  # integer, in minutes

    rule ruleB:
        group: "group_1_init"
        resources:
            cpus=1,
            time=5  # integer, in minutes
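
If you would rather keep writing the limits as HH:MM:SS, a small conversion helper is one possible workaround. The following is only a minimal sketch: hms_to_minutes is a hypothetical name, not part of Snakemake or of any SLURM profile, and it assumes the strings always have exactly three colon-separated fields.

```python
import math


def hms_to_minutes(hms: str) -> int:
    """Convert an "HH:MM:SS" string to whole minutes, rounding up.

    Hypothetical helper, not part of Snakemake; assumes exactly three
    colon-separated integer fields.
    """
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    # Round up so the job never requests less time than the string stated.
    return math.ceil(total_seconds / 60)


print(hms_to_minutes("00:04:00"))  # 4
print(hms_to_minutes("00:05:00"))  # 5
```

Since a Snakefile is Python, such a function could be defined at the top of the Snakefile and called inside the resources directive, e.g. time=hms_to_minutes("00:04:00"), so that every rule in the group ends up with an integer value.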