python, snakemake

Specifying Python version within Snakemake using script parameter


I have some code that I am incorporating into a Snakemake pipeline. I also want to be able to run the code independently of Snakemake, so I want to write flexible code. I have a config file that can be read by Snakemake and an alternative config file that will be parsed if the code is run outside Snakemake.

I had the following structure in a rule in my Snakemake file:

rule some_rule:
    input:
        input_file
    output:
        output_file
    script:
        "runfile.py"

and then in runfile.py I tested to see if the script was being run within Snakemake as follows:

if 'snakemake' in globals():
    # Get config params from Snakemake
    config = snakemake.config
else:
    # Load config params from alternative config file
    config = {}  # placeholder: parse the alternative config file here

This was fine, except that I am working within a virtual environment and the Python version I want to use is different from the one Snakemake uses by default. So I had to restructure the rule as follows:

rule some_rule:
    input:
        input_file
    output:
        output_file
    shell:
        "/path/to/python runfile.py"

but now, because I am using shell instead of script, I no longer have access to the global 'snakemake' object in my Python file.

So my question is either

(i) Can I specify a Python version and still use script?

or

(ii) Can I use shell but still access the 'snakemake' object within the script that will allow me to test whether or not the script is being run in a Snakemake pipeline?


Solution

  • You could use Conda: it would give you fine-grained control over the execution context of each rule and improve the pipeline's reproducibility. See the Snakemake documentation.

    First, you'll need an environment definition YAML file. For example,

    envs/py37.yml

    channels:
      - defaults
    dependencies:
      - python=3.7
    

    Add in any other requirements you need in that file. Then your rule would be

    Snakefile

    rule some_rule:
        input:
            input_file
        output:
            output_file
        conda:
            "envs/py37.yml"
        script:
            "runfile.py"
    

    Lastly, you now need to use the additional flag --use-conda when launching this job, e.g.,

    shell

    snakemake --use-conda output_file
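
Because the rule still uses script, runfile.py keeps receiving the snakemake object, so the check from the question continues to work inside the Conda environment. Below is a minimal sketch of that dual-mode loading. It assumes the alternative config is a JSON file (the question does not say which format it uses). The names load_config and python_version, and the file name in the final comment, are hypothetical and chosen only for illustration.

```python
import json
import sys
from pathlib import Path


def load_config(namespace, fallback_path):
    """Return the config from Snakemake if available, else from a file.

    `namespace` is the caller's globals(); `fallback_path` points to a
    hypothetical JSON file that is only read when running standalone.
    """
    if "snakemake" in namespace:
        # Inside a `script:` rule, Snakemake injects a `snakemake` object
        # whose `config` attribute holds the workflow configuration.
        return dict(namespace["snakemake"].config)
    # Standalone run: parse the alternative config file instead.
    return json.loads(Path(fallback_path).read_text())


def python_version():
    """Return 'major.minor' of the running interpreter, e.g. for logging."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


# In runfile.py one might then write (file name is an assumption):
#     config = load_config(globals(), "standalone_config.json")
#     print("Running under Python", python_version(), file=sys.stderr)
```

Printing the interpreter version once is a cheap way to confirm that the rule really runs under the Python pinned in envs/py37.yml rather than the one Snakemake itself uses.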