I am looking for advice on testing input functions in Snakemake. Input functions are commonly defined in the common.smk file, e.g.:
import pandas as pd
from snakemake.io import glob_wildcards
def find_fastq_files(wildcards) -> list[str]:
    """
    Return a list of fastq files given sample name and processing stage.
    """
    # Find the fastq file names for the sample
    sample_dir = f"data/{wildcards.sample}/fastq"
    file_names = glob_wildcards(f"{sample_dir}/{{file_name}}.fastq.gz").file_name
    # Return the file paths
    return [f"{sample_dir}/{f}.fastq.gz" for f in file_names]
My current approach parses the output of a Snakemake dry run and then makes assertions with pytest. It would be cleaner to skip the dry run and test the input function directly, but I have not found an easy way to import a function that is defined within common.smk.
What would be the recommended way to test such an input function? Thanks!
You should be able to move this function to a file with a .py extension, e.g. funcs.py, and then in common.smk you would do from funcs import find_fastq_files.
You can now also import those functions into your unit tests as you would any regular Python code. In the example above, your common.smk is valid Python, but as things get more complex it's often good to separate the Snakemake-specific code from the plain Python functions.
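Once the function lives in a plain .py file, a direct test could look roughly like the sketch below. To keep it runnable without Snakemake installed, the function here is a stand-in that lists files with pathlib instead of glob_wildcards; in a real test you would import find_fastq_files from funcs.py instead. The helper run_in_fake_project, the sample name and the file names are made up for illustration. It relies on the function only reading wildcards.sample, so a SimpleNamespace can stand in for the object Snakemake passes.

```python
import os
import tempfile
from pathlib import Path
from types import SimpleNamespace


def find_fastq_files(wildcards) -> list[str]:
    """Stand-in for the function in funcs.py (pathlib instead of glob_wildcards)."""
    sample_dir = f"data/{wildcards.sample}/fastq"
    # Only match files directly inside the sample's fastq directory
    return sorted(f"{sample_dir}/{p.name}" for p in Path(sample_dir).glob("*.fastq.gz"))


def run_in_fake_project(sample: str, file_names: list[str]) -> list[str]:
    """Hypothetical helper: create empty files in a temp dir and call the function there."""
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        fastq_dir = Path(tmp, "data", sample, "fastq")
        fastq_dir.mkdir(parents=True)
        for name in file_names:
            (fastq_dir / name).touch()
        # The function uses relative paths, so run it from the fake project root
        os.chdir(tmp)
        try:
            # A SimpleNamespace provides the attribute access the function needs
            return find_fastq_files(SimpleNamespace(sample=sample))
        finally:
            # Leave the temp dir before it is deleted
            os.chdir(old_cwd)


def test_find_fastq_files():
    found = run_in_fake_project("s1", ["a_R1.fastq.gz", "a_R2.fastq.gz", "notes.txt"])
    assert found == ["data/s1/fastq/a_R1.fastq.gz", "data/s1/fastq/a_R2.fastq.gz"]
```

With pytest, the tmp_path and monkeypatch.chdir fixtures can replace the manual temp directory and chdir handling.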
Failing this, it's possible to get Snakemake to parse the entire workflow file and give you the functions without doing a dry run. Something like:
from snakemake import Workflow
wf = Workflow(snakefile="workflow/Snakefile")
wf.config.update(dict()) # If you have mandatory config
wf.include(wf.main_snakefile)
# Extract the functions
myfunc1 = wf.globals['myfunc1']
# Now you can test myfunc1()
assert myfunc1(123) == 321
Hope that helps!