
Snakemake can't get the lambda return value in params


I have a Snakemake rule whose params are built by a function, like this:

def get_papreDE(config, super_group): 
    gtf_config = []
    for sample_name in sample.keys(): 
        group_data=sample[sample_name][0]
        super_group_data=sample[sample_name][3]
        if super_group == super_group_data or super_group == 'all':
            stringtie_gtf = config['output']['stringtie_gtf'].format(sample=sample_name, output=output, group=group_data)
            gtf_config.append("\t".join([sample_name, stringtie_gtf]))
    return gtf_config

rule prepDE:
    input: 
        gtf = lambda wildcards: get_stringtie_gtf(config, wildcards.group)
    output:
        counts = output_data['prepDE_counts']
    params: 
        input_gtf = lambda wildcards: get_papreDE(config, wildcards.group), 
    threads: 9
    log: 
        log = 'log/' + output_data['prepDE_counts'] + '.log'
    shell: 
        """
            prepDE.py -i <(echo "{params.input_gtf}" | sed 's/ /\\n/g') -o {output.counts} > {log} 2>&1
        """

I want this rule to generate the input table for prepDE.py, but {params.input_gtf} is always empty in the shell command. The command that actually runs looks like this:

prepDE.py -i <(echo "" | sed 's/ /\\n/g') -o /path/to/output.csv > /path/to/log 2>&1

I have checked that the function works correctly by printing gtf_config just before the return statement.


Solution

  • This could be because params expects a string rather than a list. Try changing the return statement to return ' '.join(gtf_config) (or use '\n'.join(gtf_config) and skip the sed step). It would also be prudent to raise an error if the list is empty before returning, so that an empty table fails loudly instead of silently producing echo "".
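
A minimal sketch of that suggestion, written as a standalone function so it can be run outside Snakemake. The name build_gtf_table, the explicit samples and output_dir arguments (standing in for the sample and output globals in the question), and the example data are all assumptions made for illustration; only the join and the empty-list check come from the answer.

```python
def build_gtf_table(config, samples, output_dir, super_group):
    """Return the 'sample<TAB>gtf_path' table as one newline-joined string."""
    rows = []
    for sample_name, fields in samples.items():
        # Same layout as in the question: index 0 is the group,
        # index 3 is the super group.
        group, sample_super_group = fields[0], fields[3]
        if super_group in (sample_super_group, "all"):
            gtf_path = config["output"]["stringtie_gtf"].format(
                sample=sample_name, output=output_dir, group=group
            )
            rows.append(f"{sample_name}\t{gtf_path}")
    # Fail loudly instead of passing an empty string to the shell.
    if not rows:
        raise ValueError(f"no samples matched super group {super_group!r}")
    return "\n".join(rows)


# Hypothetical config and sample sheet, for illustration only.
config = {"output": {"stringtie_gtf": "{output}/{group}/{sample}.gtf"}}
samples = {
    "s1": ["ctrl", None, None, "A"],
    "s2": ["treat", None, None, "B"],
}
table = build_gtf_table(config, samples, "out", "A")
print(table)
```

If the params function returns a newline-joined string like this, the shell line could presumably read the table with <(echo "{params.input_gtf}") and drop the sed step; that has not been tested against the rule in the question.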