python conda pipeline snakemake

How to refer to executable inside anaconda environment in Snakemake


I'm using vcf2maf to annotate variants as part of a Snakemake pipeline:

rule vcf2maf:
    input:
        vcf="vcfs/{sample}.vcf",
        fasta=vep_fasta,
        vep_dir=vep_dir
    output:
        "mafs/{sample}.maf"
    conda:
        "../envs/annotation.yml"
    shell:
        """
        vcf2maf.pl --input-vcf {input.vcf} --output-maf {output} \
            --tumor-id {wildcards.sample}.tumor \
            --normal-id {wildcards.sample}.normal \
            --ref-fasta {input.fasta} --filter-vcf 0 \
            --vep-data {input.vep_dir} --vep-path [need path]

        """

The conda environment has two packages: vcf2maf and vep. vcf2maf needs a path to vep to run properly, but I'm not sure how to get that path, since vep is installed inside the conda environment, which will have a user-specific absolute path. Is there an easy way to get vep's path so I can pass it to --vep-path?


Solution

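  • You could use the Unix which command. Note that vcf2maf's --vep-path expects the folder containing the vep script rather than the script itself, so take the directory of the result:

    veppath=$(dirname "$(which vep)")
    vcf2maf.pl --vep-path "$veppath" ...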
    

    [vep path is] stored inside the conda environment which will have a user specific absolute path

    The variable CONDA_PREFIX contains the path to the current conda environment, so you could also do something like the following (--vep-path takes the folder containing vep, hence bin rather than bin/vep):

    vcf2maf.pl --vep-path "$CONDA_PREFIX/bin" ...
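
The two approaches above can be combined into one small shell helper. This is only a sketch: resolve_vep_dir is a hypothetical name (not part of vcf2maf, vep, or conda), and it assumes that --vep-path should receive the directory holding the vep executable. It tries the active conda environment first and falls back to whatever vep is found on PATH.

```shell
# Sketch: print the directory that holds the vep executable.
# resolve_vep_dir is a hypothetical helper name used for illustration only.
resolve_vep_dir() {
    if [ -n "${CONDA_PREFIX:-}" ] && [ -x "$CONDA_PREFIX/bin/vep" ]; then
        # Preferred: the bin directory of the active conda environment
        printf '%s\n' "$CONDA_PREFIX/bin"
    elif command -v vep >/dev/null 2>&1; then
        # Fallback: the directory of the first vep found on PATH
        dirname "$(command -v vep)"
    else
        echo "vep not found in CONDA_PREFIX or on PATH" >&2
        return 1
    fi
}
```

It could then be used as vcf2maf.pl --vep-path "$(resolve_vep_dir)" ... Keep in mind that Snakemake formats the shell: string, so literal curly braces written directly inside a rule must be doubled (for example ${{CONDA_PREFIX:-}}); keeping the helper in a separate script that the rule sources avoids that.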