I am trying to supply a set of files as input to Snakemake, but wildcards are not working for some reason:
rule cluster:
    input:
        script = '/Users/nikitavlasenko/python_scripts/python/dbscan.py',
        path = '/Users/nikitavlasenko/python_scripts/data_files/umap/{sample}.csv'
    output:
        path = '/Users/nikitavlasenko/python_scripts/output/{sample}'
    shell:
        "python {input.script} -data {input.path} -eps '0.3' -min_samples '10' -path {output.path}"
I want Snakemake to read the files in the umap directory, get their names, and pass them to the Python script so that each result gets a unique name. How can this be achieved without the error I am getting right now:
Building DAG of jobs...
WorkflowError:
Target rules may not contain wildcards. Please specify concrete files or
a rule without wildcards.
Update
I found that most probably a rule all is required at the top, so I added it like this:
samples='SCID_WT_CCA'

rule all:
    input:
        expand('/Users/nikitavlasenko/python_scripts/data_files/umap/{sample}_umap.csv', sample=samples.split(' '))
However, I am getting the following weird message:
Building DAG of jobs...
Nothing to be done.
So, it is not running.
Update
I thought that it could be related to the fact that I had just one sample name at the top, so I changed it to:
samples='SCID_WT_CCA WT SCID plus_1 minus_1'
I added the respective files as well, of course, but that did not fix the error.
If I run snakemake cluster, I get the same error as at the very top, but if I just run snakemake, I get the Nothing to be done message. I also tried replacing the absolute paths with relative ones, but it did not help:
samples='SCID_WT_CCA WT SCID plus_1 minus_1'

rule all:
    input:
        expand('data_files/umap/{sample}_umap.csv', sample=samples.split(' '))

rule cluster:
    input:
        script = 'python/dbscan.py',
        path = 'data_files/umap/{sample}_umap.csv'
    output:
        path = 'output/{sample}'
    shell:
        "python {input.script} -data {input.path} -eps '0.3' -min_samples '10' -path {output.path}"
The "all" rule should have as input the list of files you want the other rule(s) to generate as output. Here, you seem to be using the list of your starting files instead.
Try the following:
samples = 'SCID_WT_CCA WT SCID plus_1 minus_1'

rule all:
    input:
        expand('output/{sample}', sample=samples.split(' '))

rule cluster:
    input:
        script = 'python/dbscan.py',
        path = 'data_files/umap/{sample}_umap.csv'
    output:
        path = 'output/{sample}'
    shell:
        "python {input.script} -data {input.path} -eps '0.3' -min_samples '10' -path {output.path}"