I'm new to snakemake and to using clusters, so I would appreciate any help!
I have a snakefile that works fine on a server, but when I try to run it on the cluster I have not found the proper commands to submit a job and have it execute. It "stalls" like other users have found. https://groups.google.com/forum/#!searchin/snakemake/cluster|sort:relevance/snakemake/dFxRIgKDxUU/od9az3MuBAAJ
I am running it on an SGE cluster where there is only one node (the head node) that we submit jobs through. We can't run jobs interactively or run intensive commands on the head node. Usually I would run a bwa command like so:
qsub -V -b y 'bwa mem -t 20 /reference/hg38.fa in/R_1.fastq in/R_2.fastq |samtools view -S -bh -@ 7 > aln_R.bam'
So I followed the FAQ entry about submitting jobs on the cluster via the head node, which suggests this code:
qsub -N PIPE -cwd -j yes python snakemake --cluster "ssh user@headnode_address 'qsub -N pipe_task -j yes -cwd -S /bin/sh ' " -j
This did not work for me because my terminal expected python to be a file. To actually invoke the program's command, I had to use this:
qsub -V -N test -cwd -j y -b y snakemake --cluster "qsub " -j 1
The -b y flag allows the command to be either a binary or a script. If I run this, qstat shows the job running, but there is an internal error and it never finishes.
Also, the contents inside "qsub " are treated like snakemake arguments. When I try to use SGE flags such as -j y, I get errors from snakemake along the lines of this:
qsub -V -N test -cwd -j y -b y snakemake --cluster "qsub -j y" -j 1
snakemake: error: argument --cores/--jobs/-j: invalid int value: 'y'
I can submit the shell scripts that snakemake generates in the tmp directory by hand perfectly fine, as long as I leave out the -b y flag and add the -S /bin/bash flag. So the scripts themselves work, but I think the way they are being pushed to the cluster from the head node is not working somehow. I could be totally off target as well! I would love any direction on how to talk to my sys-admins about SGE, because I don't really know what to ask them about my problem.
In conclusion: Has anyone else come across the need to invoke -b y for snakemake --cluster to run on SGE? And has it also treated "qsub" as a snakemake command? Or does anyone have another workaround for submitting jobs on the head node for SGE? What questions should I ask my SGE sys-admins?
When you say you can't use nodes interactively, are you sure your cluster admins have banned the use of qrsh and qlogin as well as ssh? Those two commands submit jobs to the cluster that give you an interactive shell but remain under the control of SGE.
My suspicion is that you are running into an issue with double parsing of the command line: once on job submission and once when SGE is trying to start your command. Rather than trying to submit the whole thing as a command line, write your snakemake command in a shell file and submit that (without -b y):
#!/bin/sh
#$ -S /bin/sh
exec snakemake -j 1 --cluster "qsub -j y"
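To make the double-parsing suspicion concrete, here is a small sketch in plain sh. It does not reproduce what SGE actually does internally (that part is still a guess); it only shows how a quoted argument falls apart when a command line is flattened to a string and parsed a second time. The helper names count_args and reparse are made up for the demonstration.

```shell
#!/bin/sh
# Print how many arguments a command would receive.
count_args() { echo "$#"; }

# Hypothetical stand-in for a second parsing step: join all arguments
# into one string, then let the shell split that string again.
reparse() { eval "count_args $*"; }

# Parsed once: "qsub -j y" stays a single argument to --cluster.
once=$(count_args snakemake --cluster "qsub -j y" -j 1)

# Parsed twice: the quotes are gone by the second pass, so -j y
# now looks like snakemake's own -j option with the value y.
twice=$(reparse snakemake --cluster "qsub -j y" -j 1)

# An extra layer of quoting survives the second pass.
escaped=$(reparse snakemake --cluster "'qsub -j y'" -j 1)

echo "parsed once:        $once arguments"
echo "parsed twice:       $twice arguments"
echo "with extra quoting: $escaped arguments"
```

If something like the second case is happening, snakemake would see -j y as its own option, which would match the "invalid int value: 'y'" error in the question.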
Alternatively, create a wrapper script that embeds the options you want snakemake to use when invoking qsub for subordinate jobs:
#!/bin/sh
exec qsub -j y "$@"
Then tell snakemake to use that:
qsub -V -N test -cwd -j y -b y snakemake -j 1 --cluster "wrapper"
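Before pointing snakemake at the wrapper, it may be worth checking that it forwards its arguments intact. The sketch below does that with a stub qsub in a temporary directory, so it can run anywhere, including off the cluster. The stub and the job script path are assumptions for the demonstration, not anything snakemake or SGE provides.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)

# Stub qsub (demonstration only): print each argument on its own line.
cat > "$tmp/qsub" <<'EOF'
#!/bin/sh
for arg in "$@"; do echo "[$arg]"; done
EOF

# The wrapper from above: fixed SGE options first, then whatever the caller passes.
cat > "$tmp/wrapper" <<'EOF'
#!/bin/sh
exec qsub -j y "$@"
EOF

chmod +x "$tmp/qsub" "$tmp/wrapper"

# Call the wrapper the way snakemake would, with a job script path
# appended (this path is invented and contains a space on purpose).
out=$(PATH="$tmp:$PATH" "$tmp/wrapper" "/tmp/job script.sh")
echo "$out"

rm -rf "$tmp"
```

Note that the real wrapper has to be executable (chmod +x) and findable from wherever snakemake runs, so either put it on your PATH or give its full path in --cluster.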
Alternatively, play around with the command lines you have, adding extra layers of escaping and quoting until it works.