Search code examples
bashfor-loopbioinformaticsblast

for loop is not working within my script


I made a script in bash that run blastx.

#!/bin/sh -l 
# $Id: Barrine.sh federicogaiti $

echo "Right now it is:"
date
echo ""
function usage() {
echo "Barrine.sh script by Federico Gaiti, March 2013."
echo ""
echo "Run Blastx"
echo ""
echo "Usage: "
echo "Barrine.sh <All contigs (fasta extension)> <Number of fasta split sequences> <Head file with PBS command where you set up account, walltime, etc..> "
echo ""
echo "Example: "
echo "Barrine.sh InputFasta n Head "
echo ""
exit 1
}

# Testing if the number of arguments is correct
if [ $# != 3 ]
then
    usage
        exit
    fi

### Declaring variables
InputFasta=$1    MY ORIGINAL FASTA FILE WITH ALL THE SEQUENCES
n=$2                 NUMBER OF FILES I WANT MY FASTA FILE SPLIT
Head=$3           PBS OPTIONS
n4=$((n/4))
n41=$((n/4 + 1))
n2=$((n/2))
n21=$((n/2 + 1))
n43=$((n/4 * 3))
n431=$((n/4*3 + 1))


echo Loading Modules
module load ucsc_utilities/20130122
wait


echo Split input FASTA in n FASTA file
faSplit sequence ${InputFasta} ${n} ${InputFasta}_Split_S 
wait



echo blastx command splitting it in 4 jobs to make it faster
    for i in {000..0$n4} ; do echo "blastx -query ${InputFasta}_Split_S$i.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out ${InputFasta}_blastx$i.csv" ; done > ${InputFasta}_Job1.sh


    for i in {0$n41..$n2} ; do echo "blastx -query ${InputFasta}_Split_S$i.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out ${InputFasta}_blastx$i.csv" ; done > ${InputFasta}_Job2.sh


    for i in {$n21..$n43} ; do echo "blastx -query ${InputFasta}_Split_S$i.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out ${InputFasta}_blastx$i.csv" ; done > ${InputFasta}_Job3.sh


    for i in {$n431..$n} ; do echo "blastx -query ${InputFasta}_Split_S$i.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out ${InputFasta}_blastx$i.csv" ; done > ${InputFasta}_Job4.sh

wait


echo Head the PBS commands to the Job files 
    for i in {1..4} 
        do 
        cat ${Head} ${InputFasta}_Job$i.sh | sed "s/BlastJob/BlastJob_$i/g" > ${InputFasta}_BlastJob$i.sh 
        done

I will then submit the 4 Jobs to barrine server. What I am supposed to obtain is 4 jobs containing this:

    blastx -query TEST_Split_S000.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx000.csv
blastx -query TEST_Split_S001.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx001.csv
blastx -query TEST_Split_S002.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx002.csv
blastx -query TEST_Split_S003.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx003.csv
blastx -query TEST_Split_S004.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx004.csv
blastx -query TEST_Split_S005.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx005.csv
blastx -query TEST_Split_S006.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx006.csv
blastx -query TEST_Split_S007.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx007.csv
............
blastx -query TEST_Split_S050.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx050.csv

But what I got in my script is just:

blastx -query TEST_Split_S{000..050}.fa -evalue 0.0001 -max_target_seqs 100 -db nr -num_threads 8 -outfmt '7 qseqid qlen sseqid pident length mismatch gapopen qstart qend sstart send ppos evalue bitscore score' -out TEST_blastx{000..050}.csv

Someone can help me in solving the problem? If I use the for loop outside my script works fine.

Thanks for help


Solution

  • Brace expansion happens early.

    Parameter expansion happens later, so you can't use variables or other parameters in a way that controls brace expansion.

    From the bash(1) man page:

    The order of expansions is: brace expansion, tilde expansion, parameter, variable and arithmetic expansion and command substitution (done in a left-to-right fashion), word splitting, and pathname expansion.

    This will prevent statements like for i in {000..0$n4} ; do from working as intended.

    Also, be sure to remember the -v and -x arguments to display the actual commands executed.