
Run multiple lines at the same time in bash


So I'm working on a script on a cluster and I need to run this:

convert_fastaqual_fastq.py -f ITS_C1-5_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C3-2_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C3-3_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C3-5_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C4-5_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C5-1_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C5-4_rRNA.fq -c fastq_q_to_fastaqual
convert_fastaqual_fastq.py -f ITS_C5-5_rRNA.fq -c fastq_q_to_fastaqual

As you can see, each line is different, and each one takes about 2 days to run. The command converts a sample into two different formats, but it processes one sample at a time. What I want is for all the samples to run simultaneously when I run the script.

An unsightly solution would be to generate a file for each sample and run each one on a separate CPU in the cluster. I would rather have a single job that runs all of the samples in parallel.

Thanks!


Solution

  • You can do something like this to start each job in the background and then wait for all of them. The pattern ITS_C?-?_rRNA.fq will match each of the files that you've specified, and each job appends its output and errors to its own job"$i".out and job"$i".err files.

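    i=0
    for file in ITS_C?-?_rRNA.fq; do
        # Start each conversion in the background (&) with its own log files.
        convert_fastaqual_fastq.py \
            -f "$file" -c fastq_q_to_fastaqual \
            1>> job"$i".out \
            2>> job"$i".err &
        ((i++))
    done
    
    # Block until every background job has finished.
    wait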
    

    If you want more information about which jobs failed, you can do something like this instead of the plain wait. Note that it reports only the pid, so to tell which file failed you need to keep track of which pid is associated with which file.

    for job in $(jobs -p); do
        if wait "$job"; then
            printf "job %s succeeded\n" "$job"
        else
            printf "job %s failed\n" "$job"
        fi
    done
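
One way to keep track of which pid belongs to which file is an associative array filled in from $! as each job starts. The following is only a sketch under some assumptions: it needs bash 4 or newer for associative arrays, the helper names run_parallel and convert_one are made up for illustration, and the per-file log names ("$file".out and "$file".err) are a choice made here rather than something from the loop above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run_parallel CMD FILE...
# Starts "CMD FILE" in the background for every FILE, then waits for each
# job and reports success or failure by file name instead of by pid.
run_parallel() {
    local cmd=$1
    shift
    local -A file_of=()     # pid -> input file (associative array, bash 4+)
    local file pid failed=0

    for file in "$@"; do
        "$cmd" "$file" > "$file.out" 2> "$file.err" &
        file_of[$!]=$file   # $! is the pid of the job just started
    done

    for pid in "${!file_of[@]}"; do
        if wait "$pid"; then
            printf '%s succeeded\n' "${file_of[$pid]}"
        else
            printf '%s failed\n' "${file_of[$pid]}"
            failed=1
        fi
    done

    return "$failed"        # non-zero if at least one job failed
}

# Hypothetical wrapper around the conversion command from the question.
convert_one() {
    convert_fastaqual_fastq.py -f "$1" -c fastq_q_to_fastaqual
}
```

With these two functions defined, the whole batch could be started with run_parallel convert_one ITS_C?-?_rRNA.fq, and the return status tells the calling script whether any conversion failed.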