
Some tips to improve a bash script for counting reads in fastq files


Hi guys, I have this bash one-liner that I would like to turn into a script:

for i in `ls *.fastq.gz`; do echo $(zcat ${i} | wc -l)/4|bc; done

I would like to turn it into a script that reads from a data directory and prints each result next to the name of the file.

I tried putting the directory in front of the pattern, as in 'data/*.fastq.gz', but got an error: No such dir exist...

I would like output like this:

name1.fastq.gz 1898516
name2.fastq.gz 2467421
namen.fastq.gz 1234532

I am not experienced in bash.

Could you guys give me a hand?
Thanks


Solution

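  • There is no need to shell out to bc for integer math (dividing by 4); bash arithmetic expansion, $((...)), handles it. There is also no need to use 'ls' to enumerate the files, since the glob "$dir"/*.fastq.gz expands to the matching paths directly. The original version will do with minor changes: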

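    #!/bin/bash
    
    # Directory to scan: first argument, or the current directory if none is given.
    dir="${1-.}"
    
    for i in "$dir"/*.fastq.gz; do
      # Count the lines in the decompressed file.
      lines=$(zcat "${i}" | wc -l)
      # Print the path and the line count divided by 4 (integer division).
      printf '%s %d\n' "$i" "$((lines/4))"
    done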
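
If you want only the file name in the output (as in the question) and no error when the directory has no matching files, a variant along these lines might work. It is a minimal sketch, not part of the answer above: the function name count_fastq_reads is made up for illustration, and gzip -dc is swapped in for zcat because zcat does not handle .gz files on every system. Like the original, it assumes every FASTQ record takes exactly four lines (no wrapped sequence lines).

```shell
#!/bin/bash

# count_fastq_reads is a hypothetical helper name, used here for illustration only.
# Usage: count_fastq_reads [directory]   (defaults to the current directory)
count_fastq_reads() {
  local dir="${1-.}" f lines
  for f in "$dir"/*.fastq.gz; do
    # When nothing matches, the glob stays a literal string; skip it.
    [ -e "$f" ] || continue
    # gzip -dc writes the decompressed data to stdout, like zcat.
    lines=$(gzip -dc "$f" | wc -l)
    # ${f##*/} strips the directory part so only the file name is printed.
    # Assumes four lines per record.
    printf '%s %d\n' "${f##*/}" "$((lines / 4))"
  done
}
```

After sourcing the file, count_fastq_reads data prints one "name count" line per .fastq.gz file in the data directory, and prints nothing if there are none.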