Tags: linux, bash, gzip, bzip2, fastq

How to merge zcat and bzcat in a single function


I would like to build a little helper function that can deal with fastq.gz and fastq.bz2 files.

I want to merge zcat and bzcat into one transparent function which can be used on both sorts of files:

zbzcat example.fastq.gz
zbzcat example.fastq.bz2


zbzcat() {
  file=`echo $1 | `
## Not working
  ext=${file##*/};
  
  if [ ext == "fastq.gz" ]; then
    exec gzip -cd "$@"  
  else
    exec bzip -cd "$@"  
  fi
}

The extension extraction is not working correctly. Are you aware of other solutions?


Solution

  • There are quite a few problems:

    • file=`echo $1 | ` gives a syntax error because there is no command after |. But you don't need the command substitution anyways. Just use file=$1.
    • ext=${file##*/} is not extracting the extension, but the filename. To extract the extension use ext=${file##*.}.
    • In your check you didn't use the variable $ext but the literal string ext.
    • Usually, only the string after the last dot in a filename is considered to be the extension. If you have file.fastq.gz, then the extension is gz. So use the check $ext = gz. That the uncompressed files are fastq files is irrelevant to the function anyways.
    • exec replaces the shell process with the given command. So after executing your function, the shell would exit. Just execute the command.
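
    With the fixes listed above applied, the function from the question could look roughly like the sketch below. It assumes that anything not ending in .gz is bzip2-compressed, as the original else branch does, and it calls bzip2, which is the actual name of the command the question wrote as bzip.

```shell
# Sketch: the function from the question with the fixes listed above applied.
zbzcat() {
  file=$1            # plain assignment, no command substitution needed
  ext=${file##*.}    # text after the last dot, e.g. "gz" or "bz2"

  if [ "$ext" = "gz" ]; then
    gzip -cd "$@"    # no exec, so the calling shell keeps running
  else
    bzip2 -cd "$@"   # assumption: every other file is bzip2-compressed
  fi
}
```

    Usage is the same as in the question, for example zbzcat example.fastq.gz | head -n 4.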

    By the way: you don't have to extract the extension at all when using pattern matching:

    zbzcat() {
      file="$1"
      case "$file" in
        *.gz) gzip -cd "$@";;
        *.bz2) bzip2 -cd "$@";;
        *) echo "Unknown file format" >&2; return 1;;
      esac
    }
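
    Note that the case version picks the decompressor from the first argument only and then passes every argument to it. If the function should accept a mix of .gz and .bz2 files in one call, one possible variant is to dispatch per file. This is a sketch, not part of the original answer; the loop variable f and the wording of the error message are illustrative choices.

```shell
# Sketch: decide per file, so .gz and .bz2 arguments can be mixed in one call.
zbzcat() {
  for f in "$@"; do
    case "$f" in
      *.gz)  gzip -cd "$f" ;;
      *.bz2) bzip2 -cd "$f" ;;
      *)     echo "zbzcat: unknown file format: $f" >&2
             return 1 ;;   # stop at the first unrecognized file
    esac
  done
}
```

    Called as zbzcat a.fastq.gz b.fastq.bz2, this writes the decompressed contents of both files to stdout in argument order.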
    

    Alternatively, use 7z x, which supports a lot of formats. By default it extracts to disk; adding the -so switch should make it write to stdout like zcat. Most distributions name the package p7zip.