linux shell unix gzip bzip2

How to use awk for a compressed file


How can I change the following command for a compressed file?

awk 'FNR==NR { array[$1,$2]=$8; next } ($1,$2) in array { print $0 ";" array[$1,$2] }' input1.vcf input2.vcf

The command works fine with uncompressed files. I need to change it so that it works on compressed files.


Solution

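  • You need to decompress the files on the fly and hand the output to awk, for example with process substitution (available in bash, zsh and ksh, not in plain sh):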

    awk '{ ... }' <(gzip -dc input1.vcf.gz) <(gzip -dc input2.vcf.gz)
    

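    Applied to your command, with the result compressed again (the added sub() call strips AA=.; from each line of the first file before column 8 is stored):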

    awk 'FNR==NR { sub(/AA=\.;/,""); array[$1,$2]=$8; next } ($1,$2) in array { print $0 ";" array[$1,$2] }' <(gzip -dc input1.vcf.gz) <(gzip -dc input2.vcf.gz) | gzip > output.vcf.gz
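
To check the approach before running it on real data, here is a small self-contained sketch. It uses the awk program from the question unchanged. The two sample records and the temporary file names are made up for illustration only, and bash is assumed because of the process substitution.

```shell
#!/usr/bin/env bash
# Sketch only: the sample data below is hypothetical, not real VCF content.
set -euo pipefail

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Build two tiny gzip-compressed, tab-separated inputs with 8 columns each.
printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=10\n' | gzip > "$tmp/input1.vcf.gz"
printf 'chr1\t100\t.\tA\tG\t60\tPASS\tDP=20\nchr2\t200\t.\tC\tT\t70\tPASS\tDP=30\n' \
    | gzip > "$tmp/input2.vcf.gz"

# Each <(gzip -dc ...) appears to awk as a separate file, so FNR==NR still
# identifies the first input. The output is compressed again on the way out.
awk 'FNR==NR { array[$1,$2]=$8; next } ($1,$2) in array { print $0 ";" array[$1,$2] }' \
    <(gzip -dc "$tmp/input1.vcf.gz") <(gzip -dc "$tmp/input2.vcf.gz") \
    | gzip > "$tmp/output.vcf.gz"

# Decompress the result to inspect it.
result=$(gzip -dc "$tmp/output.vcf.gz")
printf '%s\n' "$result"
```

Only the chr1/100 record from the second file has a matching key in the first file, so the output is that single line with ;DP=10 appended. The same pattern should work for bzip2 files by swapping gzip -dc for bzip2 -dc.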