
awk to compare multiple columns in 2 files


I would like to compare multiple columns from 2 files and NOT print lines matching my criteria. An example of this would be:

file1

apple  green  4
orange  red  5
apple  yellow 6
apple  yellow 8
grape  green 5

file2

apple  yellow 7
grape  green 10

output

apple  green  4
orange  red  5
apple  yellow 8

I want to remove lines where $1 and $2 from file1 match $1 and $2 from file2 AND $3 from file1 is smaller than $3 from file2. So far I can only do the first part, that is, remove lines where $1 and $2 from file1 match $1 and $2 from file2 (fields are separated by tabs):

awk -F '\t' 'FNR == NR {a[$1FS$2]=$1; next} !($1FS$2 in a)' file2 file1

Could you help me apply the last condition?

Many thanks in advance!


Solution

  • What you are after is this:

    awk 'NR==FNR {a[$1,$2]=$3; next} !(($1,$2) in a && $3+0 < a[$1,$2]+0)' file2 file1
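
To check the idea against the sample data from the question, here is a minimal sketch. It assumes tab-separated files (as stated in the question), so it passes -F '\t'; drop that option if your fields are separated by arbitrary whitespace. The scratch directory `workdir`, the array name `limit`, and the variable `result` are only illustrative names, not part of the original answer.

```shell
# Recreate the sample files from the question in a scratch directory (tab-separated).
workdir=$(mktemp -d)
printf 'apple\tgreen\t4\norange\tred\t5\napple\tyellow\t6\napple\tyellow\t8\ngrape\tgreen\t5\n' > "$workdir/file1"
printf 'apple\tyellow\t7\ngrape\tgreen\t10\n' > "$workdir/file2"

# Pass 1 (file2): remember the $3 threshold for each ($1,$2) pair.
# Pass 2 (file1): print a line unless its pair is known AND its $3 is smaller.
# The "+0" forces a numeric comparison, so 5 and 10 are not compared as strings.
result=$(awk -F '\t' '
    NR == FNR { limit[$1, $2] = $3; next }
    !(($1, $2) in limit && $3 + 0 < limit[$1, $2] + 0)
' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$result"
rm -r "$workdir"
```

With the sample input this should print the three lines listed under "output" in the question. Note that file2 is read first so that its thresholds are in memory before file1 is filtered, and that a pair appearing more than once in file2 would keep only its last $3.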