awk, grep, whitelist

In a file of two-word lines, grep only those lines in which both words are on a whitelist


I have a file1:

green
yellow
apple
mango

and a file2:

red apple
blue banana
yellow mango
purple cabbage

I need to find elements from file2 where both words belong to the list in file1. So it should show:

yellow mango

I tried:

awk < file2 '{if [grep -q $1 file1] && [grep -q $2 file1]; then print $0; fi}'

I am getting a syntax error.


Solution

  • This will do the trick:

    $ awk 'NR==FNR{a[$0];next}($1 in a)&&($2 in a)' file1 file2 
    yellow mango
    

    Explanation:

    NR is a special awk variable that tracks the current line number across all input, and FNR tracks the current line number within each individual file, so the condition NR==FNR is only true while we are reading the first file. a is an associative array whose keys are the unique lines of the first file. $0 is the value of the current line in awk. The next statement jumps straight to the next line of input, so the rest of the script is not executed for lines of the first file. The final part is straightforward: if the first field $1 is in the array a and the second field $2 is in it as well, print the current line. The default block in awk is {print $0}, so the print is implicit.
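
For readers who find the one-liner dense, here is the same logic written out step by step. This is only a sketch: the names allowed, tmpdir and result are made up for illustration, and the sample files are recreated in a scratch directory so the snippet can be run on its own.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' green yellow apple mango > "$tmpdir/file1"
printf '%s\n' 'red apple' 'blue banana' 'yellow mango' 'purple cabbage' > "$tmpdir/file2"

# Same logic as the one-liner, with each step spelled out.
result=$(awk '
    # First file only: NR (global count) equals FNR (per-file count).
    NR == FNR {
        allowed[$0]   # referencing the element creates the key; no value is needed
        next          # skip the rule below for whitelist lines
    }
    # Second file: print lines whose first two fields are both whitelisted.
    ($1 in allowed) && ($2 in allowed) { print $0 }
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints yellow mango. Each whitelist entry is compared against a whole field, so a whitelist line such as "app" would not match "apple".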