
Compare fields with exactly the same name


I would like to find the lines whose first field is identical in two files. I am familiar with this awk command:

awk 'FNR==NR{a[$1]++;next}a[$1]' file1 file2

But this does not work if the first field is a multi-word expression: lines are matched as soon as they share one common element. For example, my file1 is:

blue and red    20.5
red and green   13.4
yellow and black    10
blue and black  17.2
black and green 21

And my file2 is:

blue and yellow 18
red and green   11.9
yellow and orange   8
brown and black 6.9
organge and yellow  17

The above command will produce the following list:

blue and red    20.5
red and green   13.4
yellow and black    10
brown and black 6.9

But I would like to get exact matches only:

red and green   13.4

Solution

  • If your data are tab-separated, you have to tell awk so. By default awk splits fields on any run of whitespace, so $1 is only the first word of each line (blue, red, ...) and not the whole first column.

    Try this:

    awk -F'\t' 'FNR==NR{a[$1]++;next}a[$1]' file1 file2
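To check the command against the sample data, here is a minimal sketch. It assumes the two columns really are separated by a single tab, which the question does not confirm. The temporary directory, the variable names, and the space-separated fallback at the end are illustrative additions, not part of the original answer.

```shell
# Recreate the sample files, ASSUMING name<TAB>value on every line.
tmpdir=$(mktemp -d)
printf 'blue and red\t20.5\nred and green\t13.4\nyellow and black\t10\nblue and black\t17.2\nblack and green\t21\n' > "$tmpdir/file1"
printf 'blue and yellow\t18\nred and green\t11.9\nyellow and orange\t8\nbrown and black\t6.9\norgange and yellow\t17\n' > "$tmpdir/file2"

# Pass 1 (first file): remember every complete first field.
# Pass 2 (second file): print the lines whose first field was remembered.
from_file2=$(awk -F'\t' 'FNR==NR{a[$1]++;next}a[$1]' "$tmpdir/file1" "$tmpdir/file2")

# Same command with the file order swapped: prints the file1 line instead.
from_file1=$(awk -F'\t' 'FNR==NR{a[$1]++;next}a[$1]' "$tmpdir/file2" "$tmpdir/file1")

# Hypothetical fallback if the separator turns out to be spaces, not tabs:
# use everything before the last whitespace-delimited column as the key.
fallback=$(awk '{key=$0; sub(/[ \t]+[^ \t]+$/, "", key)}
                FNR==NR{a[key]++;next} key in a' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$from_file2" "$from_file1" "$fallback"
rm -rf "$tmpdir"
```

Note that the command prints lines from the second file on the command line, so with file1 file2 the match comes out with the value from file2 (11.9). To get the file1 line (13.4) that the question asks for, swap the two file names.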