Finding the rows sharing information


I have a file structured like this:

file1.txt:

1 10 20 A
1 10 20 B
1 10 20 E
1 10 20 F
1 12 22 C
1 13 23 X
2 33 45 D
2 48 49 D
2 48 49 E

I am trying to find out which letters share the same values in the 1st, 2nd, and 3rd columns. For example, the output should be:

A
B
E
F
D
E

So far I can only count the distinct combinations of the first three columns:

cut -f1,2,3 file1.txt | sort | uniq | wc -l 
5

which tells me nothing about the 4th column.

How do I get the letters in the fourth column for rows that share the first three columns?


Solution

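  • The following awk command may help. It reads the file twice: the first pass counts each combination of the first three columns, and the second pass prints the rows whose combination occurs more than once.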

     awk 'FNR==NR{a[$1,$2,$3]++;next}  a[$1,$2,$3]>1' Input_file  Input_file
    

    The output is:

    1 10 20 A
    1 10 20 B
    1 10 20 E
    1 10 20 F
    2 48 49 D
    2 48 49 E
    

    To print only the last field's value, change a[$1,$2,$3]>1 to a[$1,$2,$3]>1{print $NF} in the command above.
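
Put together, the complete command for the question's sample might look like the sketch below. It assumes the data is whitespace-separated and saved as file1.txt, as in the question; the result variable is only there for illustration.

```shell
# Recreate the sample data from the question (assumed whitespace-separated).
cat > file1.txt <<'EOF'
1 10 20 A
1 10 20 B
1 10 20 E
1 10 20 F
1 12 22 C
1 13 23 X
2 33 45 D
2 48 49 D
2 48 49 E
EOF

# Pass 1 (FNR==NR): count how often each (col1, col2, col3) key occurs.
# Pass 2: print only the last field of rows whose key occurred more than once.
result=$(awk 'FNR==NR{a[$1,$2,$3]++; next} a[$1,$2,$3]>1{print $NF}' file1.txt file1.txt)
printf '%s\n' "$result"
```

The file name is given twice on purpose: awk has to finish counting every key before it can decide which rows to print.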