perl shell grep comm

How to extract common lines across multiple files?


I have 15 different files, and I want to create a new file that contains only the lines common to all of them. For example:

file1:

id1
id2
id3

file2:

id2
id3
id4

file3:
id10
id2
id3

file4:

id100
id45
id3
id2

I need the output to look like this:

newfile:

id2
id3

I know this command works for each pair of files:

grep -w -f file1 file2 > output

But I need a command that works for more than two files.

Any suggestions, please?


Solution

  • Using grep

    The same trick can be used more than once:

    $ grep -w -f file1 file2 | grep -w -f file3 | grep -w -f file4
    id2
    id3
    

    By the way, if you are looking for exact (fixed-string) matches rather than regular-expression matches, it is better and faster to use the -F flag:

    $ grep -wFf file1 file2 | grep -wFf file3 | grep -wFf file4
    id2
    id3
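
The pipeline above gets long with 15 files, so the same filtering can be written as a loop. The following is only a sketch under stated assumptions: the scratch directory and the newfile.tmp name are made up for the demo, and -x is added (it is not in the commands above) so that only whole-line matches count, which seems to fit a list of IDs.

```shell
# Sample data from the question, written to a scratch directory (assumed layout).
dir=$(mktemp -d)
printf '%s\n' id1 id2 id3        > "$dir/file1"
printf '%s\n' id2 id3 id4        > "$dir/file2"
printf '%s\n' id10 id2 id3       > "$dir/file3"
printf '%s\n' id100 id45 id3 id2 > "$dir/file4"

# Start with the first file, then keep only the lines that also appear
# in each later file (-x: whole-line match, -F: fixed strings, -f: patterns from file).
cp "$dir/file1" "$dir/newfile"
for f in "$dir/file2" "$dir/file3" "$dir/file4"; do
    grep -xFf "$f" "$dir/newfile" > "$dir/newfile.tmp"
    mv "$dir/newfile.tmp" "$dir/newfile"
done

cat "$dir/newfile"
```

On the sample files this prints id2 and id3, in the order they appear in file1. Duplicate lines in the first file would be kept as duplicates.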
    

  • Using awk

    $ awk 'FNR==1{nfiles++; delete fseen} !($0 in fseen){fseen[$0]++; seen[$0]++} END{for (key in seen) if (seen[key]==nfiles) print key}' file1 file2 file3 file4
    id3
    id2
    
    • FNR==1{nfiles++; delete fseen}

      Every time we start reading a new file, we do two things: (1) increment the file counter, nfiles, and (2) delete the array fseen, which tracks the lines already seen in the current file.

    • !($0 in fseen){fseen[$0]++; seen[$0]++}

      If the current line is not yet a key in fseen, we add it to fseen and increment the count for this line in seen. This way each line is counted at most once per file, even if it is repeated within that file.

    • END{for (key in seen) if (seen[key]==nfiles) print key}

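      After we have read the last line of the last file, we look at every key in seen. If the count for that key is equal to the number of files that we have read, nfiles, then we print that key. Because awk visits array keys in an unspecified order, the output order may differ from the input order, as in the sample output above.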
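
The three parts explained above can also be laid out as a commented, multi-line script. This is a minimal sketch, not a definitive version: the function name common_lines and the scratch directory are hypothetical, and the output is piped through sort only to make the order predictable.

```shell
# Sample data from the question, written to a scratch directory (assumed layout).
dir=$(mktemp -d)
printf '%s\n' id1 id2 id3        > "$dir/file1"
printf '%s\n' id2 id3 id4        > "$dir/file2"
printf '%s\n' id10 id2 id3       > "$dir/file3"
printf '%s\n' id100 id45 id3 id2 > "$dir/file4"

# common_lines is a hypothetical helper name; it accepts any number of files.
common_lines() {
    awk '
        # First line of each file: count the file and reset the per-file set.
        FNR == 1 { nfiles++; delete fseen }
        # First time this line shows up in the current file: record and count it.
        !($0 in fseen) { fseen[$0]; seen[$0]++ }
        # Print every line that was counted once for each file read.
        END { for (key in seen) if (seen[key] == nfiles) print key }
    ' "$@" | sort
}

result=$(common_lines "$dir/file1" "$dir/file2" "$dir/file3" "$dir/file4")
printf '%s\n' "$result"
```

With 15 files, a glob such as common_lines file* > newfile would pass them all at once, assuming the names share a prefix. One caveat: an empty file has no first line, so FNR==1 never fires for it and it is not counted in nfiles.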