Searching a file for strings in the first field, based on input from another file, and writing the result to a new file

I have an input file, File1.txt, like the one below:

 Model related text
 Model specifications
 $#   eid     pid   n1   n2   n3   n4   n5   n6      n7    n8
 76737    1    79322  79323   79324   79511     0       0       0       0
 76738    1   79510   79203   79204   79512     0       0       0       0
 76739    1   79511   79324   79325   79513     0       0       0       0
 76740    1   79512   79204   79205   79514     0       0       0       0
 76741    1   79514   79205   79206   79515     0       0       0       0
 76742    1   79515   79206   79207   79516     0       0       0       0
 76743    1   79516   79207   79208   79517     0       0       0       0
 76744    1   79517   79208   79209   79518     0       0       0       0
 76745    1   79518   79209   79210   79519     0       0       0       0
 76746    1   79519   79210   79211   79520     0       0       0       0

In another file, File2.txt, I have only numbers.


I have to compare each number from File2.txt with the number in the first field of every line of File1.txt. If they match, the complete line from File1.txt should be written to model.txt. The output would be:

 Model related text
 Model specifications
 $#   eid     pid   n1   n2   n3   n4   n5   n6      n7    n8
 76737    1    79322  79323   79324   79511     0       0       0       0
 76738    1   79510   79203   79204   79512     0       0       0       0
 76739    1   79511   79324   79325   79513     0       0       0       0
 76740    1   79512   79204   79205   79514     0       0       0       0
 76741    1   79514   79205   79206   79515     0       0       0       0

Can anybody suggest a solution using awk, sed, etc.?


  • This can be very easily done using awk

    awk 'FNR==NR{ value[$1]; next} $1 in value || FNR <= 3' file2 file1

    $ awk 'FNR==NR{ value[$1]; next} $1 in value || FNR <= 3' file2 file1
    Model related text
    Model specifications
    $#   eid     pid   n1   n2   n3   n4   n5   n6      n7    n8
    76737    1    79322  79323   79324   79511     0       0       0       0
    76738    1   79510   79203   79204   79512     0       0       0       0
    76739    1   79511   79324   79325   79513     0       0       0       0
    76740    1   79512   79204   79205   79514     0       0       0       0
    76741    1   79514   79205   79206   79515     0       0       0       0

    If you do not need the leading header lines in the output, the script can be simplified further:

    $ awk 'FNR==NR{ value[$1]; next} $1 in value' file2 file1
    76737    1    79322  79323   79324   79511     0       0       0       0
    76738    1   79510   79203   79204   79512     0       0       0       0
    76739    1   79511   79324   79325   79513     0       0       0       0
    76740    1   79512   79204   79205   79514     0       0       0       0
    76741    1   79514   79205   79206   79515     0       0       0       0

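    What does it do?

    • FNR==NR compares the record number within the current file (FNR) with the total number of records read so far (NR). The two are equal only while the first file on the command line, here file2, is being read.

    • value[$1]; next references value[$1], which creates that key in the associative array value, so every number in the first field of file2 becomes a key. next then skips the remaining rules and moves on to the next input line, so lines of file2 are never printed.

    • $1 in value is therefore evaluated only for file1. It is true when the first field of the line is a key in the array. The rule has no action block, so awk applies its default action and prints the line.

    • The extra FNR test in the first command also prints the leading header lines of file1, whatever their first field is.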
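
As a rough end-to-end illustration of the steps above, the sketch below builds two small sample files and redirects the matches to model.txt, as the question asks. The file contents are shortened, hypothetical stand-ins for the real data (fewer columns, and the two ids in file2 are made up for the demo); only the awk program itself comes from the answer.

```shell
#!/bin/sh
# Hypothetical sample data: a shortened stand-in for File1.txt.
# The heredoc delimiter is quoted so that "$#" is kept literally.
cat > file1 <<'EOF'
 Model related text
 Model specifications
 $#   eid     pid   n1
 76737    1    79322
 76738    1    79510
 76739    1    79511
EOF

# Hypothetical stand-in for File2.txt: one id per line.
printf '%s\n' 76737 76739 > file2

# Pass 1 (file2): store each id as a key of the array "value".
# Pass 2 (file1): print a line when its first field is one of those keys.
awk 'FNR==NR { value[$1]; next } $1 in value' file2 file1 > model.txt

cat model.txt
```

With this sample data, model.txt should contain only the rows for 76737 and 76739. The header lines are dropped because their first fields (Model, $#) are not keys in the array.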


    Print only the first occurrence

    You can use delete to remove the entry from the associative array once the line has been printed. This ensures that the line is not printed again if the same value occurs a second time in file1.

    awk 'FNR==NR{ value[$1]; next} $1 in value{ print; delete value[$1] }' file2 file1
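
To see the effect of delete, here is a small sketch in which the same id appears twice in the data file. The file names dup_file1, dup_file2 and first_only.txt, and the row contents, are invented for this demo and are not part of the original question.

```shell
#!/bin/sh
# Hypothetical data: id 76737 occurs on the first and third lines.
printf '%s\n' '76737 1 first' '76738 1 other' '76737 1 second' > dup_file1

# Hypothetical id list containing just the duplicated id.
printf '%s\n' 76737 > dup_file2

# After a matching line is printed, its key is deleted from the array,
# so a later line with the same first field no longer matches.
awk 'FNR==NR { value[$1]; next } $1 in value { print; delete value[$1] }' \
    dup_file2 dup_file1 > first_only.txt

cat first_only.txt
```

Only the line "76737 1 first" should be printed. Without the delete, the third line would be printed as well.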