Update lines of a file based on matches from another file


I am trying to use the 2nd column of a reference file (ref) as the match criterion: for every line of another file (main) whose 2nd column also appears in ref, I want to prepend a + to the first column. The updated file should be written to out. Below is an example. My shell script is not working. I appreciate your help.

ref

PSHELL       216  136738 
PSHELL       217  136738 
PSHELL      1786   13571 
PSHELL      1605  136513

main

PSHELL       216  136738 
PSHELL       218  136738 
PSHELL      1786   13571 
PSHELL      1610  136513
PSHELL      1612  136513

out

+PSHELL       216  136738 
PSHELL       218  136738 
+PSHELL      1786   13571 
PSHELL      1610  136513
PSHELL      1612  136513

my code

awk '
  FNR == NR {                         # if record from 1st file
    item[++n] = $2                    # store in indexed item array
    next                              # skip to next record
  }
  { main[++m] = $0 }                  # for 2nd file, store record in main
  END {                               # after all records processed
    for (i=1; i<=n; i++)              # loop over all items
      for(j=1; j<=m; j++) {           # loop over each record in main
        line = main[j]                # save record in line
        if (j == 2)                   # if 2nd record in main
          [$2]=+[$2]
        print line >> item[i]         # append line to item file 
      }
  }
' ref main > out

Error line 11: Syntax error


Solution

  • [$2]=+[$2] is not valid awk syntax. There is also no reason to store every line of the second file (main) and wait for the END block to print: you can print each line on the fly while parsing main. For this kind of lookup you should probably use an associative array keyed on the 2nd column of ref instead of an indexed array, and the in operator to test whether a value is a key of item:

    awk 'FNR==NR {item[$2]; next} $2 in item {printf("+")} 1' ref main > out
    

    Variant: you can also conditionally modify $0:

    awk 'FNR==NR {item[$2]; next} {$0=($2 in item?"+":"")$0; print}' ref main > out
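To check the first one-liner against the sample data, here is a minimal end-to-end sketch. It is the same program expanded over several lines with comments. The scratch directory from mktemp is an assumption added for illustration, so that existing files named ref, main or out are not overwritten, and the trailing spaces from the sample lines are dropped.

```shell
# Work in a scratch directory (illustrative assumption, not part of the question).
cd "$(mktemp -d)" || exit 1

# Recreate the sample files from the question.
cat > ref <<'EOF'
PSHELL       216  136738
PSHELL       217  136738
PSHELL      1786   13571
PSHELL      1605  136513
EOF

cat > main <<'EOF'
PSHELL       216  136738
PSHELL       218  136738
PSHELL      1786   13571
PSHELL      1610  136513
PSHELL      1612  136513
EOF

awk '
  FNR == NR {        # true only while reading the first file (ref)
    item[$2]         # referencing item[$2] creates the key; no value is needed
    next             # skip the remaining rules for ref lines
  }
  $2 in item {       # main line whose 2nd column is a key taken from ref
    printf "+"       # emit the prefix without a trailing newline
  }
  1                  # always-true pattern: print the current line unchanged
' ref main > out

cat out
```

With the sample data, only the lines for 216 and 1786 get the + prefix, which matches the expected out file in the question. One caveat: the FNR == NR test assumes ref is not empty. If ref has no lines, the condition stays true while main is read and nothing is printed.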