I am trying to use the 2nd column of a reference file, ref, as the match criterion and change the first column of another file, main, on the lines that match. The updated file should be written to out. Below is an example. My shell script is not working. I appreciate your help.
ref
PSHELL 216 136738
PSHELL 217 136738
PSHELL 1786 13571
PSHELL 1605 136513
main
PSHELL 216 136738
PSHELL 218 136738
PSHELL 1786 13571
PSHELL 1610 136513
PSHELL 1612 136513
out
+PSHELL 216 136738
PSHELL 218 136738
+PSHELL 1786 13571
PSHELL 1610 136513
PSHELL 1612 136513
my code
awk '
FNR == NR { # if record from 1st file
item[++n] = $2 # store in indexed item array
next # skip to next record
}
{ main[++m] = $0 } # for 2nd file, store record in main
END { # after all records processed
for (i=1; i<=n; i++) # loop over all items
for(j=1; j<=m; j++) { # loop over each record in main
line = main[j] # save record in line
if (j == 2) # if 2nd record in main
[$2]=+[$2]
print line >> item[i] # append line to item file
}
}
' ref main > out
Error line 11: Syntax error
[$2]=+[$2] is not valid awk syntax. In this case there is also no reason to wait until the second file (main) has been parsed and print the output in an END block: you can print on the fly while parsing main. For this kind of processing you should probably use an associative array instead of an indexed array, and the in operator to test whether a value is a key of item:
awk 'FNR==NR {item[$2]; next} $2 in item {printf("+")} 1' ref main > out
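For readability, the same logic can be spelled out over several lines with comments. This is only a sketch: the sample files are recreated from the question in a scratch directory (the use of mktemp is an assumption made for illustration), so it can be run as-is.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory
# (assumption: mktemp is available; any empty directory would do).
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

cat > ref <<'EOF'
PSHELL 216 136738
PSHELL 217 136738
PSHELL 1786 13571
PSHELL 1605 136513
EOF

cat > main <<'EOF'
PSHELL 216 136738
PSHELL 218 136738
PSHELL 1786 13571
PSHELL 1610 136513
PSHELL 1612 136513
EOF

awk '
FNR == NR {      # true while reading the first file (ref), if it is not empty
    item[$2]     # referencing the element creates the key $2
    next         # do not run the remaining rules on ref lines
}
$2 in item {     # line of main whose 2nd field is a key of item
    printf "+"   # emit the marker without a trailing newline
}
1                # always-true pattern: print the current line
' ref main > out

cat out
```

On the sample data this should reproduce the out file shown in the question, with a + in front of the lines whose second field is 216 or 1786.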
Variant: you can also conditionally modify $0:
awk 'FNR==NR {item[$2]; next} {$0=($2 in item?"+":"")$0; print}' ref main > out
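As a quick sanity check, the two commands can be compared on the sample data from the question. A minimal sketch, assuming a scratch directory from mktemp and the hypothetical output names out1 and out2:

```shell
#!/bin/sh
# Work in a scratch directory (assumption for illustration only).
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Sample data copied from the question.
printf '%s\n' 'PSHELL 216 136738' 'PSHELL 217 136738' \
              'PSHELL 1786 13571' 'PSHELL 1605 136513' > ref
printf '%s\n' 'PSHELL 216 136738' 'PSHELL 218 136738' \
              'PSHELL 1786 13571' 'PSHELL 1610 136513' \
              'PSHELL 1612 136513' > main

# First form: print "+" before matching lines, then print every line.
awk 'FNR==NR {item[$2]; next} $2 in item {printf("+")} 1' ref main > out1

# Variant: prepend "+" to $0 when the 2nd field is a key of item.
awk 'FNR==NR {item[$2]; next} {$0=($2 in item?"+":"")$0; print}' ref main > out2

# cmp -s is silent and exits 0 only if the files are byte-identical.
if cmp -s out1 out2; then
    echo "identical"
else
    echo "different"
fi
```

Both forms leave the original spacing of each line untouched, since neither assigns to an individual field.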