Since I am learning awk, I found that the FNR==NR approach is a very common way to process two files. If FNR==NR, awk is reading the first file. When FNR resets to 1 while the lines of the concatenated files are being read, !(FNR==NR) becomes true, so the line obviously belongs to the second file.
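For reference, a minimal sketch of the two-file idiom described above. It recreates file3 and file1 from the samples further down in a temporary directory; the array name seen and the choice of which field to match on are assumptions made only for illustration.

```shell
# Recreate two of the sample files so the example is self-contained.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/file1"
printf 'A1|Y|Z\nA4|Y|Z\n' > "$dir/file3"

# First file (NR==FNR): remember every key found in field 1, then skip ahead.
# Second file: print only the lines whose field 2 is one of those keys.
out=$(awk -v FS='[|]' 'NR==FNR { seen[$1]; next } $2 in seen' "$dir/file3" "$dir/file1")
printf '%s\n' "$out"
rm -rf "$dir"
```

With these inputs the command prints the X|A1|Z and X|A4|Z lines. Note that the test NR==FNR is also true for the second file when the first file is empty, which is one reason to identify files by position instead.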
With three or more files I can't see a way to tell the second file from the third, as both satisfy the same !(FNR==NR) condition. This made me try to figure out whether there could be something like FNR2 and FNR3.
So I implemented a method to process three files in one awk call, as if there were an FNR1, FNR2 and FNR3, one for each file. For every file I wrote a for loop that runs separately. The condition has the same shape in every loop, NR==FNR#, and I actually get what I expected.
So I wonder if there are more sober, concise methods that deliver similar results to the awk code below.
Sample File Contents
$ cat file1
X|A1|Z
X|A2|Z
X|A3|Z
X|A4|Z
$ cat file2
X|Y|A3
X|Y|A4
X|Y|A5
$ cat file3
A1|Y|Z
A4|Y|Z
AWK for loop
$ cat fnrarray.sh
awk -v FS='[|]' '{ for(i=FNR ; i<=NR && i<=FNR && NR==FNR; i++) {x++; print "NR:",NR,"FNR1:",i,"FNR:",FNR,"\tfirst file\t"}
for(i=FNR ; i+x<=NR && i<=FNR && NR==FNR+x; i++) {y++; print "NR:",NR,"FNR2:",i+x,"FNR:",FNR,"\tsecond file\t"}
for(i=FNR ; i+x+y<=NR && i<=FNR && NR==FNR+x+y; i++) {print "NR:",NR,"FNR3:",i+x+y,"FNR:",FNR,"\tthird file\t"}
}' file1 file2 file3
Current and desired output
$ sh fnrarray.sh
NR: 1 FNR1: 1 FNR: 1 first file
NR: 2 FNR1: 2 FNR: 2 first file
NR: 3 FNR1: 3 FNR: 3 first file
NR: 4 FNR1: 4 FNR: 4 first file
NR: 5 FNR2: 5 FNR: 1 second file
NR: 6 FNR2: 6 FNR: 2 second file
NR: 7 FNR2: 7 FNR: 3 second file
NR: 8 FNR3: 8 FNR: 1 third file
NR: 9 FNR3: 9 FNR: 2 third file
You can see that NR lines up with FNR#, so it is easy to read which NR belongs to which file#.
I found this method, FNR==1{++f} f==1 {}, in "Handling 3 Files using awk".
But with this method arr1[1] is replaced every time a new line is read.
Failed attempt 1
$ awk -v FS='[|]' 'FNR==1{++f} f==1 {split($2,arr1); print arr1[1]}' file1 file2 file3
A1
A2
A3
A4
Success with for loop (arr1[1] is not changed)
$ awk -v FS='[|]' '{for(i=FNR ; i<=NR && i<=FNR && NR==FNR; i++) {arr1[++k]=$2; print arr1[1]}}' file1 file2 file3
A1
A1
A1
A1
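The difference between the two attempts seems to come from split() rather than from the f counter: split() clears the target array before filling it, so arr1[1] is rewritten on every line. A hedged sketch, assuming the same sample files, that keeps the FNR==1{++f} counter but appends to the array the way the for loop version does:

```shell
# Recreate the three sample files so the example is self-contained.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/file1"
printf 'X|Y|A3\nX|Y|A4\nX|Y|A5\n' > "$dir/file2"
printf 'A1|Y|Z\nA4|Y|Z\n' > "$dir/file3"

# f counts files by watching for FNR==1; while f is 1 we are in the first file.
# Appending with arr1[++k] leaves earlier elements untouched, unlike split().
out=$(awk -v FS='[|]' '
    FNR == 1 { ++f }
    f == 1   { arr1[++k] = $2; print arr1[1] }
' "$dir/file1" "$dir/file2" "$dir/file3")
printf '%s\n' "$out"
rm -rf "$dir"
```

This prints A1 four times, matching the output of the for loop version.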
To identify files by their position in the argument list using GNU awk, whether or not file names repeat or files are empty:
awk '
ARGIND == 1 { do 1st file stuff }
ARGIND == 2 { do 2nd file stuff }
ARGIND == 3 { do 3rd file stuff }
' file1 file2 file3
e.g. to get the text under "output" in your question from the 3 sample input files you provided:
awk '
ARGIND == 1 { pos = "first" }
ARGIND == 2 { pos = "second" }
ARGIND == 3 { pos = "third" }
{ print "NR:", NR, "FNR" ARGIND ":", NR, "FNR:", FNR, pos " file" }
' file1 file2 file3
NR: 1 FNR1: 1 FNR: 1 first file
NR: 2 FNR1: 2 FNR: 2 first file
NR: 3 FNR1: 3 FNR: 3 first file
NR: 4 FNR1: 4 FNR: 4 first file
NR: 5 FNR2: 5 FNR: 1 second file
NR: 6 FNR2: 6 FNR: 2 second file
NR: 7 FNR2: 7 FNR: 3 second file
NR: 8 FNR3: 8 FNR: 1 third file
NR: 9 FNR3: 9 FNR: 2 third file
Or, using any awk, if all file names are unique (whether or not any of them are empty):
awk '
FILENAME == ARGV[1] { do 1st file stuff }
FILENAME == ARGV[2] { do 2nd file stuff }
FILENAME == ARGV[3] { do 3rd file stuff }
' file1 file2 file3
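As a concrete illustration of the FILENAME == ARGV[n] pattern, here is a small sketch run against the sample files from the question. It assumes, purely for the example, that the A-values are the keys of interest and that they sit in field 2 of file1, field 3 of file2 and field 1 of file3; the names key and count are hypothetical.

```shell
# Recreate the three sample files so the example is self-contained.
dir=$(mktemp -d)
printf 'X|A1|Z\nX|A2|Z\nX|A3|Z\nX|A4|Z\n' > "$dir/file1"
printf 'X|Y|A3\nX|Y|A4\nX|Y|A5\n' > "$dir/file2"
printf 'A1|Y|Z\nA4|Y|Z\n' > "$dir/file3"

# Pick the key field according to which argument is being read,
# then report every key that occurs in more than one place.
out=$(awk -v FS='[|]' '
    FILENAME == ARGV[1] { key = $2 }   # assumed key position in file1
    FILENAME == ARGV[2] { key = $3 }   # assumed key position in file2
    FILENAME == ARGV[3] { key = $1 }   # assumed key position in file3
    { count[key]++ }
    END { for (k in count) if (count[k] > 1) print k, count[k] }
' "$dir/file1" "$dir/file2" "$dir/file3" | sort)
printf '%s\n' "$out"
rm -rf "$dir"
```

The output is piped through sort because the order of a for (k in count) loop is unspecified in awk.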
Or, if none of the files are empty, whether or not the names are unique (note that file1 appears twice in the arg list):
awk '
FNR == 1 { argind++ }
argind == 1 { do 1st file stuff }
argind == 2 { do 2nd file stuff }
argind == 3 { do 3rd file stuff }
' file1 file2 file1
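To see why the FNR == 1 counter needs non-empty files, here is a small sketch with an empty file in the middle of the argument list. The file named empty and the one-line file contents are made up for the demonstration.

```shell
# Three arguments, the second of which is an empty file.
dir=$(mktemp -d)
printf 'X|A1|Z\n' > "$dir/file1"
: > "$dir/empty"
printf 'A1|Y|Z\n' > "$dir/file3"

# An empty file never produces a record, so FNR == 1 never fires for it
# and the counter falls one behind the real argument position.
out=$(awk 'FNR == 1 { argind++ } { print argind ": " $0 }' \
    "$dir/file1" "$dir/empty" "$dir/file3")
printf '%s\n' "$out"
rm -rf "$dir"
```

The line from file3 is labelled 2 even though file3 is the third argument.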
If a file name can appear multiple times in the arg list and some of the files could be empty, then it becomes trickier with a non-GNU awk, which is why GNU awk has ARGIND. For example, something like this (untested):
awk '
BEGIN {
    for (i=1; i<ARGC; i++) {
        fname = ARGV[i]
        if ( (getline line < fname) > 0 ) {
            # file is not empty so save its position in the args
            # list in an array indexed by its name and the number
            # of times that name has been seen so far
            arginds[fname,++tmpcnt[fname]] = i
        }
        close(fname)
    }
}
FNR == 1 { argind = arginds[FILENAME,++cnt[FILENAME]] }
argind == 1 { do 1st file stuff }
argind == 2 { do 2nd file stuff }
argind == 3 { do 3rd file stuff }
' file1 file2 file1
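One way to exercise the approach above is to run it with an empty file and a repeated file name and print the computed position next to each line. This is only a sketch under those assumptions: the file named empty, the one-line contents and the print rule are invented for the check, and arguments that are not plain readable files (such as var=value assignments) are not handled.

```shell
# Four arguments: file1, an empty file, file3, and file1 again.
dir=$(mktemp -d)
printf 'X|A1|Z\n' > "$dir/file1"
: > "$dir/empty"
printf 'A1|Y|Z\n' > "$dir/file3"

out=$(awk '
BEGIN {
    # Record the argument position of every non-empty file, keyed by
    # its name and by how many times that name has been seen so far.
    for (i = 1; i < ARGC; i++) {
        fname = ARGV[i]
        if ( (getline line < fname) > 0 ) {
            arginds[fname, ++tmpcnt[fname]] = i
        }
        close(fname)
    }
}
FNR == 1 { argind = arginds[FILENAME, ++cnt[FILENAME]] }
{ print argind ": " $0 }
' "$dir/file1" "$dir/empty" "$dir/file3" "$dir/file1")
printf '%s\n' "$out"
rm -rf "$dir"
```

Here the positions come out as 1, 3 and 4: the empty second argument is skipped without shifting the later files, and the second occurrence of file1 gets its own position.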