Fail to cycle multiple input files with awk/gawk


I have a ton of files in subfolders, each containing three columns of numbers. For each file, I need to locate the largest number in $2 and then print $1 and $2 from that row.

This is what I got:

awk 'FNR > 1 {max=dist=0; if($2>max){dist=$1; max=$2}}END{print FILENAME "   distance: " dist "   max: " max}' ./nVT_*K/rdf_rdf_aam_aam_COM.dat

This works, but it only prints values for the last input file. I need one line of output per file.

Iterating using a bash for loop produced a "command not found" for the awk part. I am currently piping the echoed for loop output to a file and running as a script, though this is not a feasible plan in the long run.

Can anyone help rework this so that it takes a bunch of input files in different subfolders and prints the intended result for each file, like this:

./nVT_277K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.949975
./nVT_283K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.943047
./nVT_289K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.907280
...
...
...

I'd be extremely grateful for any input here. Thanks.


Solution

  • With GNU awk for ENDFILE:

    awk '
        FNR > 1 { if ((max=="") || ($2>max)) {dist=$1; max=$2} }
        ENDFILE { print FILENAME "   distance: " dist "   max: " max; max=dist="" }
    ' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
    

    With any awk, assuming your input files are not empty:

    awk '
        FNR==1 { if (NR>1) print fname "   distance: " dist "   max: " max; max=dist=""; fname=FILENAME; next }
        (max=="") || ($2>max) {dist=$1; max=$2}
        END { print fname "   distance: " dist "   max: " max }
    ' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
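
If a shell loop is preferred, starting one awk process per file should also work, because END then fires once per file rather than once overall. This is a minimal sketch, assuming a POSIX shell and that the first line of each file is a header (as the FNR > 1 test in the question implies). The function name max_per_file, the file name rdf.dat, and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each file given, print the row whose
# column 2 is largest, in the format requested in the question.
max_per_file() {
    for f in "$@"; do
        # One awk process per file, so END runs once per file.
        awk '
            # Skip line 1 (assumed header); remember the row with the largest $2.
            FNR > 1 && (max == "" || $2 > max) { dist = $1; max = $2 }
            END { print FILENAME "   distance: " dist "   max: " max }
        ' "$f"
    done
}

# Made-up sample data in a temporary directory, for demonstration only.
tmp=$(mktemp -d)
mkdir "$tmp/nVT_277K" "$tmp/nVT_283K"
printf 'r g\n4.60 1.50\n4.65 1.95\n4.70 1.80\n' > "$tmp/nVT_277K/rdf.dat"
printf 'r g\n4.60 1.40\n4.65 1.94\n4.70 1.70\n' > "$tmp/nVT_283K/rdf.dat"

# Prints one line per file.
max_per_file "$tmp"/nVT_*K/rdf.dat
```

On the real data the call would be max_per_file ./nVT_*K/rdf_rdf_aam_aam_COM.dat. Starting one awk per file is slower than the single-process versions above when there are many files, so the loop is mainly useful where neither of those fits.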