How to rewrite an Awk script to process several files instead of one


I am writing a report tool which processes the source files of an application and produces a report table with two columns: one contains the name of the file, and the other contains the word TODO if the file contains a call to some deprecated function deprecated_function, and DONE otherwise.

I used awk to prepare this report and my shell script looks like

report()
{
  find . -type f -name '*.c' \
    | xargs -n 1 awk -v deprecated="$1" '
BEGIN { status = "DONE" }
$0 ~ deprecated{ status = "TODO" }
END {
  printf("%s|%s\n", FILENAME, status)
}'
}
report "deprecated_function"

The output of this script looks like

./plop-plop.c|DONE
./fizz-boum.c|TODO

This works well, but I would like to rewrite the awk script so that it supports several input files instead of just one, which would let me remove the -n 1 argument to xargs. The only solutions I could figure out involve a lot of bookkeeping, because we have to watch for changes of FILENAME to detect the end of each file and still rely on the END block to report the last one.

awk -v deprecated="$1" '
BEGIN { status = "DONE" }
# On the first record of each new file: report the previous file, then reset.
oldfilename != FILENAME {
  if (oldfilename != "") printf("%s|%s\n", oldfilename, status)
  status = "DONE"
  oldfilename = FILENAME
}
$0 ~ deprecated { status = "TODO" }
END {
  printf("%s|%s\n", FILENAME, status)
}'

Maybe there is a cleaner and shorter way to handle this.

I am using FreeBSD's awk and am looking for solutions compatible with this tool.


Solution

  • This will work in any modern awk:

    awk -v deprecated="$1" -v OFS='|' '
        $0 ~ deprecated{ dep[FILENAME] }
        END {
            for (i=1;i<ARGC;i++)
                print ARGV[i], (ARGV[i] in dep ? "TODO" : "DONE")
        }
    ' file1 file2 ...
    

    Any time you need to produce a report for all files and don't have GNU awk for ENDFILE, you MUST loop through ARGV[] in the END section (or loop through it in BEGIN and populate a different array for END section processing). Anything else will fail if you have empty files: an empty file yields no records, so no pattern-action rule ever runs with FILENAME set to it.
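
    The alternative mentioned in parentheses (fill an array from ARGV[] in BEGIN, then print it in END) could be plugged into the report function from the question roughly as below. This is only a sketch under a few assumptions: the array name status is made up for illustration, every awk argument is taken to be a file name (which holds when the arguments come from find), and file names are assumed to contain no whitespace, since plain xargs splits on it.

```shell
report()
{
  find . -type f -name '*.c' \
    | xargs awk -v deprecated="$1" -v OFS='|' '
        # Assume every file is DONE until a matching line proves otherwise.
        # Looping over ARGV here means empty files get an entry too.
        BEGIN {
          for (i = 1; i < ARGC; i++)
            status[ARGV[i]] = "DONE"
        }
        # Any line matching the deprecated name flips the current file to TODO.
        $0 ~ deprecated { status[FILENAME] = "TODO" }
        # Print one line per argument, in the order the files were given.
        END {
          for (i = 1; i < ARGC; i++)
            print ARGV[i], status[ARGV[i]]
        }'
}
```

    It is called the same way as the original, report "deprecated_function", only without -n 1. If the file list is long enough for xargs to split it across several awk invocations, each invocation reports its own batch of files, so the combined output should still hold one line per file.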