awk grep wc

Using a pipe and word count (wc), then dressing up the result


I'd like to dress up the output of a grep command I'm running.

Imagine a file text.txt with a lot of text. Then I run the commands:

grep fred text.txt | wc -l
grep bob text.txt | wc -l
grep james text.txt | wc -l

I get the output:

12
3
4

What I would like to print as the output is:

fred was found on 12 lines.
bob was found on 3 lines.
james was found on 4 lines.

How can I do that?


Solution

  • In a shell script, using grep -c to count the lines:

    for name in fred bob james
    do
        echo "$name was found on $(grep -c $name text.txt) lines."
    done
    

    This runs half as many processes as the original pipelines, because grep -c does the counting itself and wc is not needed. It assumes you never want to search for a name containing spaces ('lucy anne') or quotes ("o'reilly"). To handle names like those, you need to quote them in the for list and also put double quotes around $name in the command substitution.
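As a minimal sketch of that more careful quoting, the loop could be wrapped in a function. The name count_lines is made up for illustration, and the use of -F (match the name as a fixed string rather than a regular expression) is an extra assumption that the original loop does not make:

```shell
#!/bin/sh
# count_lines FILE NAME...
# Hypothetical helper: for each NAME, report how many lines of FILE contain it.
count_lines() {
    file=$1
    shift
    for name in "$@"
    do
        # -F: treat the name as a fixed string, not a regular expression.
        # -e: stop a name that starts with '-' being read as an option.
        # grep exits non-zero when nothing matches but still prints 0,
        # so its exit status is deliberately ignored here.
        n=$(grep -c -F -e "$name" "$file") || true
        printf '%s was found on %s lines.\n' "$name" "$n"
    done
}
```

It could then be called as count_lines text.txt fred bob "lucy anne" "o'reilly", with each awkward name quoted on the command line.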

    However, you could scan the file once using awk (or Perl or Python, or …) which could be a big saving if the file is huge:

    awk '
        /fred/  { count["fred"]++ }
        /bob/   { count["bob"]++ }
        /james/ { count["james"]++ }
       END      { for (name in count) print name, "was found on", count[name], "lines." }
       ' text.txt
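Two details of the awk script above are worth knowing: for (name in count) visits the names in an unspecified order, and a name that never matches has no entry in count, so no line is printed for it. A possible variant that keeps the names in a fixed order and reports zero counts is sketched below. The function name count_in_order and the variable list are made up for illustration:

```shell
#!/bin/sh
# count_in_order FILE
# Hypothetical helper: one pass over FILE, names reported in the order listed.
count_in_order() {
    awk -v list="fred bob james" '
        # Split the space-separated list into names[1..n].
        BEGIN { n = split(list, names, " ") }
        # Each name is used as a regular expression, as in the script above.
              { for (i = 1; i <= n; i++) if ($0 ~ names[i]) count[names[i]]++ }
        # Adding 0 turns a missing entry into the number 0.
        END   { for (i = 1; i <= n; i++)
                    print names[i], "was found on", count[names[i]] + 0, "lines." }
    ' "$1"
}
```

Calling count_in_order text.txt would print one line per name, in the order fred, bob, james.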
    

    That's similar to the answer by RavinderSingh13, but it counts only the lines where the names occur, not the total number of occurrences (so if a line contains "bob was bobbing on the water all discombobulated", it will count 1 line, not 3 occurrences). Note that the search is case-sensitive ("Bob" won't be counted) and is not constrained to match 'words' for any reasonable definition of word. These comments apply to the grep solution too, but with grep you can use options such as -i for case-insensitivity (from POSIX) and -w for matching whole words (GNU grep and some others, such as BSD grep and hence macOS).
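To see what those options change, here is a small self-contained sketch. The three-line sample file is invented for illustration, mktemp is assumed to be available, and -w needs a grep that supports it (GNU or BSD):

```shell
#!/bin/sh
# Build a tiny sample file (contents made up for this demonstration).
sample=$(mktemp)
printf '%s\n' \
    'Bob was bobbing on the water all discombobulated' \
    'bob' \
    'bobby' > "$sample"

plain=$(grep -c bob "$sample")        # any line containing "bob": 3
word=$(grep -c -w bob "$sample")      # "bob" as a whole word, case-sensitive: 1
both=$(grep -c -i -w bob "$sample")   # whole word, any case: 2

echo "plain: $plain, -w: $word, -i -w: $both"
rm -f "$sample"
```

The first line counts for the plain search only because of "bobbing" and "discombobulated". It counts for -i -w only because of the capitalised "Bob".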