How to Skip 1st line of file - awk


I am a beginner with awk. I have created a file that contains employee information. The employees belong to different departments, and I want to count how many employees are in each department, like this:

marketing        3
sales            3
production       4

For that I used the following command:

awk 'NR>1 {dept=$5} {count[dept]++} END {for (dept in count) {print dept count[dept]}}' emp

But the above code also counts and displays the first line, i.e. the header, like this:

marketing 3
sales 3
department 1
production 4

Here department is the column header, which is also counted even though I used NR>1. Also, how can I add spacing or increase the width of the columns? The output currently looks like the above, but I want to display it in properly aligned columns. Is there any solution for this?

Here is my input file

empid       empname     department
101         ayush    sales
102         nidhi    marketing
103         priyanka    production  
104         shyam    sales
105         ami    marketing
106         priti    marketing
107         atuul    sales
108         richa    production
109         laxman    production
110         ram     production

Solution

  • Do the counting inside the NR>1 action so the header is skipped, and use awk's printf for fixed-width, space-padded columns:

    awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
    

    printf accepts a width in the format specifier, for example printf "%3s":

    • 3: the output is padded to at least 3 characters (longer values are not truncated).

    From man awk, you can see more details:

    width   The field should be padded to this width. The field is normally padded
            with spaces. If the 0  flag  has  been  used, it is padded with zeroes.
    
    .prec   A number that specifies the precision to use when printing.  For the %e,
            %E, %f and %F, formats, this specifies the number of digits you want
            printed to the right of the decimal point. For the %g, and %G formats,
            it specifies the maximum number of significant  digits. For the %d, %o,
            %i, %u, %x, and %X formats, it specifies the minimum number of digits to
            print. For %s, it specifies the maximum number of characters from the
            string that should be printed.
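
To make the width and precision rules above concrete, here is a small sketch that prints a few sample values wrapped in brackets so the padding is visible. The values ("sales", "production", 42) and the variable name out are only illustrative and are not taken from the man page.

```shell
# Each value is wrapped in [] so the padding is easy to see.
out=$(awk 'BEGIN {
    printf "[%10s]\n",  "sales"       # width 10, right-justified by default
    printf "[%-10s]\n", "sales"       # the - flag left-justifies within the width
    printf "[%05d]\n",  42            # the 0 flag pads with zeroes
    printf "[%.3s]\n",  "production"  # precision 3 prints at most 3 characters
    printf "[%3s]\n",   "production"  # width is only a minimum, nothing is cut
}')
printf '%s\n' "$out"
```

The first two lines show the same width with and without the - flag, which is the difference between right-aligned and left-aligned columns.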
    

    You can adjust the padding width as needed. For the input file you specified:

    $ awk 'NR>1 {count[$3]++} END {for (dept in count) {printf "%-15s%-15s\n", dept, count[dept]}}' file
    production     4
    marketing      3
    sales          3