unix, awk

How to get the max length of each column in Unix?


Suppose I have a source file like this:

ID|NAME|ADDRESS
1|ABC|PUNE
2|XYZA|MUMBAI
12|VB|NAGPUR

I want to get the maximum length of each column (excluding the header names). The output should look like this: 2|4|6

I have tried this command: tail -n +2 filename | cut -d"|" -f1 | awk '{ print length }' | sort -r | uniq

This works for the first column. Is there a way in awk to get the maximum length of every column?


Solution

  • This is a general approach that works no matter how many fields the file has: keep the longest length seen so far for each column in an array, update an entry whenever a longer value appears, and finally loop through the array and print the results.

    awk -F'|' 'NR>1{for (i=1; i<=NF; i++) max[i]=(length($i)>max[i]?length($i):max[i])}
               END {for (i=1; i<=NF; i++) printf "%d%s", max[i], (i==NF?RS:FS)}' file
    

    See output:

    $ awk -F'|' 'NR>1{for (i=1; i<=NF; i++) max[i]=(length($i)>max[i]?length($i):max[i])} END {for (i=1; i<=NF; i++) printf "%d%s", max[i], (i==NF?RS:FS)}' a
    2|4|6
    

    If the number of columns varies from line to line, track the largest field count in a variable, for example cols, and loop up to it in the END block instead of NF:

    $ awk -F'|' 'NR>1{cols=(cols<=NF?NF:cols); for (i=1; i<=NF; i++) max[i]=(length($i)>max[i]?length($i):max[i])} END {for (i=1; i<=cols; i++) printf "%d%s", max[i], (i==cols?RS:FS)}' a
    2|4|6
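
The sample file has the same number of fields on every line, so it does not show why tracking the column count matters. Below is a minimal sketch of the same idea on a ragged file. The file name ragged.txt, its contents, and the helper name max_col_len are made up for illustration and are not part of the original question.

```shell
# Hypothetical input: rows with 4, 3 and 2 fields after the header.
printf '%s\n' 'ID|NAME|ADDRESS' '1|ABC|PUNE|411001' '2|XYZA|MUMBAI' '12|VB' > ragged.txt

# Hypothetical helper: print the maximum field length per column, skipping the header.
max_col_len() {
    awk -F'|' '
        NR > 1 {
            if (NF > cols) cols = NF              # widest row seen so far
            for (i = 1; i <= NF; i++) {
                len = length($i)
                if (len > max[i]) max[i] = len    # keep the longest value per column
            }
        }
        END {
            # Loop up to cols, not NF: in END, NF reflects only the last line read.
            for (i = 1; i <= cols; i++)
                printf "%d%s", max[i], (i == cols ? "\n" : FS)
        }
    ' "$1"
}

max_col_len ragged.txt
```

On this input the helper should print 2|4|6|6. A version whose END block loops up to NF would likely stop after two columns here, since the last line has only two fields.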