
Bash script to find the frequency of every letter in a file


I want to count how often each letter of the English alphabet appears in an input file. How can I do this in a Bash script?


Solution

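  • A single awk command is enough. Setting FS to the empty string makes every character its own field (this works in gawk and most other modern awks, but POSIX does not specify it):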

    awk -vFS="" '{for(i=1;i<=NF;i++)w[$i]++}END{for(i in w) print i,w[i]}' file
    

    To count case-insensitively, wrap the array key in tolower():

    awk -vFS="" '{for(i=1;i<=NF;i++)w[tolower($i)]++}END{for(i in w) print i,w[i]}' file
    

    To count only letters, test each character against a regex before counting it:

    awk -vFS="" '{for(i=1;i<=NF;i++){ if($i~/[a-zA-Z]/) { w[tolower($i)]++} } }END{for(i in w) print i,w[i]}' file
    

    To count only digits instead, change /[a-zA-Z]/ to /[0-9]/.

    If non-ASCII (Unicode) letters should not be matched by the letter filter, run export LC_ALL=C first so that [a-zA-Z] covers only the ASCII letters.
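
The commands above print only the letters that actually occur, and `for (i in w)` visits keys in an unspecified order. Since the question asks about every letter of the alphabet, here is a minimal sketch that prints all 26 letters in alphabetical order, including those with a count of zero. The function name `letter_freq` is hypothetical, and the sketch walks each line with substr() instead of relying on an empty FS, on the assumption that portability across awk implementations matters.

```shell
#!/usr/bin/env bash
# letter_freq FILE (hypothetical helper name)
# Prints "letter count" for each ASCII letter a-z in FILE,
# case-insensitively, including letters that never occur.
letter_freq() {
    # LC_ALL=C restricts [a-z] to the 26 ASCII letters.
    LC_ALL=C awk '
        {
            line = tolower($0)
            n = length(line)
            # Walk the line one character at a time.
            for (i = 1; i <= n; i++) {
                c = substr(line, i, 1)
                if (c ~ /^[a-z]$/) count[c]++
            }
        }
        END {
            alphabet = "abcdefghijklmnopqrstuvwxyz"
            # Iterate over a fixed alphabet so the output is ordered
            # and unseen letters are reported as 0.
            for (i = 1; i <= 26; i++) {
                c = substr(alphabet, i, 1)
                printf "%s %d\n", c, count[c]
            }
        }
    ' "$1"
}
```

Call it as `letter_freq file`; piping the result through `sort -k2,2nr` lists the most frequent letters first.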