regex, string, bash, grep, letter

How can I get a list of the words that have six or more consonants in a row using the grep command?


I want to find a list of words that contain six or more consonants in a row from a number of text files.

I'm pretty new to the Unix terminal, but this is what I have tried:

cat *.txt | grep -Eo "\w+" | grep -i "[^AEOUIaeoui]{6}"

I use cat here because grep would otherwise prefix each match with the file name. The second command in the pipeline gives me a list of all the words in the text files.

The problem is the last pipe. I want it to match six consonants in a row; they don't need to be the same consonant. I know one way of solving the problem, but it would produce a command longer than this entire post.


Solution

  • You can use

    grep -hEio '[[:alpha:]]*[b-df-hj-np-tv-z]{6}[[:alpha:]]*' *.txt
    

    Regex details

    • [[:alpha:]]* - zero or more letters
    • [b-df-hj-np-tv-z]{6} - six consecutive English consonant letters (a longer run still matches, because the surrounding [[:alpha:]]* absorbs the extra letters)
    • [[:alpha:]]* - zero or more letters.

    The grep options make the search case-insensitive (-i), print only the matched text (-o), and suppress the filenames (-h). The -E option enables POSIX ERE syntax; without it, you would need to escape {6} as \{6\}.
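
To see the options and the pattern working together, here is a minimal sketch you can run. The temporary directory, the file names a.txt and b.txt, and the sample sentences are made up for illustration and are not from the question. LC_ALL=C is also an assumption added here, so that the bracket ranges are read as plain ASCII ranges regardless of your locale. Note that the character class counts y as a consonant.

```shell
# Hypothetical sample data in a throwaway directory (not from the original post).
tmpdir=$(mktemp -d)
printf 'The catchphrase of the latchstring maker\n' > "$tmpdir/a.txt"
printf 'Rhythms and Knightsbridge are nearby\n' > "$tmpdir/b.txt"

# ERE form from the answer:
# -h no filenames, -E extended regex, -i ignore case, -o print only the matches.
result=$(LC_ALL=C grep -hEio '[[:alpha:]]*[b-df-hj-np-tv-z]{6}[[:alpha:]]*' "$tmpdir"/*.txt)

# Same search without -E: in basic regex syntax the braces must be escaped.
bre=$(LC_ALL=C grep -hio '[[:alpha:]]*[b-df-hj-np-tv-z]\{6\}[[:alpha:]]*' "$tmpdir"/*.txt)

printf '%s\n' "$result"
# catchphrase
# latchstring
# Rhythms
# Knightsbridge

# Clean up the sample files.
rm -r "$tmpdir"
```

Both commands print the same four words, one per line, with their original capitalization, because -o prints the text exactly as it appears in the file.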