
Listing word-final consonant clusters in German


I wrote a program that finds and counts initial consonant clusters in German and Spanish texts. Now I want a regex that finds clusters in word-final position. Using \b or $ does not work. Can someone help me work out how to change my regex so that it matches final consonant clusters?

I currently have something like this for initial clusters:

import re

for w in words:
    # capture the consonants that precede the first vowel of the word
    initial = re.search(r'^([^aeiouy]*)[aeiouy]', w)

Or something like this:

    # match only clusters of two or more consonants at the start of the word
    initial = re.search(r'^[^aeiouy]{2,}', w)

Solution

  • You seem to want to extract chunks of 2 or more consonant letters at the end of the string.

    You may use

    (?:(?![aeiou])[a-z]){2,}$
    


    Details

    • (?: - start of a non-capturing group:
      • (?![aeiou]) - a negative lookahead that fails the match if the next char is a vowel
      • [a-z] - an ASCII lowercase letter (pass the re.I flag to match case-insensitively)
    • ){2,} - end of the group, 2 or more occurrences
    • $ - end of string.
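
    As a rough illustration of how the pattern above could be plugged into the counting loop from the question, here is a minimal sketch. The helper name count_final_clusters and the sample word list are hypothetical, not part of the original program. Note also that the pattern treats y as a consonant, whereas the question's character class treats it as a vowel; adding y to the lookahead set would align the two. Letters outside a-z (such as ä, ö, ü, ß) are not matched by [a-z].

    ```python
    import re
    from collections import Counter

    # Pattern from the answer: two or more non-vowel ASCII letters at the end
    # of the string. re.I makes [a-z] and [aeiou] case-insensitive.
    FINAL_CLUSTER = re.compile(r'(?:(?![aeiou])[a-z]){2,}$', re.I)


    def count_final_clusters(words):
        """Count word-final consonant clusters (hypothetical helper)."""
        counts = Counter()
        for w in words:
            m = FINAL_CLUSTER.search(w)
            if m:
                # normalise case so 'ND' and 'nd' are counted together
                counts[m.group().lower()] += 1
        return counts


    # Sample words chosen for illustration only.
    sample = ["Herbst", "Kunst", "Hand", "Auto", "Markt"]
    clusters = count_final_clusters(sample)
    print(clusters)
    ```

    Because the pattern is anchored with $ and used with search, the match starts at the first position from which only consonant letters remain, so the whole final cluster is returned. Words ending in a vowel or in a single consonant produce no match and are skipped.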