So I did an exercise using JFlex that counts the words in an input text file that contain more than 3 vowels. What I ended up doing was defining a token for a word and then writing a Java function that receives the matched text as input and checks each character. If it's a vowel, I increment a vowel counter; once that counter is greater than 3, I increment the count of matching words.
What I want to know is whether there is a regexp that could match a word with more than 3 vowels. I think it would be a cleaner solution. Thanks in advance.
tokens
Letra = [a-zA-Z]
Palabra = {Letra}+
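For reference, the manual approach described above could look roughly like the sketch below. The class and method names (`VowelCounter`, `hasMoreVowelsThan`) are hypothetical and not taken from the original exercise; the method is assumed to receive the text matched by the Palabra token.

```java
final class VowelCounter {
    // Hypothetical helper: returns true if `word` contains more than
    // `threshold` vowels (a, e, i, o, u, case-insensitive).
    static boolean hasMoreVowelsThan(String word, int threshold) {
        int vowels = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u') {
                vowels++;
                // Stop early once the threshold is exceeded.
                if (vowels > threshold) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

The lexer action for the word token would then increment the word counter whenever `hasMoreVowelsThan(text, 3)` returns true.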
Very simple. Use this if you want to check that a word contains at least 3 vowels (for strictly more than 3, as the question is worded, change {3} to {4}).
(?i)(?:[a-z]*[aeiou]){3}[a-z]*
You only care that the word contains at least 3 vowels, so the rest can be any alphabetic characters. The regex above works both with String.matches and in a Matcher loop, since a valid word (at least 3 vowels) cannot be a substring of an invalid word (fewer than 3 vowels).
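As a minimal sketch of both usages with the "at least 3 vowels" regex: the class and method names below (`AtLeastThreeVowels`, `isMatch`, `findAll`) are made up for illustration, and the sample inputs are assumptions, not data from the exercise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AtLeastThreeVowels {
    // Same regex as above: three groups that each end in a vowel,
    // followed by any remaining letters.
    static final Pattern AT_LEAST_3 =
            Pattern.compile("(?i)(?:[a-z]*[aeiou]){3}[a-z]*");

    // Whole-string check, equivalent to word.matches(regex).
    static boolean isMatch(String word) {
        return AT_LEAST_3.matcher(word).matches();
    }

    // Matcher loop: collects every word in `text` with at least 3 vowels.
    static List<String> findAll(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = AT_LEAST_3.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }
}
```

For example, `findAll("the banana tree is beautiful")` should return the two words `banana` and `beautiful`, each matched in full because the trailing `[a-z]*` is greedy.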
Off-topic for the question, but for consonants you can use character class intersection, a Java regex feature that many other flavors lack: [a-z&&[^aeiou]]. So if you want to check for exactly 3 vowels (with String.matches):
(?i)(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*
If you are using this in a Matcher loop:
(?i)(?<![a-z])(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*(?![a-z])
Note that I have to use look-arounds to make sure that the matched string (exactly 3 vowels) is not part of a longer, invalid word (which is possible when that word has more than 3 vowels).
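A short sketch of the "exactly 3 vowels" variants, assuming plain ASCII words; the class and method names (`ExactlyThreeVowels`, `isExactlyThree`, `findAll`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExactlyThreeVowels {
    // For whole-string checks (String.matches semantics).
    static final Pattern WHOLE = Pattern.compile(
            "(?i)(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*");

    // For scanning longer text: the look-arounds forbid a letter
    // immediately before or after the match.
    static final Pattern IN_TEXT = Pattern.compile(
            "(?i)(?<![a-z])(?:[a-z&&[^aeiou]]*[aeiou]){3}[a-z&&[^aeiou]]*(?![a-z])");

    static boolean isExactlyThree(String word) {
        return WHOLE.matcher(word).matches();
    }

    static List<String> findAll(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = IN_TEXT.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }
}
```

Without the look-arounds, a Matcher loop could report a 3-vowel fragment of a longer word such as `beautiful`; with them, only words whose total vowel count is exactly 3 are returned.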