I would like to detect whether a string contains several different words, and to limit how many distinct words it may contain. A word may consist of any characters except spaces.
E.g.: I want to check if the following strings have no more than three distinct words:
lorum -> True
lorum ipsum -> True
lorum ipsum dolor -> True
lorem lorem ipsum dolor ipsum ipsum -> True
lorem lorem <=> -> True
1 2 3 -> True
lorem ipsum dolor sit lorum -> False
lorem ipsum dolor sit -> False
1 2 3 4 -> False
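To my great surprise, this is actually achievable with a regular expression. It is really ugly and inefficient, but it works: each of the three capture groups records one new distinct word, and every other word must be a backreference to one of them.
You should probably not use it, though: a regular expression is not the right tool for this job.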
/^(\S*)(?: \1)*(?:(?: (\S*))(?: \1| \2)*(?: (\S*))?)?(?: \1| \2| \3)*$/gm
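
For comparison, here is a minimal Python sketch that runs the pattern above against the examples from the question and sets it beside the more conventional approach: split the string and count the distinct words with a set. The function names (at_most_three_regex, at_most_n_distinct) and the limit parameter are made up for illustration. The sketch also assumes that words are separated by single spaces, as in the examples; with repeated or leading spaces the two functions may disagree, because the pattern can then capture an empty word.

```python
import re

# The pattern from above, unchanged. The /gm flags are dropped because each
# string is tested on its own rather than as a line within a larger text.
PATTERN = re.compile(
    r"^(\S*)(?: \1)*(?:(?: (\S*))(?: \1| \2)*(?: (\S*))?)?(?: \1| \2| \3)*$"
)


def at_most_three_regex(text):
    """Return True if the regex accepts the string (at most 3 distinct words)."""
    return PATTERN.match(text) is not None


def at_most_n_distinct(text, limit=3):
    """Return True if the string has no more than `limit` distinct words."""
    # str.split() with no argument splits on any run of whitespace.
    return len(set(text.split())) <= limit


# The examples from the question, with their expected results.
EXAMPLES = {
    "lorum": True,
    "lorum ipsum": True,
    "lorum ipsum dolor": True,
    "lorem lorem ipsum dolor ipsum ipsum": True,
    "lorem lorem <=>": True,
    "1 2 3": True,
    "lorem ipsum dolor sit lorum": False,
    "lorem ipsum dolor sit": False,
    "1 2 3 4": False,
}

if __name__ == "__main__":
    for text, expected in EXAMPLES.items():
        print(text, "->", at_most_three_regex(text), at_most_n_distinct(text))
```

Unlike the regex, the set-based version extends to any limit by changing one argument, and it does not rely on backtracking.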