Tags: python, regex, rethinkdb, re2

Regex for just whole word (RE2)


I am a little lost with regular expressions in RethinkDB. I have tried many variants, but none of them gives the right result. I need to match a word only when it stands on its own, not when it is part of another word form. Usually I just use re.search with "\bword\b", which works perfectly in TinyDB. For example, when searching for the word "les":

les < I need match
lesík < NO
za-les-nit < I need match
les'ns < I need match
odlesnit < NO
useless < NO
doles < NO
kolem je les, který ... < I need match

As I wrote, I have a working solution for TinyDB and its regex search function, but RethinkDB needs something different. Maybe it is because of RE2, I don't know. Can someone please help? PS: If you know of an online RE2 tester, please send me a link too. Thanks a lot.


Solution

  • You may try:

    (?:^|[[:punct:]]| )les(?:[[:punct:]]| |$)
    

    Explanation of the above regex:

    • (?:) - Represents a non-capturing group.
    • ^, $ - Represent the start and end of the line respectively.
    • | - Represents alternation.
    • (?:^|[[:punct:]]| ) - This ensures that les appears only at the start of the line, after a punctuation character, or after a space. If several whitespace characters may occur, you can use \s+ instead of the single space.
    • (?:[[:punct:]]| |$) - This part of the regex ensures that les is followed only by a punctuation character, a space, or the end of the line.
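
To check the pattern against the sample strings from the question, here is a minimal sketch in Python. It rests on one assumption: Python's built-in re module is not RE2 and does not understand [[:punct:]], so the sketch substitutes an explicit character class built from string.punctuation (the ASCII punctuation characters). The helper names whole_word_pattern and matching_samples are hypothetical and exist only for this illustration.

```python
import re
import string

# Python's re module has no POSIX classes such as [[:punct:]], so this
# sketch uses an explicit class of ASCII punctuation instead (assumption:
# this is close enough to RE2's [[:punct:]] for the sample strings).
PUNCT = "[" + re.escape(string.punctuation) + "]"


def whole_word_pattern(word):
    """Hypothetical helper: build the delimiter-based pattern for one word."""
    return re.compile(
        "(?:^|" + PUNCT + "| )" + re.escape(word) + "(?:" + PUNCT + "| |$)"
    )


# Sample strings taken from the question.
SAMPLES = [
    "les",
    "lesík",
    "za-les-nit",
    "les'ns",
    "odlesnit",
    "useless",
    "doles",
    "kolem je les, který ...",
]


def matching_samples(word, samples=SAMPLES):
    """Return the samples in which `word` occurs as a delimited word."""
    pattern = whole_word_pattern(word)
    return [s for s in samples if pattern.search(s)]


if __name__ == "__main__":
    matched = set(matching_samples("les"))
    for sample in SAMPLES:
        print(sample, "<", "match" if sample in matched else "NO")
```

In RethinkDB itself the original pattern string (with [[:punct:]]) would be passed to the driver's match command, for example r.table("docs").filter(lambda doc: doc["text"].match(pattern)), where the table name "docs" and the field name "text" are made up for the example.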

    You can find the demo of the above regex below.

    RE2 demo