Search code examples
regexregular-language

How to write a generical REGEX expression to: Σ = {a, b, c}, L = {w ∈ Σ ∗ / the first symbol of w is equal to the last symbol of w}?


I tried ^(a|b|c)?(a|b|c)*\1$ and some variations but for some reason words like a or b aren't captured. Could you guys explain me why?

I also don't get why in the expression ^(a|b|c)?(a|b|c)\1$ the word cac is captured. I mean, I thought in this expression the word should just have 2 letters.


Solution

  • No string less than length 2 can match ^(a|b|c)?(a|b|c)*\1$

    • (a|b|c)* matches 0 or more characters
    • \1 is required and matches exactly 1 character
    • if \1 matches, there must be an additional (a|b|c) at the start of the line

    ^(a|b|c)?(a|b|c)\1$ can only be matched by strings of exactly length 3:

    • The second (a|b|c) matches exactly 1 character
    • \1 is required and matches exactly 1 character
    • if \1 matches, there must be an additional (a|b|c) at the start of the line