I need to build a regular expression for strings in which the +/- characters cannot stand side by side (must be separated by some other). I got this option: (a*(+|-)a)*
, where a
is any character, *
is the Kleene closure, ()
is for clarity. But this expression does not recognize lines of the form: "+", "-","+ a-" etc. Maybe someone will be able to move me from the dead point. I need regularity to build a finite automaton.
That might do:
^(\+(?![\+-])|-(?![\+-])|[^\+-])*?$
It bounds everything to the single line: ^
to $
. The lazy quantifier ()*?
ensures that no more then one line is going to be recognized.
The 3 concatenations inside the parentheses are as follows:
\+(?![\+-])
if the character is +
the next one must not be +
or -
-(?![\+-])
if the character is -
the next one must not be +
or -
[\+-](?![\+-])
.[^\+-]
any character that is not +
and -
However, you must know that a regex
is more powerful than a regular expression
. You need a regular grammar more than a regular expression:
S = +T
S = -T
S = @S
S = ε
T = @S
T = ε
This grammar is right-regular, and @
is any character that is not +
nor -
. ε
is epsilon = nothing.
Here is the deterministic finite automaton (where P=+
M=-
@=not + and not -
):
/---- @ ----\
| |
| v
(S = start/final)--- P,M -->(T = final)---- P,M --->(error)
^ |
| |
\----------- @ ------------/