I need to prove or disprove the below REGEX
(RS + R )* R = R (SR + R)*
// or, for programmers:
/(RS|R)*R/ == /R(SR|R)*/
I have a strong gut feel that they are equivalent, but how do i give a step by step proof using the laws of REGEX.
First understand what this formal languages mean:
(RS + R)*R = R(SR + R)*
From LHS, (RS + R)*
uses to generate any combinations of RS
and R
including ^
epsilon. Some example strings are {^, RS, RSRS, RRRS, RSR,...}
: strings always starts from R
but can end with either S
or R
- we can describe in English: R
can appear in any combination where S
is always followed by one R
(two consecutive S
are not possible).
And, complete LHS's re (RS + R)*R
means string always terminates with R
.
Now, consider following examples:
R + S
is same as S + R
, it is basically union RS
can't be written as SR
, order is important in concatenation (RS)R
can be written as R(SR)
(RS)*R
can be written as R(SR)*
, both are same that is RSRSRS...SR
(AB + AC)
can be written as A(B + C)
(AB + A)
can be written as A(B + ^)
, this is because A = ^A = A^
(BA + A)
can be written as (B + ^)A
.Formal Proof:
(RS + R)*R // LHS => (R(S + ^))*R // from rule 6 => R((S + ^)R)* // from rule 4 => R(SR + R)* // from rule 7, in revers `(B + ^)A` --> `(BA + A)` // RHS
Same steps are correct for regex.