regex, finite-automata, regular-language, equivalence

How should one proceed to prove (or find out) whether two regular expressions are the same or equivalent?


For example, in an assignment we were asked to determine whether the following two regular expressions are equal:

(a+b+c)*  and ((ab)**c*)*

My question is: how is one supposed to do that? If I draw the transition graphs (TGs) for both, run a few strings through them, and show that both TGs accept those strings, is that a sufficient proof? If not, how do I do it? Is there a mathematical/axiomatic approach to this?

Thanks in advance.

EDIT: There is another thing I'd like to clear up, which is related to this question. Are the two FAs depicted in the picture below the same?

[Image: two finite automata, labeled (1) and (2)]

That is, are (1) and (2) in the above picture the same?


Solution

  • There is an algorithm to determine whether they are equal:

    1. Construct NFA-lambdas corresponding to each RE using Kleene's theorem
    2. Construct DFAs for each using the subset/powerset construction
    3. (optional) Minimize the DFAs using a standard DFA minimization algorithm.
    4. Construct DFAs for L(M1) \ L(M2) and L(M2) \ L(M1) using the Cartesian Product Machine construction
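    5. (optional) Minimize these CPMs.
    6. Determine whether each CPM accepts any string by testing all strings over the alphabet Σ of length no greater than |Q|, where Q is that CPM's state set. This works due to the pumping lemma for regular languages: if a DFA accepts any string at all, it accepts one of length less than |Q|. The two REs are equivalent exactly when neither CPM accepts a string.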

    No novelty or genius is required; you could write a program to do this (although, in practice, the powerset construction can be unwieldy because it may produce exponentially many states, and failing to minimize at both steps can be costly).

    EDIT: Yes, those DFAs are the same. The first is just a shorthand notation for the second.
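To make steps 4 and 6 of the algorithm concrete, here is a minimal Python sketch. It is not a full implementation of the answer: it assumes steps 1 and 2 are already done, so the two DFAs are written out by hand. It also makes two substitutions. It tracks pairs of states once and flags any pair where exactly one machine accepts, rather than building two separate difference machines, and it uses a breadth-first search over reachable pairs instead of enumerating all strings up to length |Q|. The function name `find_difference` and the `(start, accepting, delta)` encoding are made up for illustration. The example DFAs are my own reading of the two REs in the question, taking `+` to mean union.

```python
from collections import deque


def find_difference(dfa1, dfa2, alphabet):
    """Return a shortest string accepted by exactly one DFA, or None.

    Each DFA is a (start, accepting_states, delta) triple (a hypothetical
    encoding), where delta maps (state, symbol) -> state and is total.
    """
    start1, accept1, delta1 = dfa1
    start2, accept2, delta2 = dfa2
    start = (start1, start2)
    witness = {start: ""}          # product state -> a string that reaches it
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        p, q = pair
        # Exactly one machine accepts here: the languages differ.
        if (p in accept1) != (q in accept2):
            return witness[pair]
        for sym in alphabet:
            nxt = (delta1[p, sym], delta2[q, sym])
            if nxt not in witness:
                witness[nxt] = witness[pair] + sym
                queue.append(nxt)
    return None                    # no distinguishing product state is reachable


ALPHABET = "abc"

# Assumed DFA for (a+b+c)*: a single accepting state looping on every symbol.
all_strings = (0, {0}, {(0, s): 0 for s in ALPHABET})

# Assumed DFA for ((ab)**c*)*: state 1 means "just read a, need b"; 2 is a dead state.
blocks = (0, {0}, {
    (0, "a"): 1, (0, "b"): 2, (0, "c"): 0,
    (1, "a"): 2, (1, "b"): 0, (1, "c"): 2,
    (2, "a"): 2, (2, "b"): 2, (2, "c"): 2,
})

print(find_difference(all_strings, blocks, ALPHABET))  # a
print(find_difference(blocks, blocks, ALPHABET))       # None
```

Under those assumptions, the search reports the string `a`, which the first DFA accepts and the second rejects. A `None` result means no reachable pair of states disagrees on acceptance, so the two DFAs accept the same language. This also shows why running a few strings through both machines is not a proof: agreement on a few samples says nothing about the strings you did not try.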