Search code examples
regular-languagedfa

Using Closure Properties to prove Regularity


Here's a homework problem:

Is L_4 Regular?
Let L_4 = L*, where L={0^i1^i | i>=1}.

I know L is non-regular and I know that Kleene Star is a closed operation, so my assumption is that L_4 is non-regular.

However my professor provided an example of the above in which L = {0^p | p is prime}, which he said was regular by proving that L* was equal to L(000* + e) by saying each was a subset of one another (e in this case means the empty word).

So his method involved forming a regex of 0^p, but how I can do that when I essentially have one already?


Solution

  • Regular languages are closed under Kleene star. That is, if language R is regular, so is R*.

    But the reasoning doesn't work in the other direction: there are nonregular languages P for which P* is actually regular.

    You mentioned one such P in your question: the set of strings 0^p where p is prime.

    It is easy to use the pumping lemmas for regular and context-free languages to show that P is at least context-sensitive. However, P* is equivalent to the language 0^q, where q is the sum of zero or more primes. But this is true for q=0 (the empty string) and any q>=2, so P* can be recognized with a 3-state DFA, even though P itself is not regular.

    So L being context-free has no bearing on whether your L_4 = L* is regular or not. If you can construct a DFA that recognizes L_4, as I did for P* above, then clearly it's regular. In the process of trying to find a DFA that works, you'll probably see some pattern emerge that can be used as the basis for a pumping argument. The Myhill-Nerode theorem is another approach to proving a language non-regular, and is useful if the language lends itself to analysis of prefixes and distinguishing extensions. If the language can be decomposed into a finite set of equivalence classes under a certain relation, then it can be recognized with a DFA containing that many states.

    Edit: For anyone wondering whether OP's example L_4 is regular or not...it's not, which can be proved using the pumping lemma for regular languages.

    Assume L_4 is regular, with "pumping length" P. Consider the string w=0P1P, which is an element of L_4. We need to decompose it into the form w=xyz, with |y| >= 1 and |xy| <= P. Any choice of xy fulfilling these conditions will consist of all zeroes. But then any string w' = xynz with n != 1 will have mismatched counts of 0s and 1s, and therefore cannot be an element of L_4. So the pumping lemma does not hold, and L_4 cannot be regular.