I can solve 4.2.3 a~e by myself, but f is too hard for me and I can't even find the answer using google.
Design grammars for the following languages:
(f) The set of all strings of 0s and 1s of the form xy, where x ≠ y and x and y are of the same length
The interesting thing about this language is its relationship with
Ldup = { ωω | ω ∈ Σ*}
which is the classic example of a language which is not context-free. But the language from the exercise
Ldiff = { γζ | γ ≠ ζ ∧ γ ∈ Σ* ∧ ζ ∈ Σ* }
has the property that
Ldiff ⋃ Ldup ⋃ Lodd = Σ* and Ldup, Ldiff, Lodd are disjoint.
where
Lodd = { ω | |ω| is odd ∧ ω ∈ Σ*}
In other words, if you split a string into two equal-length parts, the two parts are either equal or unequal, and if you can't split the string that way, its length is odd.
Lodd is clearly context free -- in fact, it is regular -- so the fact that Ldiff is context-free leads to the conclusion that the complement of a context-free language can be not context-free.
Given that, it's natural to think about Ldiff as being composed of the two equal-length pieces γ and ζ. But that frames the problem in a way that makes it difficult, if not impossible, to solve. As with many maths problems (and, I dare say, real-world problems), the solution has to start by reframing the problem to reveal its underlying simplicity. ("Thinking outside of the box", to use the famous cliché.)
Let's say that γ and ζ are both of length n. Since they are different, they must differ in at least one position. Let's say that k is such a position, so that γk ≠ ζk.
Now, let's try to characterise the pairs of strings which satisfy that constraint. All of the symbols other than the ones at position k are irrelevant, so what we end up with is:
ΣkγkΣn-k-1 ΣkζkΣn-k-1
That's obviously the same as
ΣkγkΣk Σn-k-1ζkΣn-k-1
and letting j = n−k−1, a = γk and b = ζk, we get
ΣkaΣk ΣjbΣj, a ≠ b
j and k and the two symbols a and b are all arbitrary in that expression. So in effect, we've described the set of strings which can be divided into two odd-length parts such that the symbols in the middle of the two parts differ. With binary strings, there are only two pairs of symbols which differ: <0, 1>
and <1, 0>
. So that leads to the following grammar:
W → 0 | 1 /* Wildcard */
A → 0 | W A W /* Strings with a 0 in the middle */
B → 1 | W B W /* Strings with a 1 in the middle */
S → A B | B A /* Ldiff */