grammar context-free-grammar computation-theory context-sensitive-grammar

Can a context-sensitive grammar have an empty string?

In one of my CS classes they mentioned that the difference between a context-free grammar and a context-sensitive grammar is that in a CSG, the left side of each production rule must be no longer than the right side.

So, one example they gave was that a context-sensitive grammar can't have a rule that produces the empty string, because the right side would then be shorter than the left side and that length rule wouldn't be satisfied.

However, I have understood that regular grammars are contained in context-free grammars, context-free grammars are contained in context-sensitive grammars, and context-sensitive grammars are contained in recursively enumerable grammars.

So, for example, if a grammar is regular then it is also context-free, context-sensitive and recursively enumerable.

The problem is that, if this is true, a context-free grammar that contains an empty production would not satisfy the rule needed to count as context-sensitive. That looks like a contradiction, because every context-free grammar is supposed to be context-sensitive.

Solution

Empty productions ("lambda productions", so-called because λ is often used to refer to the empty string) can be mechanically eliminated from any context-free grammar, except for the possible top-level production S → λ. The algorithm to do so is presented in pretty well every text on formal language theory.
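As a rough illustration of that elimination, here is a minimal Python sketch. The representation is my own assumption, not something from a particular textbook: a grammar is a dict from nonterminal to a set of right-hand sides, each a tuple of symbols, with the empty tuple standing for λ. The function name `eliminate_lambda` and the fresh start symbol (the old start name with "0" appended, assumed not to be in use already) are hypothetical.

```python
from itertools import combinations

def eliminate_lambda(rules, start):
    """Return (new_rules, new_start) with no lambda productions,
    except possibly new_start -> lambda."""
    # Step 1: find the nullable nonterminals (those that can derive lambda).
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head in nullable:
                continue
            # An empty body counts as nullable: all() over nothing is True.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True

    # Step 2: for every body, add each variant obtained by dropping some
    # subset of its nullable symbols; throw away variants that become empty.
    new_rules = {}
    for head, bodies in rules.items():
        out = set()
        for body in bodies:
            positions = [i for i, sym in enumerate(body) if sym in nullable]
            for k in range(len(positions) + 1):
                for drop in combinations(positions, k):
                    variant = tuple(s for i, s in enumerate(body) if i not in drop)
                    if variant:
                        out.add(variant)
        new_rules[head] = out

    # Step 3: if the language contains lambda, add a fresh start symbol
    # (assumed unused) so that the only lambda rule is at the top level.
    if start in nullable:
        fresh = start + "0"
        new_rules[fresh] = {(start,), ()}
        start = fresh
    return new_rules, start

# Example: S -> a S b | A,  A -> lambda | c
grammar = {"S": {("a", "S", "b"), ("A",)}, "A": {(), ("c",)}}
new_grammar, new_start = eliminate_lambda(grammar, "S")
```

In the example, both A and S are nullable, so the result has S → a S b | a b | A, A → c, and a new top-level S0 → S | λ. Every rule other than S0 → λ is non-contracting.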

So for any CFG with lambda productions, there is an equivalent CFG without lambda productions which generates the same language, and which is also a context-sensitive grammar. Thus, the prohibition on contracting rules in CSGs does not affect the hierarchy of languages: any context-free language is a context-sensitive language.

Chomsky's original definition of context-sensitive grammars did not specify the non-contracting property, but rather an even more restrictive one: every production had to be of the form αAβ→αγβ where A is a single symbol and γ is not empty. This set of grammars generates the same set of languages as non-contracting grammars (that was also proven by Chomsky), but the two sets of grammars are not the same. Also, his context-free grammars were indeed a subset of context-sensitive grammars because by his original definition of CFGs, lambda productions were prohibited. (The 1959 paper is available online; see the Wikipedia article on the Chomsky hierarchy for a reference link.)

It is precisely the existence of a non-empty context (α and β) which leads to the names "context-sensitive" and "context-free"; it is much less clear what "context-sensitive" might mean with respect to an arbitrary non-contracting rule such as AB→BA. (Note 1)
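To make the difference between the two definitions concrete, here is a small Python sketch that tests a single rule against each of them. The function names and the tuple-of-symbols representation are my own assumptions for illustration.

```python
def is_noncontracting(lhs, rhs):
    """Modern definition: the left side is non-empty and no longer than the right side."""
    return 1 <= len(lhs) <= len(rhs)

def is_chomsky_context_sensitive(lhs, rhs, nonterminals):
    """True if lhs -> rhs can be read as alpha A beta -> alpha gamma beta,
    with A a single nonterminal and gamma non-empty."""
    for i, sym in enumerate(lhs):
        if sym not in nonterminals:
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        # gamma is whatever is left of rhs once alpha and beta are matched.
        gamma_len = len(rhs) - len(alpha) - len(beta)
        if gamma_len < 1:
            continue
        if rhs[:len(alpha)] == alpha and rhs[len(rhs) - len(beta):] == beta:
            return True
    return False

N = {"A", "B"}
swap = (("A", "B"), ("B", "A"))            # AB -> BA
in_context = (("a", "A", "b"), ("a", "B", "b"))  # aAb -> aBb
erase = (("A",), ())                        # A -> lambda

print(is_noncontracting(*swap), is_chomsky_context_sensitive(*swap, N))
print(is_noncontracting(*in_context), is_chomsky_context_sensitive(*in_context, N))
print(is_noncontracting(*erase), is_chomsky_context_sensitive(*erase, N))
```

AB→BA passes the non-contracting test but fails the αAβ→αγβ test, because no single symbol can be picked out as "the one being rewritten" with the rest left unchanged. The lambda rule fails both.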

In short, the claim that "every CFG is a CSG" is not technically correct given the common modern usage of CFG and CSG, as cited in your question. But it is only a technicality: the CFG with lambda productions can be mechanically transformed, just as a non-contracting grammar can be mechanically transformed into a grammar fitting Chomsky's definition of context-sensitive (see the Wikipedia article on non-contracting grammars).
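For the second transformation, here is a sketch of one standard way to split a non-contracting rule into rules that each rewrite a single symbol in context. It assumes the left side has already been reduced to nonterminals only, and that the fresh symbols Z1, Z2, ... (hypothetical names) are not used anywhere else in the grammar; in a real grammar each original rule would need its own set of fresh symbols.

```python
def to_context_sensitive(lhs, rhs, fresh_prefix="Z"):
    """Split lhs -> rhs (2 <= len(lhs) <= len(rhs), lhs all nonterminals)
    into rules that each change exactly one symbol and keep the rest as context."""
    m = len(lhs)
    assert 2 <= m <= len(rhs)
    z = tuple(f"{fresh_prefix}{i}" for i in range(1, m + 1))
    out = []
    # Phase 1: replace the left-side symbols by fresh ones, left to right.
    for i in range(m):
        out.append((z[:i] + lhs[i:], z[:i + 1] + lhs[i + 1:]))
    # Phase 2: replace the fresh symbols by right-side symbols, left to right.
    for i in range(m - 1):
        out.append((rhs[:i] + z[i:], rhs[:i + 1] + z[i + 1:]))
    # The last fresh symbol expands to the whole remaining tail of rhs.
    out.append((rhs[:m - 1] + z[m - 1:], rhs))
    return out

steps = to_context_sensitive(("A", "B"), ("B", "A"))
for left, right in steps:
    print(" ".join(left), "->", " ".join(right))
```

For AB→BA this yields four rules: A B → Z1 B, Z1 B → Z1 Z2, Z1 Z2 → B Z2 and B Z2 → B A. Each one rewrites a single symbol while its neighbour stays put, so each fits the αAβ→αγβ shape.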

(It is also quite common to allow both context-sensitive and context-free languages to include the empty string, by adding an exception for the rule S→λ to both CFG and CSG definitions.)

Notes

In Chomsky's formulation of context-free and context-sensitive grammars, it was unambiguous what was meant by a parse tree, even for a CSG; since Chomsky is a linguist and was seeking a framework to explain the structure of natural language, the notion of a parse tree mattered. It is not at all obvious how you might describe the result of AB → BA as a parse tree, although it is quite clear in the case of, for example, a A b → a B b.