Search code examples
parsingcompiler-theory

Are there such a thing as LL(0) parsers?



I saw a question somewhere asking the difference between LL(0) and LR(0) parsers. Is there such a thing as LL(0) parsers? If so, how do they parse without looking at any token?


Solution

  • LL(0) parsers do look at the tokens, but they don't decide which productions to apply upon them. They just determine if the sequence belongs to the language or not. This means that every non-terminal symbol must have a single right-hand side and that there may be no recursion.

    G == ID name lastname
    name == STRING
    lastname == STRING
    
    # lexer rules
    # -----------
    ID == [0-9]+
    STRING == <unicode>+
    

    Note that, as mentioned by @280Z28, a separate lexer is needed to deal with the variable length parts (ID and STRING), or the grammar will not be LL(0).

    The sequence of productions to apply to parse an input with that grammar requires zero lookahead.

    A lookahead is required for determinism when there's more than one production that could be applied after part of the given input sequence has been parsed.

    In theory, a grammar generates a language, and, in that, ambiguity (having more than one way to derive a given phrase) is fine. In parsing, having one-and-only-one way is in the way of semantics (meaning), and it is what we want.

    In parsing of programming languages, a lookahead is the information required to know which grammar production to use next.

    In LL(0) languages, there are no choices, so the input sequence is either accepted and parsed, or rejected.