Are there such a thing as LL(0) parsers?
LL(0) parsers do look at the tokens, but they don't decide which productions to apply upon them. They just determine if the sequence belongs to the language or not. This means that every non-terminal symbol must have a single right-hand side and that there may be no recursion.
G == ID name lastname
name == STRING
lastname == STRING
# lexer rules
# -----------
ID == [0-9]+
STRING == <unicode>+
Note that, as mentioned by @280Z28, a separate lexer is needed to deal with the variable length parts (ID
and STRING
), or the grammar will not be LL(0)
.
The sequence of productions to apply to parse an input with that grammar requires zero lookahead.
A lookahead is required for determinism when there's more than one production that could be applied after part of the given input sequence has been parsed.
In theory, a grammar generates a language, and, in that, ambiguity (having more than one way to derive a given phrase) is fine. In parsing, having one-and-only-one way is in the way of semantics (meaning), and it is what we want.
In parsing of programming languages, a lookahead is the information required to know which grammar production to use next.
In LL(0) languages, there are no choices, so the input sequence is either accepted and parsed, or rejected.
The k in LR(k) refers to the number of lookahead tokens. You always use at least one token in order to determine the action to perform. The Wikipedia page page has some more information on this.
Intuitively, the extra lookahead symbols let you make reduction choices with more information, so they allow larger classes of grammars to be expressed without conflicts.