S -> Abc|aAcb
A -> b|c|ε
I think the first one is LL(1)
S -> aAS|b
A -> a|bSA
But the problem is second one. There's no conflict problem, but I think it doesn't satisfy right-recursion. I'm not sure about those problems.
The first grammar is not LL(1), because of several conflicts, as for example for input bc
, an LL parser will need 2 tokens of look-ahead to parse it:
S
A
b
c
A
b
is expected, but the current token is c
A
b
A
without recognizing anything (because of the epsilon)b
character after the reference to rule A
that was "jumped" without the use of any tokenc
c
You have a similar case in the second alternative of S
for an input acb
. The grammar is not ambiguous, because in the end there is only one possible syntax tree. Its not LL(1), as in fact is LL(2).
The second grammars is deterministic - there is only one way to parse any input that is valid according to the grammar. This means that it can be used for parsing by an LL(1) parser.
I have made a tool (Tunnel Grammar Studio) that detects the grammar conflicts for nondeterministic grammars and generates parsers. This grammar in ABNF (RFC 5234) like syntax is:
S = 'a' A S / 'b'
A = 'a' / 'b' S A
The right recursion by itself does not create ambiguities inside the grammar. One way to have a right recursion ambiguity is to have some dangling element as in this grammar:
S = 'c' / 'a' S 0*1 'b'
You can read it as: rule S
recognizes character c
or character a
followed by rule S
itself and maybe (zero or one time) followed by character b
.
The grammar up has a right recursion related ambiguity because of the dangling b
character. Meaning that for an input aacb
, there is more then one way to parse it: recognize the first a
in S
; enter into S
again; recognize the second a
; enter again into S
and recognize c
; exit S
one time; then there are two choices:
case one) recognize the b
character, exit S
two times or case two) first exit S
one time and then recognize the b
character. Both cases are these (screen shots from the visual debugger of TGS):
This grammar is thus ambiguous (i.e. not LL(1)) because more then one syntax tree can be generated for some valid inputs. For this input the possible trees are only two, but for an input aaacb
there are three trees, as there are three trees for aaacbb
, because of the possible 3 places where you can 'attach' the two b
characters, two of this places will have b
, and one will remain empty. For input aaacbbb
there is of course only one possible syntax tree, but the grammar is defined to be ambiguous if there is at least one input for which there is more then one possible syntax tree.