In shift reduce parsing why the handle always eventually appear on top of the stack and never inside?

I was going through the text Compilers Principles, Techniques and Tools by Ullman et. al where I came across the excerpt where the authors try to justify why stack is the best data structure of shift reduce parsing. They said that it is so because of the fact that

dagger "The handle will always eventually appear on top of the stack, never inside."

The Excerpt

This fact becomes obvious when we consider the possible forms of two successive steps in any rightmost derivation. These two steps can be of the form

In case (1), A is replaced by , and then the rightmost nonterminal B in that right side is replaced by . In case (2), A is again replaced first, but this time the right side is a string y of terminals only. The next rightmost nonterminal B will be somewhere to the left of y.

Let us consider case (1) in reverse, where a shift-reduce parser has just reached the configuration

The parser now reduces the handle to B to reach the configuration

in which is the handle, and it gets reduced to A

In case (2), in configuration

the handle is on top of the stack. After reducing the handle to B, the parser can shift the string xy to get the next handle y on top of the stack,

Now the parser reduces y to A.

In both cases, after making a reduction the parser had to shift zero or more symbols to get the next handle onto the stack. It never had to go into the stack to find the handle. It is this aspect of handle pruning that makes a stack a particularly convenient data structure for implementing a shift-reduce parser.

My reasoning and doubts

Intuitively this is how I feel that the statement in dagger can be justified

If there is an handle on the top of the stack, then the algorithm, will first reduce it before pushing the next input symbol on top of the stack. Since before the push any possible handle is reduced, so there is no chance of an handle being on the top of the stack and then pushing a new input symbol thereby causing the handle to go inside the stack.

Moreover I could not understand the logic the authors have given in highlighted portion of the excerpt justifying that the handle cannot occur inside the stack, based on what they say about B and other facts related to it.

Please can anyone help me understand the concept.

Solution

The key to the logic expressed by the authors is in the statement at the beginning (emphasis added):

This fact becomes obvious when we consider the possible forms of two successive steps in any rightmost derivation.

It's also important to remember that a bottom-up parser traces out a right-most derivation backwards. Each reduction performed by the parser is a step in the derivation; since the derivation is rightmost the non-terminal being replaced in the derivation step must be the last non-terminal in the sentential form. So if we write down the sequence of reduction actions used by the parser and then read the list backwards, we get the derivation. Alternatively, if we write down the list of productions used in the rightmost derivation and then read it backwards, we get the sequence of parser reductions.

Either way, the point is to prove that the successive handles in the derivation steps correspond to monotonically non-retreating prefixes in the original input. The authors' proof takes two derivation steps (any two derivation steps) and shows that the end of the handle of the second derivation step is not before the end of the handle of the first step (although the ends of the two handles may be at the same point in the input).