Search code examples
bisonyacc

Lex and YACC (FLEX and Bison)


Hello I need to understand flex and bison in order to rewrite to QRegExp the following question arose. If the token is suitable for several teams at once, then how will YACC / BISON act? For example, there is a FLOAT_NUM token, there are two commands:

  • the first one needs two tokens (STARTED and FLOAT_NUM)

     title: STARTED FLOAT_NUM
        {
         .
         .
         .
        }
    
  • the second command only needs the FLOAT_NUM token

      my_type: FLOAT_NUM
         {
          .
          .
          .
         }
    

Accordingly, the following expression comes to YACC/Bison:

STARTED FLOAT_NUM

Am I correct in understanding that only the 'title' command will work, and the 'my_type' command will work only when one FLOAT_NUM arrives and nothing else?


Solution

  • Yacc generates a state machine that takes one token at the time and decides on the next state based on the current state and the token.

    For example, in your case, if it get token STARTED it probably goes into a state that could be described as "I am in the terminal title at the position 1 and I expect token FLOAT_NUM now". To answer the question what happens if a token X arrives, you always need to first know in which state the parser is currently.

    It's hard to work with an incomplete example, so I'll assume your complete program looks like this:

    %token STARTED FLOAT_NUM
    
    %start main
    
    %%
    
    main: title | my_type {}
    title: STARTED FLOAT_NUM {}
    my_type: FLOAT_NUM {}
    

    Parsing starts at the beginning of the start terminal (main in this case) and it ends when it reaches the end of the start terminal.

    So, in this case the start terminal can consist of either title or my_type, so the parser expect that the first token is either STARTED or FLOAT_NUM. When it gets either of them, the parser can unambiguously know if it is currently in title or my_type. If the example would be different and they started with the same token, the parser would just go into a state "either title or my_type, give me second token now" and the parsing would continue.

    You can see what are all the generated states and how they change into each other, by running bison with the option -r all which creates a human-readable file with a complete report of the generated state machine.

    That are the complete basics. I hope it helped.

    If this isn't enough, I suggest to read the bison manual, especially the section describing the state machine.