Flex & Bison C++


I want to start a Flex & Bison translator that reads from a given input file and writes to an output file, labelling each piece of input it recognizes. For example, if I give it "This is a string" 12 4.5, the output file will be

String > "This is a string"

Space > *

Integer > 12

Space > *

Float > 4.5

The problem is that I am still trying to understand the basics underneath all of this. So far I have gotten to the point of reading the input and output file names and opening the files. I am working in Visual Studio, so I have set the command line arguments to exe input.txt output.txt, and I handle them in the main function in the Grammar.y file. After that I am trying to write some regular expressions, together with what to return when the yylex() function finds a match. I am providing the code, but I have two problems so far.

First, the compiler complains that I have not declared the tokens, even though I have put them in the .y file, as shown below.

Second, I feel like I don't know what I am doing wrong, and it does not work at all, so any advice would be helpful.

This is my Grammar.l file

%option noyywrap

%{
    #include <iostream>
    #include <stdio.h>
    #include <stdlib.h>
    #include "Grammar.tab.h"
    #define YY_DECL int yylex(yy::parser::semantic_type *yylval)
    FILE *fout;

    using namespace std;
%}

%%

[0-9]+                          { yylval->ival = atoi(yytext); return INTEGER; }
[0-9]+"."[0-9]+ | "."?[0-9]?    { yylval->fval = atof(yytext); return FLOAT; }
[a-zA-Z0-9]+                    { yylval->sval = yytext; return STRING; }
" "*                            { return SPACE; }
"\t"*                           { return TAB; }

%%

This is the Grammar.y file

%language "C++"
%start root

%{
    #include <stdio.h>
    #include <stdlib.h>
    #include "Grammar.tab.h"
    #define SIZE 512

    using namespace std;

    extern "C" int yylex();
    extern "C" FILE *yyin;
    extern "C" FILE *fout;

    extern int yylex(yy::parser::semantic_type *yylval);
%}

%union{
    int ival;
    float fval;
    char* sval;
}

%token <ival> INTEGER
%token <fval> FLOAT
%token <sval> STRING
%token SPACE
%token TAB

%%

root : ;

%%

void main( int argc, char ** argv){

    if ( argc < 4 ){
        printf("\nError!!! Missing Command line arguments\nUsage exe <inputfile> <outputfile>");
        exit(1);
    }
    else{
        fopen_s(&yyin, argv[3],"r");
        if (yyin == NULL) {
            printf("\033[0;31mError oppening input file.\033[0m");
        }

        fopen_s(&fout, argv[4],"r");
        if (fout == NULL) {
            printf("\033[0;31mError oppening output file.\033[0m");
        }

        do
        {
            yylex();
        }while(!feof(yyin));
        fclose(yyin);
    }
    fclose(fout);
}

namespace yy{
    void parser::error (const location_type& loc, const std::string& msg){
        std::cerr << "error at " << loc << ": " << msg << std::endl;
    }
}

Solution

  • When you request a C++ parser, bison keeps the token types out of the global namespace. This makes the usage quite different from most of the examples you'll find on the internet, which assume the C interface.

    So instead of just using INTEGER, for example, you need to specify the full name:

    [0-9]+                          { yylval->ival = atoi(yytext);
                                      return yy::parser::token::INTEGER; }
    

    You could shorten that a bit with a using declaration or a type alias for yy::parser::token in the prologue.
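
To illustrate the shortening idea without running Bison, here is a minimal sketch. The `yy::parser::token` struct below is only a stand-in for what the generated Grammar.tab.h would provide (the exact layout and numeric values depend on the Bison version and are assumptions here), and `classify_digits` is a hypothetical helper, not part of any generated code.

```cpp
// Stand-in for the declarations Bison would emit in Grammar.tab.h.
// Assumption: tokens live in an enum nested inside yy::parser::token;
// the numeric values are illustrative only.
namespace yy {
class parser {
public:
    struct token {
        enum yytokentype { INTEGER = 258, FLOAT = 259, STRING = 260, SPACE = 261, TAB = 262 };
    };
};
}  // namespace yy

// Type alias that would go in the prologue of Grammar.l:
// afterwards token::INTEGER can be written instead of the full name.
using token = yy::parser::token;

// Hypothetical helper standing in for a scanner action such as
//   [0-9]+  { ...; return token::INTEGER; }
int classify_digits() {
    return token::INTEGER;
}
```

With the alias in place, each scanner action only needs `return token::INTEGER;` and so on, and the values are the same enumerators as the fully qualified names.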

    Your compiler will also complain about the call to yylex inside your main function. Note that you have declared yylex as:

    extern int yylex(yy::parser::semantic_type *yylval);
    

    which means that it expects a single argument which is a pointer to a yy::parser::semantic_type (i.e., the union described by your %union declaration). In order to call the function, then, you need a pointer to such an object, which means that you need an instance of that object to point at:

        yy::parser::semantic_type yylval;
        int token;
        do
        {
            token = yylex(&yylval);
            /* Here you need to do something in order to print
             * the token and possibly the associated semantic value.
             * So you'll probably need something like this:
             */
            switch(token) {
              case yy::parser::token::INTEGER: {
                fprintf(fout, "INTEGER: %d\n", yylval.ival);
                break;
              }
              // ...
              case 0: break;
            }            
        } while(token);
    

    Note that I changed the feof test so that the loop instead terminates when yylex returns 0, which is how yylex signals that it has reached the end of input.
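
The loop described above can be tried out in isolation with a self-contained sketch like the following. Everything in it is a stand-in: the `yy::parser` declarations mimic what Grammar.tab.h would contain (the layout and values are assumptions), the hand-written `yylex` replaces the Flex-generated scanner with a fixed token sequence, and `dump_tokens` is a hypothetical helper that writes to a string instead of `fout`.

```cpp
#include <sstream>
#include <string>

// Stand-ins for the Bison-generated declarations (illustrative values only).
namespace yy {
class parser {
public:
    union semantic_type { int ival; float fval; char* sval; };
    struct token { enum yytokentype { INTEGER = 258, FLOAT = 259 }; };
};
}  // namespace yy

// Hypothetical replacement for the Flex-generated scanner: it returns
// INTEGER 12, then FLOAT 4.5, then 0 to signal end of input.
// The static counter means the sequence is produced only once per program run.
int yylex(yy::parser::semantic_type* yylval) {
    static int call = 0;
    switch (call++) {
        case 0: yylval->ival = 12;   return yy::parser::token::INTEGER;
        case 1: yylval->fval = 4.5f; return yy::parser::token::FLOAT;
        default: return 0;
    }
}

// Hypothetical helper: drives yylex until it returns 0 and labels each token.
std::string dump_tokens() {
    std::ostringstream out;
    yy::parser::semantic_type yylval;  // the object yylex fills in
    int token;
    while ((token = yylex(&yylval)) != 0) {
        switch (token) {
            case yy::parser::token::INTEGER:
                out << "Integer > " << yylval.ival << '\n';
                break;
            case yy::parser::token::FLOAT:
                out << "Float > " << yylval.fval << '\n';
                break;
            default:
                out << "Unknown token " << token << '\n';
                break;
        }
    }
    return out.str();
}
```

Note that `yylval` is an object here, so its members are read with `.` in the loop, while inside `yylex` the parameter is a pointer and uses `->`. In the real program the `out <<` lines would become writes to the output file.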