I am working on a console application. To create an interpreter I'm using Flex and Bison. I created a grammar, but I get a "Syntax Error" without any other explanation every time I try to parse a string. The string that I am trying with is: MKDISK -PATH=./home/erick/disk.dk -u=k -size=1000\n
I know that there is an issue with the production:
comando : MKDISK lista_param
{
printf("Mkdisk con parametros\n");
Mkdisk m;
m.agregarParametros($2);
m.assignParameters();
}
;
I noticed that if I add a production without lista_param, just MKDISK, it works, and the parser always chooses that production, even if the string matches the other one.
parser.yy:
%skeleton "lalr1.cc" /* -*- C++ -*- */
%defines
%define api.parser.class {Parser}
%define api.token.constructor
%define api.value.type variant
%define parse.trace
%define parse.error verbose
%param { Driver& driver }
%code requires
{
class Driver;
class Comando;
class Parametro;
class Mkdisk;
}
%{
using namespace std;
#include <stdio.h>
#include <iostream>
#include <string>
#include <vector>
#include "driver.h"
%}
/******* TERMINALES ********/
%token <std::string> NUM"NUM" SIZE"SIZE" F"F" PATH"PATH" U"U" BF"BF" FF"FF" WF"WF" K"K" M"M" RUTA"RUTA" MKDISK"MKDISK" RMDISK"RMDISK"
%token GUION"GUION" IGUAL"IGUAL"
/******* NO TERMINALES ********/
%start inicio;
%type <Parametro> parametro
%type <Comando> comando
%type <std::vector<Parametro>> lista_param
%type <std::string> atributo nom_param
%%
inicio : lista_comandos "\n"
{
printf("Primer nivel del arbol\n");
}
;
lista_comandos : lista_comandos comando
{
printf("Lista de comandos\n");
}
| comando
{
printf("Comando individual\n");
}
;
comando : MKDISK lista_param
{
printf("Mkdisk con parametros\n");
Mkdisk m;
m.agregarParametros($2);
m.assignParameters();
}
;
lista_param : lista_param parametro
{
printf("Lista de parametros\n");
$$=$1;
$$.push_back($2);
}
| parametro
{
printf("parametro individual\n");
vector<Parametro> params;
params.push_back($1);
$$ = params;
}
;
parametro : GUION nom_param IGUAL atributo
{
printf("Quinto nivel del arbol\n");
Parametro param;
param.setNombre($2);
param.setValor($4);
$$ = param;
}
;
nom_param : SIZE { $$=$1; }
| F { $$=$1; }
| PATH { $$=$1; }
| U { $$=$1; }
;
atributo : NUM { $$=$1; }
| BF { $$=$1; }
| FF { $$=$1; }
| WF { $$=$1; }
| K { $$=$1; }
| M { $$=$1; }
| RUTA { $$=$1; }
;
%%
void yy::Parser::error( const std::string& error){
std::cout <<"\e[0;31m"<< error << std::endl;
}
lexer.l:
%{
#include <stdio.h>
#include <string>
#include "driver.h"
#include "parser.tab.hh"
%}
%option case-insensitive
%option noyywrap
%option outfile="scanner.cc"
DIGIT [0-9]
NUM {DIGIT}+("."{DIGIT}+)?
PATH \"?(\/([^\/\n])*)+\"?
%%
"MKDISK" { return yy::Parser::make_MKDISK(yytext); }
"RMDISK" { return yy::Parser::make_RMDISK(yytext); }
"SIZE" { return yy::Parser::make_SIZE(yytext); }
"F" { return yy::Parser::make_F(yytext); }
"PATH" { return yy::Parser::make_PATH(yytext); }
"U" { return yy::Parser::make_U(yytext); }
{NUM} { return yy::Parser::make_NUM(yytext);}
"BF" { return yy::Parser::make_BF(yytext); }
"FF" { return yy::Parser::make_FF(yytext); }
"WF" { return yy::Parser::make_WF(yytext); }
"K" { return yy::Parser::make_K(yytext); }
"M" { return yy::Parser::make_M(yytext); }
{PATH} { return yy::Parser::make_RUTA(yytext); }
"-" { return yy::Parser::symbol_type(); }
"=" { return yy::Parser::symbol_type(); }
[[:blank:]] {}
. { printf("Caracter no reconocido: %s\n",yytext);}
%%
void Driver::runScanner(){
yy_flex_debug = false;
yyin = fopen (file.c_str (), "r");
if(yyin == NULL){
printf("No se encontro el archivo de entrada");
exit(1);
}
}
void Driver::runScannerWithText(std::string text){
yy_flex_debug = true;
YY_BUFFER_STATE buffer = yy_scan_string(text.c_str());
}
void Driver::closeFile(){
fclose(yyin);
}
Although you don't include driver.cc or driver.hh in your question, I suspect that they are adapted from the example C++ code in the Bison manual. That code allows you to enable either scanner or parser tracing using command line flags. If you didn't include that part of the example code, I strongly suggest that you put it back in and enable the tracing. You'll find it much easier to see what is going on.
The immediate problem here is that when your scanner sees a -, it executes the action:
"-" { return yy::Parser::symbol_type(); }
which sends an empty token to the parser. Empty tokens are not valid tokens, so the parser complains. Here's the trace (created by invoking the executable with the flag -p):
Starting parse
Entering state 0
Stack now 0
Reading a token
MKDISK -PATH=./home/erick/disk.dk -u=k -size=1000
Next token is token MKDISK (MKDISK)
Shifting token MKDISK (MKDISK)
Entering state 1
Stack now 0 1
Reading a token
Next token is empty symbol <====== HERE
syntax error
Error: popping token MKDISK (MKDISK)
Stack now 0
Stack now 0
Apparently, bison does not even attempt to create a meaningful error message when it encounters a problem like that.
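A minimal sketch of the corrected scanner actions, assuming that the GUION and IGUAL tokens declared in parser.yy are the ones meant for - and =. With api.token.constructor enabled, Bison generates a make_ function for each declared token. For a token that carries no semantic value, in a parser that (like this one) does not enable locations, that function takes no arguments:

```lex
"-"   { return yy::Parser::make_GUION(); }
"="   { return yy::Parser::make_IGUAL(); }
```

These two lines would replace the existing "-" and "=" rules in lexer.l; the rest of the scanner is unchanged.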
In addition to fixing the - and = actions, you need to do something about:
inicio : lista_comandos "\n"
Although legal, that cannot work. The scanner doesn't even respond to newline characters (not even by declaring them illegal) because no scanner rule applies. (I like to use %option nodefault so that flex will warn me when I've missed some possible input.) But even if the scanner did detect a newline character, it has no way of knowing how to send the parser a "\n", because that token has no name. Since the scanner can't send the token, the rule can never match.
You'll have to create a named newline token and use it in that rule instead of "\n". And, of course, you'll have to get the scanner to send that token when it reads a newline.
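One way this could look is sketched below. The token name NEWLINE and its alias are hypothetical; any valid identifier would do. Only the lines that change are shown. In parser.yy, declare the token and use it in the start rule:

```yacc
/* NEWLINE is a hypothetical name chosen for this sketch */
%token NEWLINE "end of line"

%%

inicio : lista_comandos NEWLINE
       {
         printf("Primer nivel del arbol\n");
       }
       ;
```

In lexer.l, add a rule that sends the token, and optionally turn on nodefault so that flex reports any input no rule can match:

```lex
%option nodefault

%%

\n    { return yy::Parser::make_NEWLINE(); }
```

As with the other valueless tokens, make_NEWLINE takes no arguments here because the token has no semantic type and the parser does not enable locations.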
By the way, there's very little point giving tokens an alias which is exactly the same as the token name. The point of aliases is to provide better token names in error messages; if you don't give a token an alias, the parser will use the token name as is. So the only point of providing an alias is if it is more readable than the token name. Aliases do not have any other use.
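As an illustration, the token declarations could be reduced to something like the following sketch. The aliases chosen for GUION and IGUAL are only a suggestion, assuming those tokens stand for the - and = characters:

```yacc
/* No aliases: the token names are already readable in error messages */
%token <std::string> NUM SIZE F PATH U BF FF WF K M RUTA MKDISK RMDISK

/* Aliases that are more readable than the token names */
%token GUION "-" IGUAL "="
```

With these declarations, an error message that mentions one of the punctuation tokens would show the character itself rather than GUION or IGUAL.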