
Issues with parsing 'newline' in Ragel


I am using Ragel with C++ as the host language to parse a few commands. The commands are read from a file and then parsed.

The syntax of the command is as follows:

Signal_representation {
[<signal_encoding_type_name>: <signal_name> ([, <signal_name>]) ;]
}

In the syntax above, a newline may appear after the ':' or after the ',' that follows a signal name.

Example:

#1  Signal_representation{
#2      Activity:
#3        Button_Active,
#4        Buttons_Inactive;
#5      Switch:
#6        Horn,
#7        Up_Arrow,
#8        Right_Arrow,
#9        Down_Arrow,
#10       Audio,            
#11       Day_Night, Sleep, SWM_Off;
#12  }
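To make the intended behaviour concrete, here is a minimal hand-written C++ sketch (not Ragel) of what the body between the braces is supposed to accept, with newlines treated like any other whitespace. The names SignalGroup and parseSignalGroups are hypothetical, and unlike the single-entry grammar in the question, this sketch loops over several "<encoding>: <names>;" entries.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One "<encoding>: <name>, <name>, ...;" entry (hypothetical type).
struct SignalGroup {
    std::string encoding;
    std::vector<std::string> names;
};

// Same character set as BARE_STRING in the grammar: [a-zA-Z0-9_.\-]
static bool isBareChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) ||
           c == '_' || c == '.' || c == '-';
}

// Skips spaces, tabs and newlines alike.
static void skipSpace(const std::string& s, std::size_t& i) {
    while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
}

// Reads the longest run of BARE_STRING characters starting at i.
static std::string readBare(const std::string& s, std::size_t& i) {
    std::size_t start = i;
    while (i < s.size() && isBareChar(s[i])) ++i;
    return s.substr(start, i - start);
}

// Parses zero or more entries from the text between the braces.
// Returns false on the first syntax error.
bool parseSignalGroups(const std::string& body, std::vector<SignalGroup>& out) {
    std::size_t i = 0;
    for (;;) {
        skipSpace(body, i);
        if (i == body.size()) return true;      // nothing left: success
        SignalGroup g;
        g.encoding = readBare(body, i);
        if (g.encoding.empty()) return false;   // expected an encoding name
        skipSpace(body, i);
        if (i == body.size() || body[i] != ':') return false;
        ++i;
        for (;;) {
            skipSpace(body, i);                 // a newline after ':' or ',' is fine
            std::string name = readBare(body, i);
            if (name.empty()) return false;     // expected a signal name
            g.names.push_back(name);
            skipSpace(body, i);
            if (i == body.size()) return false; // missing ';'
            if (body[i] == ',') { ++i; continue; }
            if (body[i] == ';') { ++i; break; }
            return false;
        }
        out.push_back(g);
    }
}
```

Passing the text of lines #2 to #11 as one string should yield two groups, Activity and Switch, provided the newlines are still in that string.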

Here is the Ragel grammar I use to parse the commands above.

action string_error {
    cout << " ERROR::Expected string characters at line = "<< g_ReadLineNbr << endl;
}

action incr_Count {
    //increment count to trace back and retrieve the string encountered
    iGenrlCount++;
}

action getString {
    std::stringstream str;
    while (iGenrlCount > 0)
    {
        str << *(p - iGenrlCount);
        iGenrlCount--;
    }
    str >> GeneralStr; //push the values
}

action getSglEncTyp {
    cout << "Enc type = " << GeneralStr<< endl;
    GeneralStr.clear();
}

action getSgnlName {
    cout << "Signal name = " << GeneralStr<< endl;
    GeneralStr.clear();
}

action getSgnlRepr {
    cout << "parse ok" << endl;
}

action parse_error {
    cout << "parse failed" << endl;
}

# my definition of the Ragel grammar

OPEN_BRACES = '{';
BARE_STRING = ([a-zA-Z0-9_\.\-]+) $incr_Count %getString >!(string_error);
CLOSE_BRACES = '}';

# parsing starts from the parameter <signal_encoding_type_name>

signal_repr =  (space* BARE_STRING%getSglEncTyp space* ':' space* BARE_STRING%getSgnlName (space* ',' space* BARE_STRING%getSgnlName)* space* ';' space*)%/getSgnlRepr $!parse_error;

main := signal_repr | space* ;



// global variables in the C++ program, visible across all actions
string GeneralStr;
int iGenrlCount = 0;
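As a side note on the getString action: a common alternative to counting characters and walking back from p is to remember where the token starts and copy the range when it ends. The sketch below shows only the C++ side. TokenMark and the action name markStart are hypothetical, and it assumes the whole token sits in a single buffer.

```cpp
#include <string>

// Hypothetical helper: remembers where a token starts and copies it out
// when the token ends, instead of counting characters one by one.
struct TokenMark {
    const char* start = nullptr;

    // Call from an entering action (for example a hypothetical `>markStart`)
    // with the current value of p.
    void begin(const char* p) { start = p; }

    // Call from a leaving action (for example `%getString`) with the current
    // value of p, which then points one past the last character of the token.
    std::string end(const char* p) const { return std::string(start, p); }
};
```

With this approach the per-character incr_Count action would no longer be needed, but the stored pointer becomes invalid if the buffer is refilled in the middle of a token.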

The problem is with newlines in the file. For the example above, I get the following error: ERROR::Expected string characters at line = 2

According to the Ragel 6.10 documentation, the built-in space machine matches the following:

Whitespace. [\t\v\f\n\r ]
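The set quoted above is the same six characters that std::isspace accepts in the default "C" locale, so '\n' is covered by space. A small sketch to confirm this on the C++ side (asciiWhitespace is a hypothetical helper name):

```cpp
#include <cctype>
#include <string>

// Returns the characters in the range 0..127 that std::isspace accepts
// in the current locale (the "C" locale unless the program changes it).
std::string asciiWhitespace() {
    std::string out;
    for (int c = 0; c < 128; ++c) {
        if (std::isspace(c)) out.push_back(static_cast<char>(c));
    }
    return out;
}
```

If newline is matched by space, the error more likely comes from how the input reaches the machine than from the character class itself.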

I have also tried replacing space with the following FSM:

_CR = ('\r' | '\n' | '\r\n' );

but that does not work either.

Has anyone faced a similar situation? I see some questions on Stack Overflow about Ragel and newlines, but they do not seem to address this particular issue.
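Since neither space nor an explicit newline machine helps, one thing worth ruling out is how the file is read. This is an assumption, because the driver code is not shown: if each line is fetched with std::getline and parsed on its own, the '\n' is dropped before Ragel ever sees it, and the machine gets "Activity:" followed by end of input. The sketch below contrasts the two ways of reading (readLines and readAll are hypothetical names).

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads the stream line by line; std::getline discards the '\n' delimiter.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}

// Reads the whole stream into one buffer, newlines included.
std::string readAll(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With the whole buffer, p and pe can span the entire text, and eof is set to pe only for the last block. As far as I understand the Ragel guide, error actions also run at end of input when the current state is not final, which would fit the reported message at line 2 if every line is parsed with eof set.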


Solution

  • You must split the logic into two parts instead of one.

    For example (not tested):

    signal_repr_single_line = (space* BARE_STRING%getSglEncTyp space* ':' space* BARE_STRING%getSgnlName (space* ',' space* BARE_STRING%getSgnlName)* space* ';' space*)%/getSgnlRepr $!parse_error;
    signal_repr_multi_line = (space* BARE_STRING%getSglEncTyp space* ':' (space* BARE_STRING%getSgnlName(space* ','{1} space*)));
    
    signal_repr = signal_repr_single_line | signal_repr_multi_line;