I want to use flex to transform a string based on simple rules. I have rules like the first character stays the same and the second and third characters might change. Like if the second character was a letter, it becomes the number listed in the rules below. If the third is a digit, it becomes a certain letter.
%%
/*^[a-z] {char *yycopy = strdup( yytext ); unput(yycopy[0]);}*/
[ajs] {putchar('1');}
[bkt] {putchar('2');}
[clu] {putchar('3');}
[dmv] {putchar('4');}
[1] {putchar('j');}
[2] {putchar('k');}
[3] {putchar('l');}
[4] {putchar('m');} /*more number rules till 9*/
%%
int yywrap(void){return 1;}
int main( int argc, char **argv )
{
++argv, --argc; /* skip over program name */
if ( argc > 0 )
yyin = fopen( argv[0], "r" );
else
yyin = stdin;
while (yylex());
}
If there are different rules for characters in different positions within the string, how can I use start conditions to change a particular character (i.e. the rules for the second and third character are different).
You switch start condition by using the BEGIN action. Flex never automatically changes start condition, so you when you need to return to the initial start condition (called INITIAL
), you have to do so explicitly (BEGIN(INITIAL)
).
You need to declare start condition names in the (f)lex prologue, usually with the %x
command. (%s
is also possible but with different semantics. See the Flex manual for details.)
You indicate that a start condition applies to a rule by starting the rule with a start condition name in angle brackets. You can put more than one start condition inside the angle brackets; separate them with commas and don't use spaces. Don't put a space after the angle brackets either; they are part of the pattern and (f)lex patterns cannot include unquoted space characters.
BEGIN
is a macro and it does not require parentheses around the start condition name, but I suggest always using them anyway, so you don't have to worry about what the macro expands to. Start condition names are small integers (either enum
constants or preprocessor macros) but nothing guarantees their value, so don't make assumptions.
That's about it. So you could implement your astro numerological codifier with:
%x SECOND THIRD REST
%%
[a-z] ECHO; BEGIN(SECOND);
<SECOND>[ajs] putchar('1'); BEGIN(THIRD);
/* More SECOND rules */
<THIRD>1 putchar('j'); BEGIN(REST);
/* More THIRD rules */
<*>.*\n? ECHO; BEGIN(INITIAL);
(I deliberately did not add any <REST>
rules beacause the fallback at the end covers it. I also deliberately left out the anchor in the first rule because my rules guarantee that the INITIAL
start condition is 9nly in force at the beginning of a line. See the last rule. The last rule specifies an optional newline in case the file does not end with a newline, which occasionally happens although it's technically invalid.)