I'm thinking of using Ragel to generate a lexer for NMEA GPS data in an embedded system. I would have an arbitrary-sized buffer into which I'd read blocks of data from a UART, and for each read I'd pass that data into the lexer.
I'd like to be able to extract particular fields, but the problem is that I have no guarantee that an entire field is present in a block of data. Any field might be split across two reads, so if I mark a field with start and end pointers, the start pointer can be left pointing into the previous block (which has since been overwritten) while the end pointer points into the current block, possibly at a lower address than the start.
One solution that springs to mind is to put a '$' (all-transitions) action on each field that copies the characters one at a time into separate storage (probably a struct field). Is that the best approach?
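To make the idea concrete, here is a minimal hand-written C sketch of character-by-character copying. It is not Ragel output and not a full NMEA parser: it only counts commas to find the latitude field, and it assumes every sentence it sees is an RMC sentence. The names `lat_parser`, `lat_feed` and `lat_demo` are made up for illustration. The point is that all state lives in the struct rather than in pointers into the receive buffer, so a field split across two reads is reassembled correctly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state. Nothing here points into the receive
   buffer, so it stays valid when the buffer is overwritten. */
struct lat_parser {
    int    in_sentence;   /* nonzero from '$' until end of line */
    int    field;         /* index of the comma-separated field we are in */
    char   lat[16];       /* NUL-terminated copy of the latitude field */
    size_t len;           /* characters stored in lat so far */
};

/* Feed one block of UART data. The stream may be split anywhere. */
static void lat_feed(struct lat_parser *p, const char *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        char c = data[i];
        if (c == '$') {                       /* start of a new sentence */
            p->in_sentence = 1;
            p->field = 0;
            p->len = 0;
            p->lat[0] = '\0';
        } else if (!p->in_sentence) {
            continue;                         /* noise between sentences */
        } else if (c == '\r' || c == '\n') {
            p->in_sentence = 0;               /* sentence finished */
        } else if (c == ',') {
            p->field++;                       /* next field */
        } else if (p->field == 3 && p->len + 1 < sizeof p->lat) {
            p->lat[p->len++] = c;             /* copy one character */
            p->lat[p->len] = '\0';            /* keep the copy terminated */
        }
    }
}

/* Usage: the latitude "4807.038" is cut in half by the block boundary. */
static const char *lat_demo(void)
{
    static struct lat_parser p;               /* zero-initialised */
    const char *block1 = "$GPRMC,123519.00,A,4807";
    const char *block2 = ".038,N,01131.000,E\r\n";
    lat_feed(&p, block1, strlen(block1));
    lat_feed(&p, block2, strlen(block2));
    return p.lat;
}
```

The per-character copy costs one store per byte, which is usually negligible at UART speeds, and it removes any dependence on the lifetime of the receive buffer.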
For what it's worth, I ended up with this:
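%%{
    machine nmea;

    # wptr (a char *) and loc are declared in the host C code, not shown here.
    # store overwrites a single-character field, append copies one character
    # and advances, term NUL-terminates the copied string.
    action store { *wptr = fc; }
    action append { *wptr++ = fc; }
    action term { *wptr++ = 0; }

    integer = digit+;
    float = digit+ '.' digit+;

    # '>' runs on entering a field, '$' on every character of it,
    # '%' on leaving it.
    rmc = '$GPRMC,'
        float ','
        [AV] >{ wptr = &loc.valid; } $store ','
        float? >{ wptr = loc.lat; } $append %term ','
        [NS]? >{ wptr = &loc.ns; } $store ','
        float? >{ wptr = loc.lng; } $append %term ','
        [EW]? >{ wptr = &loc.ew; } $store
        print*
        '\n' >{ printf("%c, %s, %c, %s, %c\n", loc.valid, loc.lat, loc.ns, loc.lng, loc.ew); }
        ;

    main := any* rmc;
}%%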
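One thing the `append` and `term` actions above do not do is check how much room is left in the destination, so an unexpectedly long field would write past the end of `loc.lat` or `loc.lng`. A possible guard, sketched in plain C under the assumption that those fields are fixed-size `char` arrays (the definition of `loc` is not shown, and `field_writer`, `fw_begin`, `fw_append` and `fw_demo` are hypothetical names):

```c
#include <stddef.h>

/* Hypothetical bounded writer: end points one past the last usable byte. */
struct field_writer {
    char *wptr;
    char *end;
};

/* Start writing into buf, which holds size bytes. */
static void fw_begin(struct field_writer *w, char *buf, size_t size)
{
    w->wptr = buf;
    w->end  = buf + size;
    if (size > 0)
        buf[0] = '\0';       /* an empty field is already terminated */
}

/* Append one character and keep the string NUL-terminated.
   Returns 0 and drops the character if the buffer is full. */
static int fw_append(struct field_writer *w, char c)
{
    if (w->end - w->wptr < 2)
        return 0;            /* no room for c plus the terminator */
    *w->wptr++ = c;
    *w->wptr   = '\0';
    return 1;
}

/* Usage: a 5-byte buffer keeps only the first 4 characters. */
static const char *fw_demo(void)
{
    static char buf[5];
    struct field_writer w;
    const char *field = "4807.038";
    fw_begin(&w, buf, sizeof buf);
    for (size_t i = 0; field[i] != '\0'; i++)
        fw_append(&w, field[i]);
    return buf;
}
```

With a helper like this, the entering action would call `fw_begin` and the `$` action would call `fw_append` with `fc`. Because the string is terminated after every character, a separate terminating action would no longer be needed.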