algorithm, language-agnostic, implementation, fsm

Is using a finite state machine a good design for general text parsing?


I am reading a file filled with hex numbers, and I have to identify a particular pattern in it, say "aaad" (without quotes). Every time I see the pattern, I write some data to another file.

This seems like a very common task when designing programs: parsing input and looking for a particular pattern.

I have designed it as a finite state machine (FSM) and structured it in C, using a switch-case to change states. This was the first implementation that occurred to me.

  • DESIGN: Are there better designs possible?
  • IMPLEMENTATION: Do you see any problems with using a switch-case as described?
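
For illustration only, a switch-based FSM for the pattern "aaad" might look roughly like the sketch below. This is not the asker's actual code: the function name `count_aaad`, the state names, and the choice to scan an in-memory string and count matches (rather than read a file and write output) are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* One state per matched prefix of the pattern "aaad" (hypothetical names). */
enum state { S_START, S_A, S_AA, S_AAA };

/* Count occurrences of "aaad" in text using a switch-based FSM. */
size_t count_aaad(const char *text)
{
    enum state st = S_START;
    size_t matches = 0;

    assert(text != NULL);

    for (const char *p = text; *p != '\0'; ++p) {
        char c = *p;

        switch (st) {
        case S_START:
            st = (c == 'a') ? S_A : S_START;
            break;
        case S_A:
            st = (c == 'a') ? S_AA : S_START;
            break;
        case S_AA:
            st = (c == 'a') ? S_AAA : S_START;
            break;
        case S_AAA:
            if (c == 'd') {
                ++matches;       /* full pattern seen */
                st = S_START;
            } else if (c == 'a') {
                st = S_AAA;      /* "aaaa" still ends in "aaa" */
            } else {
                st = S_START;
            }
            break;
        }
    }
    return matches;
}
```

With this sketch, `count_aaad("00aaad11aaaad22aad")` returns 2: the run "aaaad" still matches because the `S_AAA` state stays put on an extra 'a', while "aad" does not. That extra transition is the kind of detail that is easy to miss in a hand-written FSM.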

Solution

  • A hand-rolled FSM can work well for simple situations, but it tends to get unwieldy as the number of states and inputs grows.

    There is probably no reason to change what you have already designed and implemented, but if you are interested in general-purpose text parsing techniques, you should probably look at regular expressions, lexer generators such as Flex, and parser generators such as Bison and ANTLR.
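
As one possible illustration of the regular-expression route, the sketch below counts matches of a pattern using the POSIX `<regex.h>` functions `regcomp`, `regexec`, and `regfree`. It assumes a POSIX system (this header is not part of ISO C), and the helper name `count_matches` is hypothetical. It is meant for simple unanchored patterns such as "aaad", not as a general-purpose matcher.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Count non-overlapping matches of a POSIX extended regex in text.
 * Returns -1 if the pattern fails to compile. (Hypothetical helper.) */
long count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    long count = 0;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    const char *p = text;
    while (*p != '\0' && regexec(&re, p, 1, &m, 0) == 0) {
        ++count;
        /* Resume after the match; advance at least one character so an
         * empty match cannot loop forever. */
        p += (m.rm_eo > 0) ? m.rm_eo : 1;
    }

    regfree(&re);
    return count;
}
```

Here `count_matches("aaad", "00aaad11aaaad22aad")` returns 2, the same result a hand-written FSM for this pattern would give. Changing the pattern then means changing a string rather than rewriting states and transitions.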