regex, haskell, finite-automata

Can I save regex state for next input?


I am parsing TCP packets whose data can be split across several packets. I can't store the packets, so once I have processed the first packet I can't read it again. I need to know whether my pattern occurs in the data. For example, if the first packet contains "hello wo" and the second contains "rld!", I want to detect that "world" occurs in the combined sequence.

As a simple example, I have two files and can search each of them separately:

-- file: Seq.hs
import Text.Regex.TDFA
import System.Environment (getArgs)

-- Search each file on its own; a match split across the two files is missed.
main :: IO ()
main = do
    args <- getArgs
    inpStr1 <- readFile (args !! 0)
    print (inpStr1 =~ "foo" :: Bool)
    inpStr2 <- readFile (args !! 1)
    print (inpStr2 =~ "foo" :: Bool)

Can I save the state of the finite automaton after processing inpStr1 and continue the search with inpStr2?


Solution

  • Instead of regular expressions, I'd suggest using attoparsec. It's fast, robust, and accepts input incrementally:

    A fast parser combinator library, aimed particularly at dealing efficiently with network protocols and complicated text/binary file formats.

    Regular expressions get ugly quickly. In Haskell in particular, a typed parser combinator library makes things much clearer.

    There is also the network-attoparsec package:

    Utility functions for running a parser against a socket, without the need of a bigger framework such as Pipes or Conduit.
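    To make the incremental part concrete, here is a minimal sketch of how the "hello wo" / "rld!" case from the question might be handled. It assumes the packets arrive as strict ByteStrings; findPattern and searchChunks are names I made up for illustration, not part of attoparsec. Only parse, feed, string, anyChar and the IResult constructors come from the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import qualified Data.ByteString.Char8 as B
import Data.Attoparsec.ByteString.Char8
  (IResult (..), Parser, anyChar, feed, parse, string)

-- Hypothetical helper: succeed as soon as the pattern appears anywhere
-- in the stream. Try the pattern at the current position; if that
-- fails, skip one character and try again.
findPattern :: B.ByteString -> Parser B.ByteString
findPattern pat = string pat <|> (anyChar *> findPattern pat)

-- Hypothetical helper: feed the chunks one at a time, carrying the
-- parser state (an IResult) from one chunk to the next.
-- Empty chunks are skipped, because feeding an empty ByteString to a
-- Partial result tells attoparsec that the input has ended.
searchChunks :: B.ByteString -> [B.ByteString] -> Bool
searchChunks pat chunks =
    found (foldl feed start (filter (not . B.null) chunks))
  where
    -- The initial state simply waits for the first chunk.
    start = Partial (parse (findPattern pat))
    found (Done _ _) = True
    found _          = False

main :: IO ()
main = do
    -- The pattern is split across two "packets".
    print (searchChunks "world" ["hello wo", "rld!"])
    -- The pattern never occurs.
    print (searchChunks "world" ["hello wo", "rd!"])
```

    In a real program the IResult value would be kept between socket reads instead of folding over a list. One caveat, as far as I understand attoparsec: to support backtracking it holds on to the input it has been fed until the parser finishes, so this carries the match state across packets but is not guaranteed to run in constant memory.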