regex, algorithm, flex-lexer, finite-automata, nfa

How does flex match the beginning of line anchor?


I've always wondered how the beginning-of-line anchor (^) is converted to an FSA in flex. I know that the end-of-line anchor ($) is matched by the expression r/\n, where r is the expression to match. How is the beginning-of-line anchor matched? The only solution I see is to use start conditions. How can it be implemented in a program?


Solution

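  • The end-of-line anchor $ is not the same as a literal \n: in many regex engines it also matches at EOF, even when the file does not end with an end-of-line marker (\n or \r\n). Note that flex's own manual defines r$ as equivalent to r/\n, so in flex the anchor does require a following newline.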

    I have not looked at flex's implementation, but I would implement both ^ and $ using boolean flags. The ^ flag would start out set, be cleared once the first character of a line has been consumed, and be set again after the next end-of-line marker, and so on. A rule anchored with ^ is then only allowed to match while the flag is set.
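The flag idea for ^ can be sketched in a few lines. This is only an illustration of the approach described above, not flex's actual code: the rule table `RULES`, the token names, and the `tokenize` function are all hypothetical, and Python's `re` module stands in for the generated automaton. Only the ^ flag is shown; $ is left out.

```python
import re

# Hypothetical rule table: (token name, pattern, rule is anchored with ^).
# The first entry stands in for a flex rule written as ^#[a-z]+
RULES = [
    ("DIRECTIVE", re.compile(r"#[a-z]+"), True),
    ("WORD", re.compile(r"[A-Za-z]+"), False),
    ("NEWLINE", re.compile(r"\n"), False),
    ("OTHER", re.compile(r"."), False),
]


def tokenize(text):
    """Longest-match scanner that tracks a beginning-of-line flag."""
    tokens = []
    pos = 0
    at_bol = True  # the flag starts out set: input begins at a line start
    while pos < len(text):
        best = None
        for name, pattern, needs_bol in RULES:
            # An anchored rule may only compete while the flag is set.
            if needs_bol and not at_bol:
                continue
            m = pattern.match(text, pos)
            # Keep the longest match; on ties the earlier rule wins.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        name, end = best  # NEWLINE and OTHER together cover every character
        tokens.append((name, text[pos:end]))
        # Set the flag again only if the token just consumed ended a line.
        at_bol = text[end - 1] == "\n"
        pos = end
    return tokens


if __name__ == "__main__":
    for token in tokenize("#inc a#b\n#x"):
        print(token)
```

In this sketch the `#` in the middle of the first line is scanned as `OTHER`, while the `#x` after the newline is scanned as `DIRECTIVE`, because the flag is set again once `\n` has been consumed. For comparison, the flex manual documents the macros `yy_set_bol()` and `YY_AT_BOL()`, which suggests that generated scanners keep a similar beginning-of-line flag.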