I have a lexer written with PLY. It has two states: string (the default INITIAL state) and macro. A macro is a special expression enclosed in curly braces. The lexer is very simple:
states = (
    ('macro', 'exclusive'),
)

t_STRING = r'[^{]+'  # one or more characters other than '{'

def t_lcurlybrace(t):
    r'\{'
    t.lexer.begin('macro')  # switch to the macro state

# ... some other tokens for macro state

def t_macro_rcurlybrace(t):
    r'\}'
    t.lexer.begin('INITIAL')  # back to the default state
So basically it works like this:
Two plus two is {2 + 2}
The lexer produces the tokens STRING, NUMBER, OPERATOR, NUMBER for this line.
But I'm stuck on error handling. If the input is
Two plus two is {2 + 2
the lexer produces the same stream of tokens as before. The only difference is the state of the lexer at the end (macro, not INITIAL).
I want to raise an error in this case, but I can't find any built-in hook in lex for it. My current idea is to put the lexer in a wrapper that checks the lexer's state once all input has been consumed.
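The wrapper idea could be sketched roughly as follows. This is only an illustration: the name `iter_tokens_checked` and the choice of `ValueError` are made up here, and `lexer` is assumed to be an object built by `ply.lex.lex()`, of which only the `input()`, `token()` and `current_state()` methods are used.

```python
def iter_tokens_checked(lexer, text, expected_state='INITIAL'):
    """Yield tokens from `lexer`; once the input is exhausted, raise
    ValueError if the lexer did not end in `expected_state`.

    Hypothetical helper: `lexer` is assumed to behave like a PLY lexer.
    """
    lexer.input(text)
    while True:
        tok = lexer.token()
        if tok is None:  # PLY returns None when the input is consumed
            break
        yield tok
    final_state = lexer.current_state()
    if final_state != expected_state:
        raise ValueError(
            'Unbalanced brackets: input ended in state %r' % final_state)
```

Because this is a generator, the check only runs when the caller consumes the tokens to the end, for example with `list(iter_tokens_checked(lexer, text))`.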
UPDATE:
I tried to use t_eof like this:
def t_eof(t):
    if t.lexer.current_state() != 'INITIAL':
        raise Exception('Unbalanced brackets')
but it didn't work.
UPDATE2:
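t_eof must be defined as t_macro_eof, because EOF is reached while the lexer is in the "macro" state, and an exclusive state only applies rules defined specifically for that state. So it can be done like this:
def t_macro_eof(t):
    raise Exception('Unbalanced brackets')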
You can check the lexer state in the EOF handler.
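Putting the pieces together, here is a minimal self-contained sketch of such a lexer with a `t_macro_eof` handler. The macro rules were not shown in the question, so the `tokens` tuple, the `NUMBER` and `OPERATOR` rules, the ignore and error rules, and the `tokenize` helper are all assumptions added for illustration. It also assumes a PLY version that supports EOF rules.

```python
import ply.lex as lex

# Token names are assumed from the token stream described in the question.
tokens = ('STRING', 'NUMBER', 'OPERATOR')

states = (
    ('macro', 'exclusive'),
)

t_STRING = r'[^{]+'  # plain text outside of macros

def t_lcurlybrace(t):
    r'\{'
    t.lexer.begin('macro')  # no return value, so the brace is discarded

# Hypothetical macro-state rules, for illustration only.
t_macro_OPERATOR = r'[-+*/]'
t_macro_ignore = ' \t'

def t_macro_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_macro_rcurlybrace(t):
    r'\}'
    t.lexer.begin('INITIAL')

def t_macro_eof(t):
    # Called only when the input ends while the lexer is in the macro state.
    raise Exception('Unbalanced brackets')

def t_error(t):
    raise Exception('Illegal character %r' % t.value[0])

def t_macro_error(t):
    raise Exception('Illegal character %r in macro' % t.value[0])

def tokenize(text):
    """Hypothetical helper: return the token types found in `text`."""
    lexer = lex.lex()  # a fresh lexer, so it always starts in INITIAL
    lexer.input(text)
    return [tok.type for tok in lexer]
```

With these assumed rules, `tokenize('Two plus two is {2 + 2}')` yields the STRING, NUMBER, OPERATOR, NUMBER sequence from the question, while `tokenize('Two plus two is {2 + 2')` raises the exception from `t_macro_eof`.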