How to raise an error for unbalanced braces when parsing with PLY?


I have a lexer written with PLY. It has two states: the default one, which matches plain strings, and macro, which handles a special expression enclosed in curly braces. The lexer is very simple:

states = (
    ('macro', 'exclusive'),
)

t_STRING = r'[^{]+'   # one or more characters other than '{'

def t_lcurlybrace(t):
    r'\{'
    t.lexer.begin('macro')     # enter the macro state; no token is returned

# ... some other tokens for macro state

def t_macro_rcurlybrace(t):
    r'\}'
    t.lexer.begin('INITIAL')   # return to the default state

So basically it works like this:

Two plus two is {2 + 2}

For this line, the lexer produces the tokens STRING, NUMBER, OPERATOR, NUMBER.

But I'm stuck on error handling. If the input is

Two plus two is {2 + 2

the lexer produces the same stream of tokens as before. The only difference is the state of the lexer at the end (macro instead of INITIAL).

I want to raise an error in such a case, but I can't find any built-in hook in lex for this. My current guess is to wrap the lexer in a wrapper that checks the lexer's state once all input is consumed.
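A minimal sketch of that wrapper idea, assuming only that the lexer object exposes PLY's input(), token() and current_state() methods. The names checked_tokens and UnbalancedBraces are hypothetical and not part of PLY.

```python
class UnbalancedBraces(Exception):
    """Hypothetical error type; not part of PLY."""


def checked_tokens(lexer, text, initial_state='INITIAL'):
    """Yield every token for text, then verify the lexer's final state."""
    lexer.input(text)
    while True:
        tok = lexer.token()
        if tok is None:          # PLY returns None once the input is exhausted
            break
        yield tok
    # All input is consumed; a leftover state means a brace was never closed.
    state = lexer.current_state()
    if state != initial_state:
        raise UnbalancedBraces('input ended in state %r' % state)
```

Because this is a generator, the check only runs after the caller has consumed every token, for example with list(checked_tokens(lexer, text)).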

UPDATE:

I tried to use t_eof like this:

def t_eof(t):
    if t.lexer.current_state() != 'INITIAL':
        raise Exception('Unbalanced brackets')

but it didn't work.

UPDATE2:

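The handler must be defined as t_macro_eof, because EOF is reached while the lexer is in the "macro" state. That state is exclusive, so rules defined for INITIAL, including t_eof, are not active in it. It can be done like this: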

def t_macro_eof(t):
    raise Exception('Unbalanced brackets')

Solution

  • You can check the state in the EOF handler
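
For reference, here is a self-contained sketch of a lexer with a state-specific EOF handler. It assumes a PLY version that supports EOF rules. The NUMBER and OPERATOR rules are stand-ins for the macro tokens the question leaves out, and UnbalancedBraces and tokenize are hypothetical names used only for illustration.

```python
import ply.lex as lex

tokens = ('STRING', 'NUMBER', 'OPERATOR')

states = (
    ('macro', 'exclusive'),
)


class UnbalancedBraces(Exception):
    """Hypothetical error type; not part of PLY."""


# --- INITIAL state ---
t_STRING = r'[^{]+'          # a run of characters other than '{'


def t_lcurlybrace(t):
    r'\{'
    t.lexer.begin('macro')   # nothing is returned, so no token is emitted


def t_error(t):
    raise SyntaxError('Illegal character %r' % t.value[0])


# --- macro state (these token rules are assumed stand-ins) ---
t_macro_ignore = ' \t'
t_macro_NUMBER = r'\d+'
t_macro_OPERATOR = r'[-+*/]'


def t_macro_rcurlybrace(t):
    r'\}'
    t.lexer.begin('INITIAL')


def t_macro_error(t):
    raise SyntaxError('Illegal character %r in macro' % t.value[0])


def t_macro_eof(t):
    # Called only when the input ends while the lexer is in 'macro'.
    raise UnbalancedBraces('Unbalanced braces: input ended inside a macro')


lexer = lex.lex()


def tokenize(text):
    """Return (type, value) pairs; raise UnbalancedBraces on a missing '}'."""
    lexer.begin('INITIAL')   # reset in case an earlier run stopped inside a macro
    lexer.input(text)
    return [(tok.type, tok.value) for tok in lexer]
```

With these rules, tokenize('Two plus two is {2 + 2}') yields one STRING followed by NUMBER, OPERATOR, NUMBER, while the same input without the closing brace raises UnbalancedBraces. No state check is needed inside t_macro_eof, since that handler is only reachable from the macro state.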