REJECT equivalent in ply

What is the flex REJECT equivalent in ply? In my code I want ply to produce both LETTER and WORD tokens for the same text, but only LETTER tokens are detected.

import ply.lex as lex
from ply.lex import TOKEN


tokens = (
    'LETTER',
    'WORD'
)


@TOKEN(r'[a-zA-Z]')
def t_LETTER(t):
    print('L')
    return t


@TOKEN(rf'{t_LETTER}*')
def t_WORD(t):
    print('W')
    return t


# Error handling rule

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
lexer = lex.lex()

# Test it out

# Give the lexer some input
while True:
    lexer.input(input())

    # Tokenize
    while True:
        tok = lexer.token()
        if not tok:
            break      # No more input
        print(tok)

When I execute the code for the input av, the output is:

L
LexToken(LETTER,'a',1,0)
L
LexToken(LETTER,'v',1,1)

But I want the WORD token to be detected as well. In flex I have REJECT for this, but in ply I couldn't find an alternative yet.

Solution

There is no equivalent to REJECT in Ply. But that's not why your program doesn't recognise WORD tokens. They aren't recognised because when Python expands rf'{t_LETTER}*', it does not produce '[a-zA-Z]*': the value of t_LETTER is a function, not a string, so what gets interpolated is the function's repr.
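To make the expansion problem concrete, here is a small sketch using only the standard library. The name LETTER for the shared pattern string is my own choice, not something Ply requires, and the t_LETTER stub only stands in for the rule function from the question.

```python
import re


def t_LETTER(t):
    # Stand-in for the rule function from the question
    return t


# Interpolating the function inserts its repr, not a regular expression
broken = rf'{t_LETTER}*'

# Keeping the pattern in a plain string (hypothetical name LETTER)
# lets several rules share it
LETTER = r'[a-zA-Z]'
fixed = rf'{LETTER}*'

print(broken)   # something like "<function t_LETTER at 0x...>*"
print(fixed)
print(re.fullmatch(fixed, 'av') is not None)
```

The same string can then be passed to @TOKEN for both rules. As far as I know, Ply also refuses a rule whose pattern can match the empty string, so a real WORD rule would probably want + rather than *.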

Using REJECT in the WORD action in (f)lex might not be what you're looking for either, but in any case REJECT is an extremely inefficient operation and is not recommended for modern code. Flex would tokenise abcd as

WORD abcd
WORD abc
WORD ab
WORD a
LETTER a
WORD bcd
WORD bc
WORD b
LETTER b
WORD cd
WORD c
LETTER c
WORD d
LETTER d

Maybe that's what you expect, but it seems a bit odd to me. In both Ply and flex, you can achieve similar results by combining two things: pushing characters back into the input stream (using yyless or unput in flex, or modifying the lexer's lexpos attribute in Ply, which is t.lexer.lexpos inside a rule function), and changing the lexer state using start conditions.
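A minimal sketch of that idea in Ply follows. It assumes you want each word reported once, followed by its individual letters, which is only one possible arrangement and not what REJECT would produce. The state name letters, the word_end attribute stored on the lexer, and the tokenize helper are all names I made up for illustration.

```python
import ply.lex as lex
from ply.lex import TOKEN

tokens = ('LETTER', 'WORD')

# Hypothetical exclusive state used while a word is rescanned letter by letter
states = (('letters', 'exclusive'),)

LETTER = r'[a-zA-Z]'   # plain string, so it can be interpolated safely


@TOKEN(rf'{LETTER}+')
def t_WORD(t):
    # Remember where the word ends, then push the whole match back
    t.lexer.word_end = t.lexer.lexpos
    t.lexer.lexpos = t.lexpos
    t.lexer.begin('letters')
    return t


@TOKEN(LETTER)
def t_letters_LETTER(t):
    # Leave the rescanning state once the end of the word is reached
    if t.lexer.lexpos >= t.lexer.word_end:
        t.lexer.begin('INITIAL')
    return t


t_ignore = ' \t\n'
t_letters_ignore = ''


def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)


def t_letters_error(t):
    t_error(t)


lexer = lex.lex()


def tokenize(text):
    # Hypothetical helper: collect (type, value) pairs for one input string
    lexer.begin('INITIAL')
    lexer.input(text)
    return [(tok.type, tok.value) for tok in lexer]


if __name__ == '__main__':
    print(tokenize('av'))
```

For the input av this should yield WORD 'av' followed by LETTER 'a' and LETTER 'v'. Because the letters state is exclusive, only the LETTER rule is active while the word is being rescanned, so the order in which the two rules are defined doesn't matter here.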