Tags: python-3.x, parsing, ply

Getting a string output using rply throws ParserGeneratorError: Expecting :


I am trying to implement a parser using rply, which works much like PLY.

# Input = ['ABC']
from rply import LexerGenerator

lg = LexerGenerator()
lg.add('String', r'\D')

l = lg.build()
for token in l.lex('ABC'):
  print(token)

The code above builds the lexer.

from rply.token import BaseBox

class String(BaseBox):
  def __init__(self, value):
    self.value = value
  def eval(self):
    return self.value


from rply import ParserGenerator
pg = ParserGenerator(
    # A list of all token names accepted by the parser.
    ['String']
)
@pg.production('program: String')
def program(p):
    return p[0].value

parser = pg.build()  # should parse the string 'ABC'

ParserGeneratorError: Expecting :

I'm confused, since this isn't even in the documentation. Please reply. I want my output to be the string 'ABC'.


Solution

  • Your code indicates that you tested the lexer. However, the test indicates that the lexer is not producing the correct tokens:

    >>> for token in l.lex('ABC'):
    ...   print(token)
    ... 
    Token('String', 'A')
    Token('String', 'B')
    Token('String', 'C')
    

    The expected output would have been

    Token('String', 'ABC')
    

    The reason you're splitting on individual characters is that your pattern for recognising String only matches a single character:

    lg.add('String', r'\D')
    

    Probably you wanted something more like

    lg.add('String', r'\D+')
    

    But note that \D matches anything which is not a decimal digit, including whitespace, punctuation and control characters. Perhaps that is what you wanted, but it seems a bit too permissive to me.
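
The difference between the two patterns, and how permissive \D is, can be checked with Python's re module alone. This is only a minimal sketch that does not use rply; the sample string 'AB 12 cd!' and the stricter [A-Za-z]+ alternative are illustrative assumptions, not part of the original code.

```python
import re

# \D matches exactly one non-digit character, so 'ABC' yields three matches.
single = re.findall(r'\D', 'ABC')

# \D+ matches a run of one or more non-digit characters.
run = re.findall(r'\D+', 'ABC')

# \D also accepts spaces and punctuation, not just letters.
permissive = re.findall(r'\D+', 'AB 12 cd!')

# Hypothetical stricter pattern: ASCII letters only.
letters = re.findall(r'[A-Za-z]+', 'AB 12 cd!')

print(single)      # ['A', 'B', 'C']
print(run)         # ['ABC']
print(permissive)  # ['AB ', ' cd!']
print(letters)     # ['AB', 'cd']
```

If only letters should count as a String, a character class like the last one may be a better fit than \D+.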


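    Separately from the lexer problem, you'll have to fix this as well: as far as I know, RPLY (like Ply) requires you to write grammar rules with spaces around the colon. Without the space, 'program:' is read as a single word and no standalone ':' is found, which appears to be what triggers the ParserGeneratorError: Expecting : you are seeing. So your parser function needs to be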

    @pg.production('program : String')
    def program(p):
        return p[0].value