I've used pyparsing in the past, but only for small tasks. This time I'm trying to use it for something more complicated.
I'm trying to skip over a VHDL architecture block, which looks like this:
architecture Behav of Counter is
...many statements I'm not interested in at this point...
end architecture;
Here's what I tried:
import pyparsing as pp
pp_identifier = pp.Regex(r'([a-zA-Z_][\w]*)')('identifier')
def Keyword(matchString):
    'VHDL keywords are caseless and consist only of alphas'
    return pp.Keyword(matchString, identChars=pp.alphas, caseless=True)
pp_architecture = (
    Keyword('architecture')
    + pp_identifier
    + Keyword('of').suppress()
    + pp_identifier
    + Keyword('is').suppress()
    + Keyword('end')
    + Keyword('architecture')
)
print(pp_architecture.parseString('''
architecture beh of sram is end architecture
''', parseAll=True))
# this works as I expected, it prints
# ['architecture', 'beh', 'sram', 'end', 'architecture']
But after changing pp_architecture to use SkipTo, it fails:
pp_architecture = (
    Keyword('architecture')
    + pp_identifier
    + Keyword('of').suppress()
    + pp_identifier
    + Keyword('is').suppress()
    + pp.SkipTo(Keyword('end') + Keyword('architecture'))
)
print(pp_architecture.parseString('''
architecture beh of sram is end architecture
''', parseAll=True))
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "C:\Python27\lib\site-packages\pyparsing.py", line 1125, in parseString
    raise exc
pyparsing.ParseException: Expected end of text (at char 29), (line:2, col:29)
I also tried adding other text between is and end (which I expect to be skipped) to make sure the problem isn't an "empty skip", but that didn't help either.
What am I doing wrong?
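SkipTo skips to the matching text, but by default does not parse that text. So you are advancing the parse location to 'end architecture', but not actually parsing it. Because you pass parseAll=True, the unparsed 'end architecture' left in the input is what triggers "Expected end of text".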
You can either:
- add Keyword('end') + Keyword('architecture') after your SkipTo expression, or
- use include=True in the constructor of your SkipTo expression, telling it to skip to and parse the given strings.
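
Here is a minimal sketch of both options, reusing the definitions from the question. The names header, end_marker, option_1, and option_2 and the sample VHDL statement are only illustrative, and the exact form of the skipped text in the results may differ between pyparsing versions.

```python
import pyparsing as pp

pp_identifier = pp.Regex(r'[a-zA-Z_]\w*')

def Keyword(matchString):
    'VHDL keywords are caseless and consist only of alphas'
    return pp.Keyword(matchString, identChars=pp.alphas, caseless=True)

# Shared prefix: "architecture <name> of <entity> is"
header = (
    Keyword('architecture')
    + pp_identifier
    + Keyword('of').suppress()
    + pp_identifier
    + Keyword('is').suppress()
)

# The text that terminates the block (illustrative name)
end_marker = Keyword('end') + Keyword('architecture')

# Option 1: skip up to the end marker, then parse it explicitly
option_1 = header + pp.SkipTo(end_marker) + end_marker

# Option 2: let SkipTo consume the end marker itself
option_2 = header + pp.SkipTo(end_marker, include=True)

# Made-up block body, just to have something to skip
text = "architecture beh of sram is signal x : bit; end architecture"

r1 = option_1.parseString(text, parseAll=True)
r2 = option_2.parseString(text, parseAll=True)
print(r1)
print(r2)
```

In both cases the parser now consumes 'end architecture', so parseAll=True no longer complains about leftover input.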