Tags: python, python-3.x, string, text-parsing, pyparsing

String between two markers using PyParsing


I need to obtain a string between two markers using PyParsing.

From the string s = 'qwertyAAA1234ZZZazerty' I want to retrieve the string between AAA and ZZZ, which is 1234.

So far I'm able to do it using searchString(). Is it possible to obtain the same result with parseString()?

Using PyParsing's searchString()

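import pyparsing as pp

s = 'qwertyAAA1234ZZZazerty'
# nestedExpr matches the content enclosed by the opener 'AAA' and the closer 'ZZZ'
rule = pp.nestedExpr('AAA', 'ZZZ')
# searchString scans the whole input and yields every match it finds
for match in rule.searchString(s):
    print(match)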

which yields:

[['1234']]

Using PyParsing's parseString()

import pyparsing as pp

s = 'gfgfdAAA1234ZZZuijjk'
rule = pp.nestedExpr('AAA', 'ZZZ')
# parseString must match at the start of the input, so this raises ParseException
print(rule.parseString(s))

which yields:

Traceback (most recent call last):
  File "main.py", line 14, in <module>
    print(rule.parseString(s))
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1939, in parseString
    raise exc
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1929, in parseString
    loc, tokens = self._parse(instring, 0)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1669, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 4430, in parseImpl
    return self.expr._parse(instring, loc, doActions, callPreParse=False)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1669, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 4430, in parseImpl
    return self.expr._parse(instring, loc, doActions, callPreParse=False)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1669, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 4020, in parseImpl
    loc, resultlist = self.exprs[0]._parse(instring, loc, doActions, callPreParse=False)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1673, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 4430, in parseImpl
    return self.expr._parse(instring, loc, doActions, callPreParse=False)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 1673, in _parseNoCache
    loc, tokens = self.parseImpl(instring, preloc, doActions)
  File "/home/runner/.local/share/virtualenvs/python3/lib/python3.7/site-packages/pyparsing.py", line 2871, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected "AAA", found 'g'  (at char 0), (line:1, col:1)
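
The exception message ("Expected "AAA", found 'g' (at char 0)") suggests that parseString() tries to match the expression at the very start of the input, while searchString() scans for matches anywhere in it. A minimal sketch of that difference, where try_parse is a hypothetical helper introduced only for illustration:

```python
import pyparsing as pp

rule = pp.nestedExpr('AAA', 'ZZZ')


def try_parse(text):
    """Return the parsed tokens as a list, or None if the match fails.

    try_parse is an illustrative helper, not part of pyparsing.
    """
    try:
        return rule.parseString(text).asList()
    except pp.ParseException:
        return None


# Input starts with the opening marker: parseString can match at char 0.
print(try_parse('AAA1234ZZZuijjk'))

# Input has leading text before the marker: parseString fails at char 0.
print(try_parse('gfgfdAAA1234ZZZuijjk'))
```

Trailing text after ZZZ does not cause a failure here because parseString() is called without parseAll=True.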

Solution

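  • Try SkipTo, or its shortcut, a literal ... (Python's Ellipsis) placed before the expression, which skips ahead to the first place the expression matches: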

    print((... + rule).parseString(s).dump())
    

    Gives:

    ['gfgfd', ['1234']]
    - _skipped: ['gfgfd']
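
If only the text between the markers is wanted, the same idea can be spelled out with explicit SkipTo expressions instead of nestedExpr. This is a sketch under assumptions: between_markers and the results name 'between' are illustrative names, not part of pyparsing, and the markers are assumed to be plain, non-nested literals.

```python
import pyparsing as pp


def between_markers(text, start, end):
    """Return the text found between the first `start` and the next `end`.

    Illustrative helper; assumes both markers occur in `text` and are
    not nested.
    """
    start_marker = pp.Literal(start)
    end_marker = pp.Literal(end)

    rule = (
        pp.SkipTo(start_marker).suppress()   # discard everything before the start marker
        + start_marker.suppress()            # consume the start marker itself
        + pp.SkipTo(end_marker)('between')   # capture up to the end marker
        + end_marker.suppress()              # consume the end marker
    )
    return rule.parseString(text)['between']


print(between_markers('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))
```

Unlike the nestedExpr version, this returns a flat string rather than a nested list, at the cost of not handling nested AAA ... ZZZ pairs.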