python-3.x, pyparsing

PyParsing: using SkipTo(), labeled data and possibly a Forward()


I am trying to parse an input file with the following format:

file = """Begin 'big section header'
          #... section contents ...
          sub 1: value
          sub 2: value
          ....
          Begin 'interior section header'
          ....
          End 'interior section header'

        End 'big section header'"""

I want the parser to return a list that greedily grabs everything between a Begin and the End carrying the same section header:

['section header', ['section contents']]

My current attempt looks like this:

import pyparsing as pp

begin = pp.Keyword('Begin')
header = pp.Word(pp.alphanums+'_')
end = pp.Keyword('End')
content = begin.suppress() + header + pp.SkipTo(end + header)

content.searchString(file).asList()

This returns

['section header', ['section contents terminated at the first end and generic header found']]

I suspect my grammar needs to be changed to some form of

begin = pp.Keyword('Begin')
header = pp.Word(pp.alphanums+'_')
placeholder = pp.Forward()
end = pp.Keyword('End')

placeholder << begin.suppress() + header
content =  placeholder + pp.SkipTo(end + header)

but I can't for the life of me figure out an assignment to the Forward object that gives me anything other than what I already have.


Solution

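  • Even easier than Forward in this case would be to use matchPreviousLiteral:

    content = begin.suppress() + header + pp.SkipTo(end + pp.matchPreviousLiteral(header))
    

    Your SkipTo stops at any End followed by a header, but what you want is the End whose header matches the one on the preceding Begin. matchPreviousLiteral(header) builds an expression that matches only the exact text header matched most recently.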
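
A minimal runnable sketch of this approach follows. It rests on a few assumptions that are not in the question: the sample text is shortened and stored in a variable named text, and header is defined as a Regex for a single-quoted string, because Word(pp.alphanums+'_') cannot match the quotes or spaces in the sample headers. The quotes are kept in the header token on purpose, so that the literal copied by matchPreviousLiteral matches the quoted text after End.

```python
import pyparsing as pp

# Shortened stand-in for the input in the question (assumed sample data).
text = """Begin 'big section header'
  sub 1: value
  sub 2: value
  Begin 'interior section header'
  inner: value
  End 'interior section header'
End 'big section header'"""

begin = pp.Keyword("Begin")
end = pp.Keyword("End")

# Assumption: headers are single-quoted strings. The token keeps its quotes,
# so the copied literal matches the header exactly as it appears after End.
header = pp.Regex(r"'[^']*'")

# Skip ahead to the End whose header repeats the one just matched after Begin.
content = (
    begin.suppress()
    + header
    + pp.SkipTo(end + pp.matchPreviousLiteral(header))
)

result = content.searchString(text).asList()

for name, body in result:
    print(name)
    print(body.strip())
```

With this sample, the scan should yield a single match: the outer header, followed by everything up to (but not including) the final End line, interior section included. SkipTo does not consume the closing End and header, so append them to content if they should be part of the match.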