Is it possible for nestedExpr to preserve newlines?
Here is a simple example:
import pyparsing as pp
# Parse expressions like: \name{body}
name = pp.Word( pp.alphas )
body = pp.nestedExpr( '{', '}' )
expr = '\\' + name('name') + body('body')
# Example text to parse (raw string, so the backslashes are kept literally)
txt = r'''
This \works{fine}, but \it{
does not
preserve newlines
}
'''
# Show results
for e in expr.searchString(txt):
    print('name: ' + e.name)
    print('body: ' + str(e.body) + '\n')
Output:
name: works
body: [['fine']]
name: it
body: [['does', 'not', 'preserve', 'newlines']]
As you can see, the body of the second expression (\it{ ...) is parsed despite the newlines in the body, but I would have expected the result to store each line in a separate subarray. This result makes it impossible to distinguish body contents with single vs. multiple lines.
I didn't get to look at your answer until just a few minutes ago, and I had already come up with this approach:
body = pp.nestedExpr( '{', '}', content = (pp.LineEnd() | name.setWhitespaceChars(' ')))
Changing body to this definition gives these results:
name: works
body: [['fine']]
name: it
body: [['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']]
EDIT:
Wait, if what you want are the separate lines, then perhaps this is more what you are looking for:
single_line = pp.OneOrMore(name.setWhitespaceChars(' ')).setParseAction(' '.join)
multi_line = pp.OneOrMore(pp.Optional(single_line) + pp.LineEnd().suppress())
body = pp.nestedExpr( '{', '}', content = multi_line | single_line )
Which gives:
name: works
body: [['fine']]
name: it
body: [['does not', 'preserve newlines']]
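If you would rather keep the earlier LineEnd-based definition and split the lines afterwards, the '\n' tokens it leaves in the body can act as separators. Here is a rough sketch in plain Python. The helper name split_lines is made up for illustration and is not part of pyparsing, and the token list is copied by hand from the output shown earlier rather than produced by the parser:

```python
from itertools import groupby


def split_lines(tokens):
    """Group a flat token list into one sublist per line.

    Newline tokens act as separators and are dropped; runs of
    consecutive newlines (blank lines) produce no empty sublists.
    """
    return [
        list(group)
        for is_newline, group in groupby(tokens, key=lambda tok: tok == '\n')
        if not is_newline
    ]


# Token list copied from the output of the LineEnd-based body definition
body = ['\n', 'does', 'not', '\n', 'preserve', 'newlines', '\n']
lines = split_lines(body)
print(lines)           # [['does', 'not'], ['preserve', 'newlines']]
print(len(lines) > 1)  # True, so this body spans multiple lines
```

This keeps the individual words as separate tokens, whereas the ' '.join parse action collapses each line into a single string. Which one fits better depends on what you do with the body afterwards.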