In pyparsing I'm looking for a simple way to match words (or other expressions) that occur on the same line, i.e. without any newline in between them.
You can override the default whitespace-skipping characters for a particular parser element. In this case, word_on_the_same_line skips only spaces, not newlines, so it stops matching at the end of the line.
import pyparsing as pp
word = pp.Word(pp.alphas, pp.alphanums)
# word() makes a copy of word, so the original keeps the default whitespace;
# the copy skips only spaces, so newlines aren't skipped when matching
# a word_on_the_same_line
word_on_the_same_line = word().setWhitespaceChars(" ")
# compare results with this version of word_on_the_same_line to see
# how pyparsing treats newlines as skippable whitespace
# word_on_the_same_line = word()
line = pp.Group(word("key") + word_on_the_same_line[...]("values"))
test = """\
key1 lsdkjf lskdjf lskjdf sldkjf
key2 sdlkjf lskdj lkjss lsdj
"""
print(line[...].parseString(test).dump())
Prints:
[['key1', 'lsdkjf', 'lskdjf', 'lskjdf', 'sldkjf'], ['key2', 'sdlkjf', 'lskdj', 'lkjss', 'lsdj']]
[0]:
  ['key1', 'lsdkjf', 'lskdjf', 'lskjdf', 'sldkjf']
  - key: 'key1'
  - values: ['lsdkjf', 'lskdjf', 'lskjdf', 'sldkjf']
[1]:
  ['key2', 'sdlkjf', 'lskdj', 'lkjss', 'lsdj']
  - key: 'key2'
  - values: ['sdlkjf', 'lskdj', 'lkjss', 'lsdj']
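To see the difference side by side, here is a minimal sketch that runs the same grammar twice, once with the space-only copy and once with the default whitespace. The helper build_line and the shortened test string are made up for this illustration; everything else uses the same calls as the code above.

```python
import pyparsing as pp

word = pp.Word(pp.alphas, pp.alphanums)

# Copy of word that skips only spaces, so a newline ends the run of values.
same_line_word = word().setWhitespaceChars(" ")


def build_line(value_expr):
    # Hypothetical helper: one key followed by zero or more values.
    return pp.Group(word("key") + value_expr[...]("values"))


test = "key1 a b c\nkey2 d e\n"

# Values restricted to the same line: one group per input line.
restricted = build_line(same_line_word)[...].parseString(test)

# Default whitespace: newlines are skipped, so the first group
# keeps consuming words across lines, including "key2".
default = build_line(word())[...].parseString(test)

for group in restricted:
    print(group["key"], list(group["values"]))

print(len(restricted), len(default))
```

With the default whitespace the second key is swallowed as just another value of the first group, which is the behavior the space-only copy avoids. The values are read with group["values"] rather than group.values, because ParseResults also has a values() method that attribute access would pick up instead.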