Here is a mock version of the code I am trying. I have many more cases of what a SingleValue
is, plus other constructions, but this is the part I am failing to represent:
import pyparsing as pp
SingleValue = pp.Word(pp.alphas) # Single values are strings of letters
ListOfValues = pp.Forward() # To be defined later
Values = SingleValue ^ ('{' + ListOfValues + '}') # Values are either words or braces-enclosed lists
ListOfValues <<= pp.delimited_list(Values, delim=' ') # The lists inside the braces are space-separated values.
print(ListOfValues.parse_string('{aaa bbb {ccc ddd}}'))
The comments indicate what I expect each line to represent. The intention is that the string in the example is a valid ListOfValues: a brace-enclosed, space-separated list of either words or further lists.
The sample code gives
pyparsing.exceptions.ParseException: Expected {W:(A-Za-z) ^ {'{' Forward: None '}'}}, found 'bbb' (at char 5), (line:1, col:6)
I also tried
Values = SingleValue ^ ListOfValues
ListOfValues <<= '{' + pp.delimited_list(Values, delim=' ') + '}'
This gives
pyparsing.exceptions.ParseException: Expected '}', found 'bbb' (at char 5), (line:1, col:6)
How should I define this grammar?
It looks like the following works in this case:
SingleValue = pp.Word(pp.alphas) # Single values are strings of letters
Values = pp.Forward()
ListOfValues = '{' + pp.delimited_list(Values[...], delim=' ') + '}' # The lists inside the braces are space-separated values.
Values <<= SingleValue | ListOfValues # Values are either words or braces-enclosed lists
print(ListOfValues.parse_string('{aaa bbb {ccc ddd}}'))
This gives
['{', 'aaa', 'bbb', '{', 'ccc', 'ddd', '}', '}']
However, I still don't understand why the first version in the question is not correct.
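By default, pyparsing skips leading whitespace before it tries to match each expression (see 1.1.2 Usage notes). That means Literal(' ') won't match, because the space it is looking for has already been skipped, and delimited_list will stop parsing after the first value.
For matching whitespace without skipping it, there is pp.White: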
ListOfValues <<= '{' + pp.delimited_list(Values, delim=pp.White(' ')) + '}'
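The difference between the two delimiters can be seen in isolation. The following is a minimal sketch, assuming pyparsing 3.x; the names word, with_literal and with_white are made up for this illustration and do not appear in the question.

```python
import pyparsing as pp

word = pp.Word(pp.alphas)

# Literal(' ') skips leading whitespace before it tries to match, so by the
# time it looks for a space the parser is already positioned at 'bbb'.
with_literal = word + pp.Literal(' ') + word
try:
    with_literal.parse_string('aaa bbb')
    literal_matched = True
except pp.ParseException:
    literal_matched = False

# White(' ') does not skip spaces; it matches them and returns them as a token.
with_white = word + pp.White(' ') + word
white_tokens = with_white.parse_string('aaa bbb').as_list()

print(literal_matched)  # False
print(white_tokens)     # ['aaa', ' ', 'bbb']
```

Inside delimited_list the delimiter is suppressed, so the space token does not show up in the parsed list.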
You could also use Values[...] instead, although it will accept any amount of whitespace as a delimiter:
ListOfValues <<= '{' + Values[...] + '}'
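For reference, here is a self-contained sketch of the Values[...] variant, assuming pyparsing 3.x. The second grammar is not from the original posts: it wraps each list in pp.Group and drops the braces with pp.Suppress, on the assumption that a nested result is more convenient than a flat one. NestedList and NestedValues are hypothetical names.

```python
import pyparsing as pp

SingleValue = pp.Word(pp.alphas)

# Flat variant: braces are kept as tokens and nesting is not reflected
# in the shape of the result.
ListOfValues = pp.Forward()
Values = SingleValue | ListOfValues
ListOfValues <<= '{' + Values[...] + '}'
flat = ListOfValues.parse_string('{aaa bbb {ccc ddd}}').as_list()

# Nested variant (hypothetical names): suppress the braces and group the
# contents of each list so that inner lists become inner Python lists.
NestedList = pp.Forward()
NestedValues = SingleValue | NestedList
NestedList <<= pp.Group(pp.Suppress('{') + NestedValues[...] + pp.Suppress('}'))
nested = NestedList.parse_string('{aaa bbb {ccc ddd}}').as_list()

print(flat)    # ['{', 'aaa', 'bbb', '{', 'ccc', 'ddd', '}', '}']
print(nested)  # [['aaa', 'bbb', ['ccc', 'ddd']]]
```

Which shape is preferable depends on how the result is consumed afterwards; the flat form matches the output shown earlier in the thread.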