Suppose we have a grammar in which an expected element is not found. Assume that backtracking is disabled, i.e. the element is preceded by a - sign.
Is there any way in the library to set a custom error message, such as "Foo was needed", on that element? Or is the only way to catch the parse exception and work it out using the location information?
Say:
from pyparsing import *
grammar = (
    Literal("//")
    - Word(alphanums)("name")
    + Suppress(White()[1, ...])
    + Word(alphanums)("operation")
).leave_whitespace()
grammar.run_tests("""
//name op
// opnoname
""", print_results=True)
The output for the 2nd line is:
// opnoname
  ^
ParseException: Expected W:(0-9A-Za-z), found ' ' (at char 2), (line:1, col:3)
FAIL: Expected W:(0-9A-Za-z), found ' ' (at char 2), (line:1, col:3)
I'd like a custom message, such as "name was needed", instead of the generic "Expected W:(0-9A-Za-z), found ' '".
So far it looks like catching the ParseException, modifying the message in it, and re-raising it would be a solution. Am I missing something more fundamental?
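To make that idea concrete, here is a minimal sketch of the catch-and-re-raise approach. The CUSTOM_MESSAGES table and the parse_with_custom_errors helper are hypothetical names made up for illustration, not part of pyparsing. The sketch assumes the exception exposes the failing expression as parser_element and that the expression keeps its results name in resultsName, which is how pyparsing 3.0.x appears to behave. It catches ParseBaseException rather than ParseException so that the ParseSyntaxException raised after a - operator is covered too.

```python
from pyparsing import (
    Literal,
    ParseBaseException,
    ParseFatalException,
    Suppress,
    White,
    Word,
    alphanums,
)

grammar = (
    Literal("//")
    - Word(alphanums)("name")
    + Suppress(White()[1, ...])
    + Word(alphanums)("operation")
).leave_whitespace()

# Hypothetical table: results name of the failing element -> friendlier message.
CUSTOM_MESSAGES = {
    "name": "name was needed",
    "operation": "operation was needed",
}


def parse_with_custom_errors(text):
    """Parse text; on failure, swap in a custom message when one is known."""
    try:
        return grammar.parse_string(text, parse_all=True)
    except ParseBaseException as pe:
        # Assumption: parser_element and resultsName are available (pyparsing 3.0.x).
        element = getattr(pe, "parser_element", None)
        results_name = getattr(element, "resultsName", None)
        # Fall back to the original message if no custom one is registered.
        message = CUSTOM_MESSAGES.get(results_name, pe.msg)
        # Re-raise at the same location, keeping the original input string.
        raise ParseFatalException(pe.pstr, pe.loc, message, element) from None


if __name__ == "__main__":
    try:
        parse_with_custom_errors("// opnoname")
    except ParseFatalException as err:
        print(err.msg, "at char", err.loc)
```

The location information (loc, lineno, column) is carried over unchanged, so only the message text differs from what pyparsing would report on its own.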
For those curious: this came up when writing a JCL (Job Control Language) parser.
I'm using the latest stable pyparsing version at the moment: 3.0.9.
They aren't custom error messages, but you can use set_name to give nice names to the different elements of your grammar.
identifier = Word(alphanums).set_name("identifier")
grammar = (
    Literal("//")
    - identifier("name")
    + White().suppress()
    + identifier("operation")
).leave_whitespace()
grammar.set_name("grammar")
Note the difference between set_name and set_results_name. set_name gives a name to the expression itself, while set_results_name (which you implicitly call when using the expr("name") notation) is what assigns names to the parsed results.
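A small sketch of that distinction, using only a single identifier expression. The variable names identifier and named are chosen here for illustration.

```python
from pyparsing import ParseException, Word, alphanums

# set_name labels the expression itself; the label is used in error messages.
identifier = Word(alphanums).set_name("identifier")

# Calling the expression is shorthand for set_results_name; it labels the
# matched token in the parse results and returns a copy of the expression.
named = identifier("name")

result = named.parse_string("abc")
print(result["name"])  # the matched text, looked up by its results name

try:
    named.parse_string("!!!")
except ParseException as pe:
    print(pe.msg)  # the message refers to the expression name, not the results name
```

The expression name and the results name are independent, so the same identifier expression can be reused under several results names (for example "name" and "operation") while still reporting itself as identifier on failure.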
It is easier to see the distinction if you generate a railroad diagram and add show_results_names=True:
grammar.create_diagram("grammar.html", show_results_names=True)
I've always preferred to let pyparsing deal with whitespace when I can, and not show it explicitly in my grammar. If you want to enforce no spaces between the leading '//' and the name identifier, you can write it like this:
grammar = (
    Literal("//")
    - identifier("name").leave_whitespace()
    + identifier("operation")
)
Pyparsing's implicit whitespace skipping will take care of the spaces between name and operation. leave_whitespace on the name alone tells pyparsing not to skip whitespace before parsing the name identifier. But you may have other plans for further parts of this grammar, so I'll leave it up to you which way to go on this.
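As a quick check of that behavior, the sketch below parses two inputs with this grammar and reports whether each one succeeded. The check helper is a hypothetical convenience function written for this example. It catches ParseBaseException because the - operator raises ParseSyntaxException, which is not a subclass of ParseException.

```python
from pyparsing import Literal, ParseBaseException, Word, alphanums

identifier = Word(alphanums).set_name("identifier")

grammar = (
    Literal("//")
    # leave_whitespace here only: no spaces are allowed right after '//'
    - identifier("name").leave_whitespace()
    # default whitespace skipping still applies before the operation
    + identifier("operation")
)


def check(text):
    """Return (True, tokens) on success or (False, message) on failure."""
    try:
        return True, grammar.parse_string(text, parse_all=True).as_list()
    except ParseBaseException as pe:
        return False, pe.msg


if __name__ == "__main__":
    print(check("//name    op"))  # several spaces between name and operation
    print(check("// name op"))    # space right after '//'
```

The first input is accepted because whitespace before the operation is skipped implicitly, while the second is rejected at the space that follows '//'.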
I'm glad to see you are using run_tests! Here is a tip: you can insert comments in your tests, and they will show up as labels for each test in your output, like this:
grammar.run_tests("""\
# successful expression
//name op
# more than one space between name and operation - still works!
//name    op
# failing expression, missing second identifier
// opnoname
# failing expression, name but no operation
//namenoopn
# failing expression, space after '//'
// name op
""")
and get this output:
# successful expression
//name op
['//', 'name', 'op']
- name: 'name'
- operation: 'op'
# more than one space between name and operation - still works!
//name    op
['//', 'name', 'op']
- name: 'name'
- operation: 'op'
# failing expression, missing second identifier
// opnoname
// opnoname
  ^
ParseSyntaxException: Expected identifier, found ' ' (at char 2), (line:1, col:3)
FAIL: Expected identifier, found ' ' (at char 2), (line:1, col:3)
# failing expression, name but no operation
//namenoopn
//namenoopn
           ^
ParseSyntaxException: Expected identifier, found end of text (at char 11), (line:1, col:12)
FAIL: Expected identifier, found end of text (at char 11), (line:1, col:12)
# failing expression, space after '//'
// name op
// name op
  ^
ParseSyntaxException: Expected identifier, found ' ' (at char 2), (line:1, col:3)
FAIL: Expected identifier, found ' ' (at char 2), (line:1, col:3)
You can also insert blank lines between tests for readability, just like you would insert blank lines in your Python code.
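If you want to use these tests in an automated check rather than read the printed report, run_tests also returns a (success, results) tuple. The sketch below is one possible way to use it, with comments and blank lines in the test strings. It keeps the passing and failing inputs in separate calls and uses failure_tests=True for the second one, so that success there means every input failed as expected. The variable names are chosen here for illustration.

```python
from pyparsing import Literal, Word, alphanums

identifier = Word(alphanums).set_name("identifier")
grammar = (
    Literal("//")
    - identifier("name").leave_whitespace()
    + identifier("operation")
)

# print_results=False suppresses the report; only the return value is used.
passed, results = grammar.run_tests(
    """\
    # successful expression
    //name op

    # more than one space between name and operation
    //name    op
    """,
    print_results=False,
)

# With failure_tests=True the first return value is True only if every test fails.
all_failed, failures = grammar.run_tests(
    """\
    # space after '//'
    // name op

    # name but no operation
    //namenoopn
    """,
    failure_tests=True,
    print_results=False,
)

if __name__ == "__main__":
    print(passed, len(results))
    print(all_failed, len(failures))
```

Comment lines and blank lines are not counted as tests, so each results list has one entry per input line.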