In my very deep PyParsing grammar (132 keywords), I've run into some quirky things. It's probably my use of logic. But then again, maybe not.
ISC Bind9 configuration files have clauses (kinda like INI sections):
exactly one options clause (mandatory)
any number of other clauses (ZeroOrMore())
Any attempt to add parser complexity to the mandatory options clause broke the above logic.
I've had to peel away the non-impacting parser logic until it started working, then had to rock the code back and forth until I arrived at the exact breakage caused by the introduction of this pyparsing code:
print("Using 'example1' as a Word() to inside 'options{ };':")
clauses_mandatory_complex = (
Keyword('options')
+ Literal('{')
+ Word('[a-zA-Z0-9]')
+ Literal(';')
+ Literal('}')
+ Literal(';')
)
As a standalone ParserElement, this clauses_mandatory_complex works just fine.
Until I attempted to introduce the clause logic:
# Exactly one parse_element ('options' clause)
# and any number of other clauses
clauses_all_and = (
    clauses_mandatory_complex
    & ZeroOrMore(clauses_zero_or_more)
)
And its clause logic starts failing.
If I take out the Word(), like this:
print("Using 'example1' as a Literal() to inside 'options{ };':")
clauses_mandatory_simple = (
Keyword('options')
+ Literal('{')
+ Literal('example1')
+ Literal(';')
+ Literal('}')
+ Literal(';')
)
My clause logic starts working again as expected.
This is too strange for me, so I posted it here.
Below is a working standalone test program that demonstrates the differences described above:
#!/usr/bin/env python3
from pyparsing import ZeroOrMore, Word, Keyword, Literal
from pprint import PrettyPrinter
pp = PrettyPrinter(width=81, indent=4)
clauses_zero_or_more = (
(Keyword('acl') + ';')
| (Keyword('server') + ';')
| (Keyword('view') + ';')
| (Keyword('zone') + ';')
)
def test_me(parse_element, test_data, fail_assert):
    # Exactly one parse_element ('options' clause)
    # and any number of other clauses
    clauses_all_and = (
        parse_element
        & ZeroOrMore(clauses_zero_or_more)
    )
    result = clauses_all_and.runTests(test_data, parseAll=True, printResults=True,
                                      failureTests=fail_assert)
    pp.pprint(result)
    return result
def print_all_results(pass_result, fail_result):
    print("Purposely passed test: {}. ".format(pass_result[0]))
    print("Purposely failed test: {}. ".format(fail_result[0]))
    print('\n')
passing_test_data = """
options { example1; };
acl; options { example1; };
options { example1; }; acl;
options { example1; }; server;
server; options { example1; };
acl; options { example1; }; server;
acl; server; options { example1; };
options { example1; }; acl; server;
options { example1; }; server; acl;
server; acl; options { example1; };
server; options { example1; }; acl;
"""
failing_test_data = """
acl;
acl; acl;
server; acl;
server;
acl; server;
options { example1; }; options { example1; };
"""
print("Using 'example1' as a Literal() to inside 'options{ };':")
clauses_mandatory_simple = (
Keyword('options')
+ Literal('{')
+ Literal('example1')
+ Literal(';')
+ Literal('}')
+ Literal(';')
)
pass_result = test_me(clauses_mandatory_simple, passing_test_data, False)
fail_result = test_me(clauses_mandatory_simple, failing_test_data, True)
print_all_results(pass_result, fail_result)
# Attempting to introduce more qualifiers to 'options' fails
print("Using 'example1' as a Word() to inside 'options{ };':")
clauses_mandatory_complex = (
Keyword('options')
+ Literal('{')
+ Word('[a-zA-Z0-9]')
+ Literal(';')
+ Literal('}')
+ Literal(';')
)
pass_result = test_me(clauses_mandatory_complex, passing_test_data, False)
fail_result = test_me(clauses_mandatory_complex, failing_test_data, True)
print_all_results(pass_result, fail_result)
The output of the test run is given below:
/work/python/parsing/isc_config2/how-bad.py
Using 'example1' as a Literal() to inside 'options{ };':
options { example1; };
['options', '{', 'example1', ';', '}', ';']
acl; options { example1; };
['acl', ';', 'options', '{', 'example1', ';', '}', ';']
options { example1; }; acl;
['options', '{', 'example1', ';', '}', ';', 'acl', ';']
options { example1; }; server;
['options', '{', 'example1', ';', '}', ';', 'server', ';']
server; options { example1; };
['server', ';', 'options', '{', 'example1', ';', '}', ';']
acl; options { example1; }; server;
['acl', ';', 'options', '{', 'example1', ';', '}', ';', 'server', ';']
acl; server; options { example1; };
['acl', ';', 'server', ';', 'options', '{', 'example1', ';', '}', ';']
options { example1; }; acl; server;
['options', '{', 'example1', ';', '}', ';', 'acl', ';', 'server', ';']
options { example1; }; server; acl;
['options', '{', 'example1', ';', '}', ';', 'server', ';', 'acl', ';']
server; acl; options { example1; };
['server', ';', 'acl', ';', 'options', '{', 'example1', ';', '}', ';']
server; options { example1; }; acl;
['server', ';', 'options', '{', 'example1', ';', '}', ';', 'acl', ';']
( True,
[ ( 'options { example1; };',
(['options', '{', 'example1', ';', '}', ';'], {})),
( 'acl; options { example1; };',
(['acl', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
( 'options { example1; }; acl;',
(['options', '{', 'example1', ';', '}', ';', 'acl', ';'], {})),
( 'options { example1; }; server;',
(['options', '{', 'example1', ';', '}', ';', 'server', ';'], {})),
( 'server; options { example1; };',
(['server', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
( 'acl; options { example1; }; server;',
(['acl', ';', 'options', '{', 'example1', ';', '}', ';', 'server', ';'], {})),
( 'acl; server; options { example1; };',
(['acl', ';', 'server', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
( 'options { example1; }; acl; server;',
(['options', '{', 'example1', ';', '}', ';', 'acl', ';', 'server', ';'], {})),
( 'options { example1; }; server; acl;',
(['options', '{', 'example1', ';', '}', ';', 'server', ';', 'acl', ';'], {})),
( 'server; acl; options { example1; };',
(['server', ';', 'acl', ';', 'options', '{', 'example1', ';', '}', ';'], {})),
( 'server; options { example1; }; acl;',
(['server', ';', 'options', '{', 'example1', ';', '}', ';', 'acl', ';'], {}))])
acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
acl; acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
server;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
options { example1; }; options { example1; };
                       ^
FAIL: Expected end of text, found 'o' (at char 23), (line:1, col:24)
( True,
[ ( 'acl;',
Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'acl; acl;',
Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'server; acl;',
Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'server;',
Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'acl; server;',
Missing one or more required elements ({"options" "{" "example1" ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'options { example1; }; options { example1; };',
Expected end of text, found 'o' (at char 23), (line:1, col:24))])
Purposely passed test: True.
Purposely failed test: True.
Using 'example1' as a Word() to inside 'options{ };':
/usr/local/lib/python3.7/site-packages/pyparsing.py:3161: FutureWarning: Possible nested set at position 1
self.re = re.compile(self.reString)
options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
acl; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
options { example1; }; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
options { example1; }; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
server; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
acl; options { example1; }; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
acl; server; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
options { example1; }; acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
options { example1; }; server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
server; acl; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
server; options { example1; }; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
( False,
[ ( 'options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)),
( 'acl; options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'options { example1; }; acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)),
( 'options { example1; }; server;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)),
( 'server; options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'acl; options { example1; }; server;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'acl; server; options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'options { example1; }; acl; server;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)),
( 'options { example1; }; server; acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)),
( 'server; acl; options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'server; options { example1; }; acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1))])
acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
acl; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
server; acl;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)
acl; server;
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)
options { example1; }; options { example1; };
^
FAIL: Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1)
( True,
[ ( 'acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'acl; acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'server; acl;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'server;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 's' (at char 0), (line:1, col:1)),
( 'acl; server;',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'a' (at char 0), (line:1, col:1)),
( 'options { example1; }; options { example1; };',
Missing one or more required elements ({"options" "{" W:([a-z...) ";" "}" ";"}), found 'o' (at char 0), (line:1, col:1))])
Purposely passed test: False.
Purposely failed test: True.
EDIT: Found error here:
Word('[a-zA-Z0-9]')
should be:
Word(srange('[a-zA-Z0-9]'))
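For anyone else hitting this: as far as I understand it, Word() takes a plain string of allowed characters rather than a regex-style character class, so Word('[a-zA-Z0-9]') only accepts the nine literal characters in that string. A minimal sketch of the difference (the names wrong, right, also_right and matches are mine, and alphanums is shown only as an equivalent alternative to the srange() call):

```python
from pyparsing import Word, srange, alphanums, ParseException

# Allowed characters here are literally: [ a - z A Z 0 9 ]
wrong = Word('[a-zA-Z0-9]')
# srange() expands the range notation into the full character string
right = Word(srange('[a-zA-Z0-9]'))
# alphanums is pyparsing's predefined letters-plus-digits string
also_right = Word(alphanums)


def matches(expr, text):
    """Return True if expr parses the whole of text, False otherwise."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


print(len(srange('[a-zA-Z0-9]')))        # number of characters after expansion
print(matches(wrong, 'example1'))        # 'e' is not in the literal set
print(matches(right, 'example1'))
print(matches(wrong, 'azAZ09'))          # only built from the literal characters
```

Older pyparsing releases may also print the FutureWarning about a "possible nested set" for the wrong version, as seen in the output above.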
Is there a way to improve the caret '^' positioning of that error so that it points at the test data 'example1' and not at the keyword? That would have saved a lot of time here.
The basic answer to questions like this is usually to replace one or a few '+' operators with '-' operators. '-' tells pyparsing to disable backtracking if an error is found in some subsequent match.
For instance, if you have a keyword in your grammar that is used nowhere else, then you should reasonably expect that any parse errors after that keyword are true errors, and not just mismatched alternatives. Following this keyword with '-' is a good way to get your parser to indicate a specific error location, instead of just flagging that none of a set of higher-level alternatives was a match.
You do have to be careful with '-', and not just replace all instances of '+' with '-', since this would defeat all backtracking, and could keep your parser from matching legitimate alternative expressions.
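As a rough illustration of what '-' changes, here is a small sketch that parses the same bad input with a '+' chain and a '-' chain and reports the exception class and failing offset. The names with_plus, with_minus, failure and bad_input are made up for this example, and I've used Word(alphanums) so that the only error is the deliberately bad '!' in the input:

```python
from pyparsing import Keyword, Literal, Word, alphanums, ParseBaseException

# Ordinary chain: a failure anywhere is a normal, backtrackable ParseException
with_plus = (Keyword('options') + Literal('{') + Word(alphanums)
             + Literal(';') + Literal('}') + Literal(';'))

# Same chain, but '-' after the keyword marks everything that follows as committed
with_minus = (Keyword('options') - Literal('{') + Word(alphanums)
              + Literal(';') + Literal('}') + Literal(';'))


def failure(expr, text):
    """Return (exception class name, failing offset), or None if text parses."""
    try:
        expr.parseString(text, parseAll=True)
    except ParseBaseException as err:
        return type(err).__name__, err.loc
    return None


bad_input = 'options { !bad; };'
print(failure(with_plus, bad_input))
print(failure(with_minus, bad_input))
print(failure(with_minus, 'options { example1; };'))
```

In isolation both versions fail at the same offset; the difference is the exception class, which is what enclosing expressions use to decide whether they may keep trying other alternatives.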
So I was about to post that the following would improve your error messages:
clauses_mandatory_complex = (
Keyword('options')
- Literal('{')
+ Word('[a-zA-Z0-9]')
+ Literal(';')
+ Literal('}')
+ Literal(';')
)
But when I tried it, I didn't really get much better results. In this case, the confounding issue is your use of '&' to get out-of-order Each matching, which, while perfectly legitimate in your parser, mixes up the exception handling (possibly uncovering a bug in pyparsing). If you replace '&' with '+' in your clauses_all_and expression, you'll see the '-' operator at work here:
options { example1; };
          ^(FATAL)
FAIL: Expected W:([a-z...), found 'e' (at char 10), (line:1, col:11)
And in fact, this points to a general debugging tactic with pyparsing: try out sub-expressions in isolation if complex expressions are not giving helpful exception messages.
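A minimal sketch of that tactic, assuming you feed each piece of the 'options' clause its own sample text (the pieces list, the sample strings and the parses helper are hypothetical, written just for this illustration):

```python
from pyparsing import Keyword, Literal, Word, ParseException

# Each sub-expression from the question, paired with text it is supposed to match
pieces = [
    ("Keyword('options')", Keyword('options'), 'options'),
    ("Literal('{')", Literal('{'), '{'),
    ("Word('[a-zA-Z0-9]')", Word('[a-zA-Z0-9]'), 'example1'),
    ("Literal(';')", Literal(';'), ';'),
    ("Literal('}')", Literal('}'), '}'),
]


def parses(expr, sample):
    """Return True if expr matches all of sample on its own."""
    try:
        expr.parseString(sample, parseAll=True)
        return True
    except ParseException:
        return False


results = [parses(expr, sample) for _, expr, sample in pieces]
for (label, _, sample), ok in zip(pieces, results):
    print('{:<22} {}'.format(label, 'ok' if ok else 'FAILS on {!r}'.format(sample)))
```

Run this way, the only piece that fails is the Word(), which narrows the problem down without any of the Each logic in the way.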
Pyparsing does a lot of backtracking and retries when working with a grammar containing MatchFirst or Or expressions ('|' and '^' operators), but even more so when dealing with Each ('&' operator). In your case, when I used the '-' operator, a non-backtracking exception was raised, but Each demoted it to a backtracking one so that it could continue trying other combinations. I will look at this further to see if there is a good way to avoid this demotion.