I want to strip comments that start with a hash (#) from a text file before I run a larger parser over it.
For this I use suppress(), as mentioned here.
pythonStyleComment does not work because it ignores quotation marks and removes text inside quoted strings. A hash inside a quoted string is not a comment; it is part of the string and should therefore be preserved.
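To make the problem concrete, here is a small sketch of what happens with pythonStyleComment. It assumes pyparsing is installed and uses the camelCase names from this question (`pythonStyleComment`, `transformString`); the sample line is made up for illustration:

```python
from pyparsing import pythonStyleComment

# Illustrative input: the hash sits inside a double-quoted string.
line = 'Option "a#b"'

# Suppressing pythonStyleComment drops everything from the hash onward,
# including the rest of the quoted string and its closing quote.
result = pythonStyleComment.suppress().transformString(line)
print(repr(result))
```

The quoted string comes back cut off at the hash, which is exactly the behavior I need to avoid.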
Here is the pytest test I wrote to check the expected behavior:
def test_filter_comment():
    teststrings = [
        '# this is comment', 'Option "sadsadlsad#this is not a comment"'
    ]
    expected = ['', 'Option "sadsadlsad#this is not a comment"']
    for i, teststring in enumerate(teststrings):
        result = filter_comments.transformString(teststring)
        assert result == expected[i]
My current implementation fails somewhere inside pyparsing. I am probably using it in a way that was not intended:
filter_comments = Regex(r"#.*")
filter_comments = filter_comments.suppress()
filter_comments = filter_comments.ignore(QuotedString)
It fails with:
*****/lib/python3.7/site-packages/pyparsing.py:4480: in ignore
    super(ParseElementEnhance, self).ignore(other)
*****/lib/python3.7/site-packages/pyparsing.py:2489: in ignore
    self.ignoreExprs.append(Suppress(other.copy()))
E   TypeError: copy() missing 1 required positional argument: 'self'
Any help on how to ignore comments correctly would be appreciated.
Ah, I was so close. Of course, I have to instantiate the QuotedString class properly. The following works as expected:
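from pyparsing import QuotedString, Regex

# Suppress everything from a hash to the end of the line.
filter_comments = Regex(r"#.*")
filter_comments = filter_comments.suppress()
# Pass QuotedString instances (not the class) so quoted text is skipped.
qs = QuotedString('"') | QuotedString("'")
filter_comments = filter_comments.ignore(qs)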
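The TypeError in the question comes from handing over the class itself, so `other.copy()` is called with no instance to bind as `self`. Here is a minimal sketch of the same mechanism in plain Python; `Element` and `ignore` are hypothetical stand-ins and not pyparsing's actual code:

```python
class Element:
    """Hypothetical stand-in for a parser element such as QuotedString."""

    def copy(self):
        return Element()


def ignore(other):
    """Hypothetical stand-in that copies its argument, like the call in the traceback."""
    return other.copy()


# Passing an instance works: copy() is bound to that instance.
copied = ignore(Element())

# Passing the class fails: copy() has no instance to use as `self`.
try:
    ignore(Element)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

The exact wording of the message differs slightly between Python versions, but it always complains about the missing `self` argument.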
Here are some more tests.
def test_filter_comment():
    teststrings = [
        '# this is comment', 'Option "sadsadlsad#this is not a comment"',
        "Option 'sadsadlsad#this is not a comment'",
        "Option 'sadsadlsad'#this is a comment"
    ]
    expected = [
        '', 'Option "sadsadlsad#this is not a comment"',
        "Option 'sadsadlsad#this is not a comment'",
        "Option 'sadsadlsad'"
    ]
    for i, teststring in enumerate(teststrings):
        result = filter_comments.transformString(teststring)
        assert result == expected[i]