python, pyparsing

How to match any sequence of text until "#{" in pyparsing


I want to do string templating with pyparsing. I have failed multiple times to define a grammar that can parse a string with text inserted between "#{" and "}" (which seems pretty basic).

Example of a grammar that doesn't work:

from pyparsing import CharsNotIn, FollowedBy, Group, Literal, OneOrMore, Suppress
# Define the unwanted sequence
hash_tag = Suppress(Literal('#{')) + CharsNotIn('}') + Suppress(Literal('}'))
text_part = CharsNotIn('#{')
# Combine the parts to match the entire string
parser = OneOrMore(Group(hash_tag) | text_part)
test_string = "Some text \n with a stranded # and a stranded { then a correct #{ insertion of \n some # other text } and then some text"

result = parser.searchString(test_string)
expected_result = ["Some text \n with a stranded # and a stranded { then a correct ",[" insertion of \n some # other text "]," and then some text"]
print(result)

The result is not what I expected (I get ["Some text \n with a stranded "]).


Solution
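
A likely reason the original grammar stops early: CharsNotIn('#{') treats its argument as a set of individual characters, so text_part ends at the first lone "#" or "{" and not at the two-character sequence "#{". A minimal sketch of that behaviour, using a made-up sample string:

```python
import pyparsing as pp

# CharsNotIn takes a set of characters to avoid, not a delimiter string,
# so it stops at the first '#' OR '{' it meets.
text_part = pp.CharsNotIn("#{")

# Illustrative input (not from the question)
sample = "a stranded # and then #{ field }"
matched = text_part.parse_string(sample).as_list()
print(matched)
```

Only the text before the stranded "#" is matched, which is consistent with the truncated result reported in the question.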

  • I would use pyparsing's QuotedString class to define your template field, like this:

    import pyparsing as pp
    
    template_field = pp.QuotedString("#{", end_quote_char="}",
                                     esc_char="\\",
                                     multiline=True,
                                     unquote_results=True)
    test_string = "Some text \n with a stranded # and a stranded { then a correct #{ insertion of \n some # other text } and then some text"
    
    # just search for the template field
    result = template_field.search_string(test_string)
    expected_result = [[" insertion of \n some # other text "]]
    print(result)
    print(expected_result)
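
If the surrounding literal text is needed as well (the structure the question lists as its expected result), one possible approach is to walk the matches with scan_string and slice out the text between them. This is only a sketch: split_template is a hypothetical helper name, not part of pyparsing.

```python
import pyparsing as pp

# Same kind of field as in the answer; parse_with_tabs() keeps the reported
# offsets aligned with the original string if it contains tab characters.
template_field = pp.QuotedString("#{", end_quote_char="}", esc_char="\\",
                                 multiline=True).parse_with_tabs()

def split_template(text):
    """Return literal text as strings and template fields as one-item lists."""
    parts = []
    last_end = 0
    for tokens, start, end in template_field.scan_string(text):
        if start > last_end:
            parts.append(text[last_end:start])  # literal text before the field
        parts.append([tokens[0]])               # the field body, without #{ and }
        last_end = end
    if last_end < len(text):
        parts.append(text[last_end:])           # trailing literal text
    return parts

test_string = "Some text \n with a stranded # and a stranded { then a correct #{ insertion of \n some # other text } and then some text"
parts = split_template(test_string)
print(parts)
```

Stranded "#" and "{" characters stay inside the literal parts because the field only matches at the full "#{" opener.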
    

    Since this is a templating kind of program, transform_string might do much of the search-and-replace work for you, if you can implement the transformation as a parse action:

    # Using transform_string to transform the template_field
    # (add some kind of transformation to the template_field using a parse action)
    template_field.add_parse_action(lambda s, l, t: t[0].upper())
    print(template_field.transform_string(test_string))
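
For actual templating, the parse action could look each field up in a mapping. The sketch below assumes that field bodies are plain keys and that unknown keys should be left untouched; the values dict and the substitute function are illustrative names, not anything from the question.

```python
import pyparsing as pp

# Hypothetical lookup table; keys and values are made up for illustration.
values = {"name": "World", "greeting": "Hello"}

template_field = pp.QuotedString("#{", end_quote_char="}", esc_char="\\",
                                 multiline=True)

def substitute(tokens):
    # tokens[0] is the text between "#{" and "}"
    key = tokens[0].strip()
    if key in values:
        return values[key]
    # Assumption: leave unknown fields as they were written
    return "#{" + tokens[0] + "}"

template_field.add_parse_action(substitute)

template = "#{ greeting }, #{name}! A stranded # and { stay as they are."
rendered = template_field.transform_string(template)
print(rendered)
```

transform_string copies everything outside the matched fields through unchanged, so only the "#{...}" spans are replaced.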