
pyparsing - back to basics


In attempting to put together a very simple example to illustrate an issue I'm having with pyparsing, I have found that I can't get my simple example to work - this example is barely more complex than Hello, World!

Here is my example code:

import pyparsing as pp
import textwrap as tw

text = (
    """
    A) Red
    B) Green
    C) Blue
    """
)

a = pp.AtLineStart("A)") + pp.Word(pp.alphas)
b = pp.AtLineStart("B)") + pp.Word(pp.alphas)
c = pp.AtLineStart("C)") + pp.Word(pp.alphas)

grammar = a + b + c

grammar.run_tests(tw.dedent(text).strip())

I would expect it to return ["A)", "Red", "B)", "Green", "C)", "Blue"] but instead I get:

A) Red
A) Red
      ^
ParseException: not found at line start, found end of text  (at char 6), (line:1, col:7)
FAIL: not found at line start, found end of text  (at char 6), (line:1, col:7)

B) Green
B) Green
^
ParseException: Expected 'A)', found 'B'  (at char 0), (line:1, col:1) 
FAIL: Expected 'A)', found 'B'  (at char 0), (line:1, col:1)

C) Blue
C) Blue
^
ParseException: Expected 'A)', found 'C'  (at char 0), (line:1, col:1)
FAIL: Expected 'A)', found 'C'  (at char 0), (line:1, col:1)

Why would it say it's found end of text after the first line??? Why is it expecting A) after the first line???

(Note: textwrap.dedent() and strip() have no impact on the results of this script.)


Solution

  • My dude! You forgot to wrap the string in a list!

    I tested for 30 minutes and felt something was odd, then found this in the documentation:

    run_tests(tests: Union[str, List[str]], ...) -> ...

    Execute the parse expression on a series of test strings, showing each test, the parsed results or where the parse failed. Quick and easy way to run a parse expression against a list of sample strings.

    Parameters:

    tests - a list of separate test strings, or a multiline string of test strings

    Basically, by doing this:

    grammar.run_tests(tw.dedent(text).strip())
    

    ...you (and me, for the last half hour) were passing a single multiline string, which tells run_tests to treat each line as an individual test!

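    A) Red    # You are test 1, parsed on your own
    B) Green  # test 2 for you,
    C) Blue   # and test 3 for ya! Mhahahah!

    (Blank lines are skipped, and strip() has already removed the leading and trailing ones, so you end up with exactly the three tests shown in your output.)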
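
    To make the splitting concrete, here is a rough stand-in for that behavior. The helper split_into_tests is hypothetical, not part of pyparsing, and it leaves out run_tests's other options such as comment handling; it only mimics how a str argument is broken into lines while a list is taken as-is.

```python
def split_into_tests(tests):
    """Hypothetical stand-in for how run_tests normalizes its argument."""
    if isinstance(tests, str):
        # A bare string is split on newlines: every line becomes its own test.
        tests = [line.strip() for line in tests.rstrip().splitlines()]
    # Empty entries are skipped; list items are otherwise left untouched.
    return [t for t in tests if t]


text = "A) Red\nB) Green\nC) Blue"

as_string = split_into_tests(text)    # three one-line tests
as_list = split_into_tests([text])    # one three-line test

print(as_string)
print(as_list)
```

    With the bare string, the grammar is handed "A) Red" alone, which is why it reports end of text right after the first line.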
    

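    (And of course, add pp.line_end to the first two lines so each one actually consumes its end of line; otherwise the parser is still sitting on the newline and the next AtLineStart fails with "not found at line start".)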

    import pyparsing as pp
    import textwrap as tw

    text = (
        """
        A) Red
        B) Green
        C) Blue
        """
    )

    # line_end consumes the newline, so the next AtLineStart starts at column 1
    a = pp.AtLineStart("A)") + pp.Word(pp.alphas) + pp.line_end.suppress()
    b = pp.AtLineStart("B)") + pp.Word(pp.alphas) + pp.line_end.suppress()
    c = pp.AtLineStart("C)") + pp.Word(pp.alphas)

    grammar = a + b + c
    # Wrapping the string in a list makes it one multi-line test
    grammar.run_tests([tw.dedent(text).strip()])

    Output:

    A) Red
    B) Green
    C) Blue
    ['A)', 'Red', 'B)', 'Green', 'C)', 'Blue']
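
    If the goal is to parse one multi-line string rather than to run a batch of samples, calling parse_string directly sidesteps the line splitting altogether. A minimal sketch, reusing the same grammar and assuming pyparsing 3.x names; parse_all=True is my addition, so that leftover text raises an error instead of being silently ignored.

```python
import textwrap as tw

import pyparsing as pp

text = """
    A) Red
    B) Green
    C) Blue
    """

# Each of the first two lines consumes its own newline via line_end,
# leaving the parser at column 1 for the next AtLineStart.
a = pp.AtLineStart("A)") + pp.Word(pp.alphas) + pp.line_end.suppress()
b = pp.AtLineStart("B)") + pp.Word(pp.alphas) + pp.line_end.suppress()
c = pp.AtLineStart("C)") + pp.Word(pp.alphas)

grammar = a + b + c

# parse_string takes the whole string as one input; nothing is split by line.
result = grammar.parse_string(tw.dedent(text).strip(), parse_all=True)
tokens = result.as_list()
print(tokens)
```

    run_tests is still handy for checking many sample inputs at once; parse_string is the more direct call when there is only one input.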