
Concatenated ternary operators with pyparsing


Using pyparsing, I'd like to be able to parse the following syntax:

1?1:0?1:0

It should be understood as two standard ternary operators (condition ? true_part : false_part) simply concatenated, so that the result of the first becomes the condition of the second.

So far I have the following code (simplified):

import pyparsing as pp

TERNARY_INFIX = pp.infixNotation(
    pp.pyparsing_common.integer, [
        (("?", ":"), 3, pp.opAssoc.LEFT),
])

TERNARY_INFIX.parseString("1?1:0?1:0", parseAll=True)

Which yields:

ParseException: Expected end of text (at char 5), (line:1, col:6)

unless I add parentheses around one of the two ternary expressions; for example, both "(1?1:0)?1:0" and "1?1:(0?1:0)" work.

But how can I make it work without the brackets, basically just reading from left to right, in a strictly left-associative way?

EDIT:

Nice read on how associativity works for ternary operators: Ternary operator left associativity - with the result that left-assoc doesn't make much sense. Yet, the language I'm trying to mimic does in fact treat such expressions from left to right.


Solution

  • I think this operator is actually right-associative, not left. If I change your code to:

    import pyparsing as pp
    
    TERNARY_INFIX = pp.infixNotation(
        pp.pyparsing_common.integer, [
            (("?", ":"), 3, pp.opAssoc.RIGHT),
    ])
    
    TERNARY_INFIX.runTests("""\
    1?1:(0?1:0)
    (1?1:0)?1:0
    1?1:0?1:0
    """, fullDump=False)
    

    Then I get reasonable output, and no error for the input without parens:

    1?1:(0?1:0)
    [[1, '?', 1, ':', [0, '?', 1, ':', 0]]]
    
    (1?1:0)?1:0
    [[[1, '?', 1, ':', 0], '?', 1, ':', 0]]
    
    1?1:0?1:0
    [[1, '?', 1, ':', [0, '?', 1, ':', 0]]]
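
Nested results like the ones above can be evaluated recursively. The following is only a minimal sketch, not part of pyparsing: `eval_nested` is a hypothetical helper name, and it assumes the parse results have been converted to plain nested lists (for example with `asList()`), that the outermost wrapping list has been stripped, and that operands are integers.

```python
def eval_nested(node):
    """Evaluate a ternary parse tree given as nested lists (hypothetical helper)."""
    # A leaf is a plain integer operand.
    if not isinstance(node, list):
        return node
    # A group has the shape [condition, '?', true_part, ':', false_part].
    cond, _question, true_part, _colon, false_part = node
    if eval_nested(cond):
        return eval_nested(true_part)
    return eval_nested(false_part)


# Nested result for "1?1:0?1:0" as shown above, without the outer list.
tree = [1, '?', 1, ':', [0, '?', 1, ':', 0]]
print(eval_nested(tree))  # prints 1
```

Because the grouping is already encoded in the nesting, the same function handles both the parenthesized and unparenthesized inputs.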
    

    Here is a larger expression to evaluate the largest of 3 variables (from this C tutorial: http://cprogramming.language-tutorial.com/2012/01/biggest-of-3-numbers-using-ternary.html):

    TERNARY = pp.infixNotation(
        pp.Char("abc"), [
            (pp.oneOf("> <"), 2, pp.opAssoc.LEFT), 
            (("?", ":"), 3, pp.opAssoc.RIGHT),
        ])
    TERNARY.runTests("""\
    (a > b) ? ((a > c) ? a : c) : ((b > c) ? b : c) 
    a > b ? a > c ? a : c : b > c ? b : c
    """, fullDump=False)
    

    Gives:

    (a > b) ? ((a > c) ? a : c) : ((b > c) ? b : c)
    [[['a', '>', 'b'], '?', [['a', '>', 'c'], '?', 'a', ':', 'c'], ':', [['b', '>', 'c'], '?', 'b', ':', 'c']]]
    
    a > b ? a > c ? a : c : b > c ? b : c
    [[['a', '>', 'b'], '?', [['a', '>', 'c'], '?', 'a', ':', 'c'], ':', [['b', '>', 'c'], '?', 'b', ':', 'c']]]
    

    EDIT: I see now that this is a similar situation to repeated binary operators, like "1 + 2 + 3". When these are left-associative, pyparsing parses them not as [['1' '+' '2'] '+' '3'], but just ['1' '+' '2' '+' '3'], and it is up to the evaluator to do the repetitive left-to-right evaluation.
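
As a rough illustration of that left-to-right evaluation for binary operators, here is a small sketch that folds a flat token list from the left. The names `OPS` and `eval_left_to_right` are made up for this example, and it assumes the operands have already been converted to numbers rather than left as strings.

```python
import operator

# Hypothetical operator table for this sketch; extend as needed.
OPS = {'+': operator.add, '-': operator.sub}


def eval_left_to_right(flat):
    """Fold a flat [operand, op, operand, op, ...] list from the left."""
    result = flat[0]
    # Walk the remaining tokens in (operator, operand) pairs.
    for i in range(1, len(flat), 2):
        result = OPS[flat[i]](result, flat[i + 1])
    return result


print(eval_left_to_right([1, '+', 2, '+', 3]))   # prints 6
print(eval_left_to_right([10, '-', 4, '-', 3]))  # prints 3, i.e. (10 - 4) - 3
```

The subtraction case shows why the order matters: folding from the right would give 10 - (4 - 3) = 9 instead.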

    When I added the ternary operator, I did not envision a chained form such as the one you are parsing. A one-line change to infixNotation will parse your expression successfully with left-associativity, but like the chained binary operators gives an ungrouped result:

    [1, '?', 1, ':', 0, '?', 1, ':', 0]
    

    Like the repeated addition example, it is up to the evaluator to do the successive left-to-right evaluation, something like:

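    def eval_ternary(tokens):
        # infixNotation wraps the whole match in a group, so unwrap it first
        operands = tokens[0]
        # the first operand is the initial condition
        ret = bool(operands[0])
        i = 1
        # each further step is '?', true_part, ':', false_part (4 tokens)
        while i < len(operands):
            ret = bool(operands[i+1]) if ret else bool(operands[i+3])
            i += 4
        return ret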
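
To see how the two readings can differ, here is a small standalone sketch that evaluates the same flat token list both ways. `eval_left` and `eval_right` are hypothetical names for this illustration, they work on plain lists rather than pyparsing results, and they return the selected operand instead of a bool.

```python
def eval_left(flat):
    """Left-to-right: each ternary's result becomes the next one's condition."""
    ret = flat[0]
    for i in range(1, len(flat), 4):
        # flat[i] is '?', flat[i + 2] is ':'
        ret = flat[i + 1] if ret else flat[i + 3]
    return ret


def eval_right(flat):
    """Right-associative: the false part absorbs the rest of the chain."""
    if len(flat) == 1:
        return flat[0]
    cond, true_part = flat[0], flat[2]
    return true_part if cond else eval_right(flat[4:])


# "1?0:1?0:1" is (1?0:1)?0:1 read left to right, but 1?0:(1?0:1) read right to left.
tokens = [1, '?', 0, ':', 1, '?', 0, ':', 1]
print(eval_left(tokens), eval_right(tokens))  # prints 1 0

# The expression from the question happens to give the same value either way.
question = [1, '?', 1, ':', 0, '?', 1, ':', 0]
print(eval_left(question), eval_right(question))  # prints 1 1
```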
    

    If you want to hand-patch your pyparsing code, change:

            elif arity == 3:
                matchExpr = _FB(
                    lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr
                ) + Group(lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr)
    

    to:

            elif arity == 3:
                matchExpr = _FB(
                    lastExpr + opExpr1 + lastExpr + opExpr2 + lastExpr
                ) + Group(lastExpr + OneOrMore(opExpr1 + lastExpr + opExpr2 + lastExpr))
                                     ^^^^^^^^^^
    

    Make this change in pyparsing.py, or copy the definition of infixNotation into your own code and change it there.

    I'll make this change in the next release of pyparsing.

    EDIT - Fixed in pyparsing 2.4.6, just released.