Search code examples
pythonparsinggrammarbnflark-parser

How do I add operator precedence to a lark grammar for FOL with Equality?


How do I modify this grammar so it matches parenthesis that are further away?

?wff: compound_wff
?compound_wff: biconditional_wff
?biconditional_wff: conditional_wff (SPACE? BICONDITIONAL_SYMBOL SPACE? biconditional_wff)*
?conditional_wff: disjunctive_wff (SPACE? CONDITIONAL_SYMBOL SPACE? conditional_wff)*
?disjunctive_wff: conjunctive_wff (SPACE? DISJUNCTION_SYMBOL SPACE? disjunctive_wff)*
?conjunctive_wff: negated_wff (SPACE? CONJUNCTION_SYMBOL SPACE? conjunctive_wff)*
?negated_wff: (NEGATION_SYMBOL SPACE)* atomic_wff
?atomic_wff: predicate
           | term EQUAL_TO term
           | quantified_wff* LEFT_PARENTHESIS SPACE? wff SPACE? RIGHT_PARENTHESIS
?term: function
    | NAME
    | VARIABLE
?predicate: PREDICATE_NAME [LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS]
?function: FUNCTION_NAME LEFT_PARENTHESIS term (COMMA SPACE? term)* RIGHT_PARENTHESIS
?quantified_wff: curly_quantifiers | quantifiers
?curly_quantifiers: quantifier_symbol LEFT_CURLY_BRACE VARIABLE (COMMA SPACE? VARIABLE)* RIGHT_CURLY_BRACE SPACE?
?quantifiers: quantifier_symbol SPACE? VARIABLE (COMMA SPACE? VARIABLE)* SPACE?

SPACE: /\s+/
COMMA: ","
EQUAL_TO: "="
LEFT_PARENTHESIS: "("
RIGHT_PARENTHESIS: ")"
LEFT_CURLY_BRACE: "{"
RIGHT_CURLY_BRACE: "}"
quantifier_symbol: UNIVERSAL_QUANTIFIER_SYMBOL | EXISTENTIAL_QUANTIFIER_SYMBOL
UNIVERSAL_QUANTIFIER_SYMBOL: "\\forall" | "∀"
EXISTENTIAL_QUANTIFIER_SYMBOL: "\\exists" | "∃"
NAME: /[a-t]/ | /[a-t]_[1-9]\d*/
VARIABLE: /[u-z]/ | /[u-z]_[1-9]\d*/
PREDICATE_NAME: /[A-HJ-Z]/ | /[A-HJ-Z]_[1-9]\d*/
FUNCTION_NAME: /[a-z]/ | /[a-z]_[1-9]\d*/
NEGATION_SYMBOL: "\\neg" | "\\lnot" | "¬"
CONJUNCTION_SYMBOL: "\\wedge" | "\\land" | "∧"
DISJUNCTION_SYMBOL: "\\vee" | "\\lor" | "∨"
CONDITIONAL_SYMBOL: "\\rightarrow" | "\\Rightarrow" | "\\Longrightarrow" | "\\implies" | "→" | "⇒"
BICONDITIONAL_SYMBOL: "\\leftrightarrow" | "\\iff" | "↔" | "⇔"

I'm trying to parse this using my grammar:

equation

Which in LaTeX is:

\exists{x} \forall{y} (P(f(x, y)) \vee \forall{z}(V(z) \iff \neg R(a) \wedge B(a)))

I followed the calculator example and modified my original grammar to add operator precedence which resulted in this. But it's no longer accepting the input string.

I'm getting this error:

lark.exceptions.UnexpectedCharacters: No terminal matches '\' in the current parser context, at line 1 col 35

\exists{x} \forall{y} (P(f(x, y)) \vee \forall{z}(V(z) \iff \neg R(a) \wed
                                  ^
Expected one of: 
    * RIGHT_PARENTHESIS

Ideally I want to force requiring parenthesis wherever possible except in the case of in front of a negated atomic_wff without parenthesis. This is to make sure only one parse tree is produced even on explicit ambiguity setting. How do I resolve this issue?

Edit 1

I want to clarify that operator precedence for the same operators should be right associative. So P(a) ∧ Q(a) ∧ R(a) will resolve as P(a) ∧ (Q(a) ∧ R(a))

Edit 2

I have made the lark grammar easier to debug using proper terminals. It now parses the long latex equation, but is still ambiguous for simpler inputs like P(a) ∧ Q(a) ∧ R(a) which produces two parse trees. I still don't know what I am doing wrong. The grammar is doing right recursion and still failing to produce a single parse tree.

Edit 3

My latest attempted solution works in every case except for giving negation a higher precedence. Any idea?

?wff: compound_wff
compound_wff: biconditional_wff
biconditional_wff: conditional_wff (SPACE? BICONDITIONAL_SYMBOL SPACE? conditional_wff)*
conditional_wff: disjunctive_wff (SPACE? CONDITIONAL_SYMBOL SPACE? disjunctive_wff)*
disjunctive_wff: conjunctive_wff (SPACE? DISJUNCTION_SYMBOL SPACE? conjunctive_wff)*
conjunctive_wff: negated_wff (SPACE? CONJUNCTION_SYMBOL SPACE? negated_wff)*
negated_wff: (NEGATION_SYMBOL SPACE?)* atomic_wff
atomic_wff: predicate
          | term EQUAL_TO term
          | quantified_wff* LEFT_PARENTHESIS SPACE? wff SPACE? RIGHT_PARENTHESIS
term: function
    | NAME
    | VARIABLE
predicate: PREDICATE_NAME [LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS]
function: FUNCTION_NAME LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS
quantified_wff: curly_quantifiers | quantifiers
curly_quantifiers: quantifier_symbol LEFT_CURLY_BRACE VARIABLE (COMMA SPACE? VARIABLE)* RIGHT_CURLY_BRACE SPACE?
quantifiers: quantifier_symbol SPACE? VARIABLE (COMMA SPACE? VARIABLE)* SPACE?

SPACE: /\s+/
COMMA: /,\s*/
EQUAL_TO: /\s*=\s*/
LEFT_PARENTHESIS: "("
RIGHT_PARENTHESIS: ")"
LEFT_CURLY_BRACE: "{"
RIGHT_CURLY_BRACE: "}"
quantifier_symbol: UNIVERSAL_QUANTIFIER_SYMBOL | EXISTENTIAL_QUANTIFIER_SYMBOL
UNIVERSAL_QUANTIFIER_SYMBOL: "\\forall" | "∀"
EXISTENTIAL_QUANTIFIER_SYMBOL: "\\exists" | "∃"
NAME: /[a-t]/ | /[a-t]_[1-9]\d*/
VARIABLE: /[u-z]/ | /[u-z]_[1-9]\d*/
PREDICATE_NAME: /[A-HJ-Z]/ | /[A-HJ-Z]_[1-9]\d*/
FUNCTION_NAME: /[a-z]/ | /[a-z]_[1-9]\d*/
NEGATION_SYMBOL: "\\neg" | "\\lnot" | "¬"
CONJUNCTION_SYMBOL: "\\wedge" | "\\land" | "∧"
DISJUNCTION_SYMBOL: "\\vee" | "\\lor" | "∨"
CONDITIONAL_SYMBOL: "\\rightarrow" | "\\Rightarrow" | "\\Longrightarrow" | "\\implies" | "→" | "⇒"
BICONDITIONAL_SYMBOL: "\\leftrightarrow" | "\\iff" | "↔" | "⇔"

%ignore SPACE
%ignore COMMA
%ignore LEFT_CURLY_BRACE
%ignore RIGHT_CURLY_BRACE

So it fails to to produce a single parse tree for ¬P(a) ∧ Q(b).


Solution

  • The issue was each wff rule was not matching the antecedent rule in it's parts. So:

    ?biconditional_wff: conditional_wff (SPACE? BICONDITIONAL_SYMBOL SPACE? biconditional_wff)*
    

    Should be changed to:

    ?biconditional_wff: conditional_wff (SPACE? BICONDITIONAL_SYMBOL SPACE? conditional_wff)*
    

    The solution to this issue is to have a grammar like this:

    ?wff: compound_wff
    compound_wff: biconditional_wff
    biconditional_wff: conditional_wff (SPACE? BICONDITIONAL_SYMBOL SPACE? conditional_wff)*
    conditional_wff: disjunctive_wff (SPACE? CONDITIONAL_SYMBOL SPACE? disjunctive_wff)*
    disjunctive_wff: conjunctive_wff (SPACE? DISJUNCTION_SYMBOL SPACE? conjunctive_wff)*
    conjunctive_wff: negated_wff (SPACE? CONJUNCTION_SYMBOL SPACE? negated_wff)*
    negated_wff: (NEGATION_SYMBOL SPACE?)* atomic_wff
    atomic_wff: predicate
              | term EQUAL_TO term
              | quantified_wff* LEFT_PARENTHESIS SPACE? wff SPACE? RIGHT_PARENTHESIS
    term: function
        | NAME
        | VARIABLE
    predicate: PREDICATE_NAME [LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS]
    function: FUNCTION_NAME LEFT_PARENTHESIS term (COMMA term)* RIGHT_PARENTHESIS
    quantified_wff: curly_quantifiers | quantifiers
    curly_quantifiers: quantifier_symbol LEFT_CURLY_BRACE VARIABLE (COMMA SPACE? VARIABLE)* RIGHT_CURLY_BRACE SPACE?
    quantifiers: quantifier_symbol SPACE? VARIABLE (COMMA SPACE? VARIABLE)* SPACE?
    
    SPACE: /\s+/
    COMMA: /,\s*/
    EQUAL_TO: /\s*=\s*/
    LEFT_PARENTHESIS: "("
    RIGHT_PARENTHESIS: ")"
    LEFT_CURLY_BRACE: "{"
    RIGHT_CURLY_BRACE: "}"
    quantifier_symbol: UNIVERSAL_QUANTIFIER_SYMBOL | EXISTENTIAL_QUANTIFIER_SYMBOL
    UNIVERSAL_QUANTIFIER_SYMBOL: "\\forall" | "∀"
    EXISTENTIAL_QUANTIFIER_SYMBOL: "\\exists" | "∃"
    NAME: /[a-t]/ | /[a-t]_[1-9]\d*/
    VARIABLE: /[u-z]/ | /[u-z]_[1-9]\d*/
    PREDICATE_NAME: /[A-HJ-Z]/ | /[A-HJ-Z]_[1-9]\d*/
    FUNCTION_NAME: /[a-z]/ | /[a-z]_[1-9]\d*/
    NEGATION_SYMBOL: "\\neg" | "\\lnot" | "¬"
    CONJUNCTION_SYMBOL: "\\wedge" | "\\land" | "∧"
    DISJUNCTION_SYMBOL: "\\vee" | "\\lor" | "∨"
    CONDITIONAL_SYMBOL: "\\rightarrow" | "\\Rightarrow" | "\\Longrightarrow" | "\\implies" | "→" | "⇒"
    BICONDITIONAL_SYMBOL: "\\leftrightarrow" | "\\iff" | "↔" | "⇔"
    

    Also remove the %ignore directives.

    %ignore SPACE
    %ignore COMMA
    %ignore LEFT_CURLY_BRACE
    %ignore RIGHT_CURLY_BRACE