Search code examples
pythoncompiler-constructioninterpretercontext-free-grammarply

Resolve shift/reduction conflict in grammar for expressions in PLY for calls to embedded functions


I'm creating a grammar for an interpreter in PLY. At the moment I'm trying to implement embedded function calls as expressions, but I'm having a problem seeing where the conflict is for the following rules.

The situation is as follows, in the language there are certain "native" operations for objects of type vector, as well as certain functions for some data types. For example, for vectors there are the functions pop(), push(), join(), etc.

For example:

vect1.pop()
vect1.join()

So I have the following rules:

expresion :     ID PUNTO POP PAREN_APERTURA PAREN_CIERRE
              | ID PUNTO INDEX_OF PAREN_APERTURA expresion PAREN_CIERRE
              | ID PUNTO JOIN PAREN_APERTURA PAREN_CIERRE

There are also certain functions that apply to certain types of expressions, such as toString().

For example:

console.log(12.5.toString())
var num1 : number = 20;
console.log(num1.toString())

So I added the following rules:

expresion : expresion PUNTO funcion_emb PAREN_APERTURA PAREN_CIERRE

funcion_emb     : TO_STRING
                | TO_UPPER_CASE
                | TO_LOWER_CASE

Additionally, there are other functions that are used to convert values, such as:

parseInt("12")
parseFloat("24.5")

With the following rules:

expresion:      PARSE_INT PAREN_APERTURA expresion PAREN_CIERRE
              | PARSE_FLOAT PAREN_APERTURA expresion PAREN_CIERRE
              | TYPE_OF expresion

The problem is that it generates a warning of a shift/reduce conflict, but it doesn't show me what rule is generating it.

Looking at the grammar, it may be that the following rules are generating the conflict

expresion:      expresion PUNTO funcion_emb PAREN_APERTURA PAREN_CIERRE
              | ID PUNTO POP PAREN_APERTURA PAREN_CIERRE
              | ID PUNTO INDEX_OF PAREN_APERTURA expresion PAREN_CIERRE
              | ID PUNTO JOIN PAREN_APERTURA PAREN_CIERRE
              | PARSE_INT PAREN_APERTURA expresion PAREN_CIERRE
              | PARSE_FLOAT PAREN_APERTURA expresion PAREN_CIERRE
              | TYPE_OF expresion

Here are the corresponding rules for expressions and precedence:

precedence = (
    ('left', 'OR'),
    ('left', 'AND'),
    ('left', 'IGUAL', 'DIFERENTE'),
    ('left', 'MENOR_QUE', 'MAYOR_QUE', 'MENOR_IGUAL', 'MAYOR_IGUAL'),
    ('left', 'MAS', 'MENOS'),
    ('left', 'MULTI', 'DIV', 'MOD'),
    ('right', 'NOT', 'UMENOS'),
    ('nonassoc', 'TYPE_OF'),    
)

expresion :     expresion MAS expresion
              | expresion MENOS expresion
              | expresion MULTI expresion
              | expresion DIV expresion
              | expresion MOD expresion
              | expresion IGUAL expresion
              | expresion DIFERENTE expresion
              | expresion MENOR_QUE expresion
              | expresion MAYOR_QUE expresion
              | expresion MENOR_IGUAL expresion
              | expresion MAYOR_IGUAL expresion
              | expresion AND expresion
              | expresion OR expresion
              | PAREN_APERTURA expresion PAREN_CIERRE
              | MENOS expresion %prec UMENOS
              | NOT expresion              
              | expresion PUNTO funcion_emb PAREN_APERTURA PAREN_CIERRE
              | ID PUNTO POP PAREN_APERTURA PAREN_CIERRE
              | ID PUNTO INDEX_OF PAREN_APERTURA expresion PAREN_CIERRE
              | ID PUNTO JOIN PAREN_APERTURA PAREN_CIERRE
              | PARSE_INT PAREN_APERTURA expresion PAREN_CIERRE
              | PARSE_FLOAT PAREN_APERTURA expresion PAREN_CIERRE
              | TYPE_OF expresion
              | literal
              | ID 


funcion_emb : TO_STRING
                | TO_UPPER_CASE
                | TO_LOWER_CASE

literal : LIT_NUMBER
            | LIT_FLOAT
            | LIT_CHAR
            | LIT_BOOLEAN
            | LIT_STRING

Any suggestions?

UPDATE:

Indeed, there were conflicts due to productions with ID PUNTO. Now, I'm adding new productions to access object properties.

expresion : acceso_atrib

acceso_atrib : acceso_atrib PUNTO expresion

acceso_atrib : expresion PUNTO expresion

And I have specified the precedence and associativity of the POINT operator in the grammar.

precedence = (
    ('left', 'OR'),
    ('left', 'AND'),
    ('left', 'IGUAL', 'DIFERENTE'),
    ('left', 'MENOR_QUE', 'MAYOR_QUE', 'MENOR_IGUAL', 'MAYOR_IGUAL'),
    ('left', 'MAS', 'MENOS'),
    ('left', 'MULTI', 'DIV', 'MOD'),
    ('right', 'NOT', 'UMENOS'),
    ('nonassoc', 'TYPE_OF'),
    ('left', 'PUNTO'), # ADDED
)

However, a conflict still occurs in the following rule:

WARNING: shift/reduce conflict for PUNTO in state 43 resolved as shift


state 43

    (74) expresion -> acceso_atrib .
    (75) acceso_atrib -> acceso_atrib . PUNTO expresion

  ! shift/reduce conflict for PUNTO resolved as shift
    MAS             reduce using rule 74 (expresion -> acceso_atrib .)
    MENOS           reduce using rule 74 (expresion -> acceso_atrib .)
    MULTI           reduce using rule 74 (expresion -> acceso_atrib .)
    DIV             reduce using rule 74 (expresion -> acceso_atrib .)
    MOD             reduce using rule 74 (expresion -> acceso_atrib .)
    IGUAL           reduce using rule 74 (expresion -> acceso_atrib .)
    DIFERENTE       reduce using rule 74 (expresion -> acceso_atrib .)
    MENOR_QUE       reduce using rule 74 (expresion -> acceso_atrib .)
    MAYOR_QUE       reduce using rule 74 (expresion -> acceso_atrib .)
    MENOR_IGUAL     reduce using rule 74 (expresion -> acceso_atrib .)
    MAYOR_IGUAL     reduce using rule 74 (expresion -> acceso_atrib .)
    AND             reduce using rule 74 (expresion -> acceso_atrib .)
    OR              reduce using rule 74 (expresion -> acceso_atrib .)
    PUNTO_COMA      reduce using rule 74 (expresion -> acceso_atrib .)
    PAREN_CIERRE    reduce using rule 74 (expresion -> acceso_atrib .)
    COMA            reduce using rule 74 (expresion -> acceso_atrib .)
    CORCHETE_CIERRE reduce using rule 74 (expresion -> acceso_atrib .)
    DOS_PUNTOS      reduce using rule 74 (expresion -> acceso_atrib .)
    PUNTO           shift and go to state 87

  ! PUNTO           [ reduce using rule 74 (expresion -> acceso_atrib .) ]

Solution

  • The best way to debug these things is to call yacc.yacc(debug=True) and look at the file parser.out that is generated.

        (18) expresion -> ID . PUNTO POP ( )
        (19) expresion -> ID . PUNTO INDEX_OF ( expresion )
        (20) expresion -> ID . PUNTO JOIN ( )
        (25) expresion -> ID .
    

    You have read an ID and are now about to read a period. The grammar isn't sure whether to just shift the period, or to convert the ID to an expression. (The latter is a possibility since expression . function_emb ( ) is in your grammar.

    The fix is that ID PUNTO ... should never appear in your grammar. In all of those, it should be expression PUNTO ... since pop(), etc can be called on any expression