
Trouble defining arrays in ANTLR4


I'm implementing a lexer and parser for a simple language, and I got stuck defining arrays for it. The requirement is to write a grammar for multi-dimensional arrays such that:

  • All elements of an array must have the same type, which can be number, string, or boolean.

  • In an array declaration, number literals must be used to give the length of each dimension of the array. If the array is multi-dimensional, there will be more than one number literal. These literals are separated by commas and enclosed in square brackets ([ ]).

For example:

number a[5] <- [1, 2, 3, 4, 5]

or

number b[2, 3] <- [[1, 2, 3], [4, 5, 6]]

An array value is a comma-separated list of literals enclosed in '[' and ']'. The literal elements all have the same type. For example, [1, 5, 7, 12] or [[1, 2], [4, 5], [3, 5]].

grammar ABC;

@lexer::header {
from lexererr import *
}

options {
    language=Python3;
}
program : arraydecl+ EOF;
INT: [0-9]+;
arraydecl : prim_type ID (ARRAY_START arrayliteral_list  ARRAY_END) (ASSIGN arrayliteral_list)?;
arrayliteral : ARRAY_START INT (COMMA INT)* ARRAY_END ; // ex: [1,2,3,4]
arrayliteral_list: ARRAY_START  (arrayliteral (COMMA arrayliteral)*) ARRAY_END; 
prim_type: NUMBER | BOOL | STRING;


NUMBER: 'number';
BOOL: 'bool';
STRING: 'string';
ASSIGN : '<-';
EQ : '=';
ARRAY_START : '[';
ARRAY_END : ']';
LP : '(';
RP : ')';
COMMA : ',';
SEMI : ';';
TYPES: ('number' | 'string' | 'boolean');
prim_types: TYPES;
ID: [A-Za-z_][A-Za-z0-9_]*;
// newline character
NEWLINE: '\n' | '\r\n';
/* COMMENT */
LINECMT : '##' ~[\n\r]*;

WS : [ \t\r\n\f\b]+ -> skip ; // skip spaces, tabs, newlines
ERROR_CHAR: . {raise ErrorToken(self.text)};
UNCLOSE_STRING: . {raise UncloseString(self.text)};


This is my code, and it does not work as I expected, even for a simple test case like this:

def test_simple_program(self):
        """Test array declaration """
        input = """number a[5]
        """
        expect = "successful"
        self.assertTrue(TestParser.test(input,expect,204))

It returns: "Error on line 1 col 9: 5"

Any help will be greatly appreciated.


Solution

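  • Your arraydecl wraps arrayliteral_list in its own pair of brackets, and arrayliteral_list in turn requires a bracketed arrayliteral inside its own brackets, so after the ID the parser expects something like [[[5]]]. That is why it reports an error at the 5 in a[5]. Instead, you need to use an arrayliteral recursively: such a literal contains zero or more expressions, and an expression can itself be an arrayliteral.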

    Something like this:

    program
     : arraydecl+ EOF
     ;
    
    arraydecl
     : prim_type ID ARRAY_START expressions ARRAY_END (ASSIGN arrayliteral)? // expressions, so that b[2, 3] is accepted
     ;
    
    prim_type
     : NUMBER | BOOL | STRING
     ;
    
    prim_types
     : TYPES
     ;
    
    expression
     : arrayliteral
     | ID
     | INT
     ;
    
    arrayliteral
     : ARRAY_START expressions? ARRAY_END
     ;
    
    expressions
     : expression (',' expression)*
     ;
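
To see the recursion between arrayliteral and expression outside of ANTLR, here is a minimal hand-written sketch in Python. It is not what ANTLR generates, only an illustration of the same two rules; the names tokenize, Parser and parse_array are hypothetical, and the sketch handles only INT, ID, brackets and commas.

```python
import re

# Assumed token shapes for this sketch: INT, ID, and the symbols [ ] ,
TOKEN = re.compile(
    r"\s*(?:(?P<INT>[0-9]+)|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)|(?P<SYM>[\[\],]))"
)


def tokenize(text):
    """Split text into (kind, value) pairs; raise on anything unexpected."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near {text[pos:pos + 10]!r}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        # Return the current token, or an EOF marker past the end.
        if self.i < len(self.tokens):
            return self.tokens[self.i]
        return ("EOF", None)

    def expect(self, value):
        if self.peek()[1] != value:
            raise SyntaxError(f"expected {value!r}, got {self.peek()[1]!r}")
        self.i += 1

    def expression(self):
        # expression : arrayliteral | ID | INT ;
        kind, value = self.peek()
        if value == "[":
            return self.arrayliteral()  # the recursive case
        if kind == "INT":
            self.i += 1
            return int(value)
        if kind == "ID":
            self.i += 1
            return value
        raise SyntaxError(f"expected an expression, got {value!r}")

    def arrayliteral(self):
        # arrayliteral : '[' expressions? ']' ;
        self.expect("[")
        items = []
        if self.peek()[1] != "]":
            # expressions : expression (',' expression)* ;
            items.append(self.expression())
            while self.peek()[1] == ",":
                self.i += 1
                items.append(self.expression())
        self.expect("]")
        return items


def parse_array(text):
    """Parse one array literal into nested Python lists."""
    parser = Parser(tokenize(text))
    result = parser.arrayliteral()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("trailing input after array literal")
    return result
```

With this sketch, parse_array("[[1, 2, 3], [4, 5, 6]]") returns [[1, 2, 3], [4, 5, 6]]: each nested bracket pair is handled by arrayliteral calling expression, which calls arrayliteral again. Checking that all elements share one type, or that the shape matches the declared dimensions, is left out here; that would be a separate pass after parsing.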
    
