I started writing another C-like language a few days ago and I've gotten stuck here.
The "pointers" rule seems to be colliding with the * operator in the OP token, so the * operator is not recognized in the "expr" rule. The same happens for the & operator with the "reference" rule. How can I fix this?
grammar C;
program
: (include | var_decl | boigaCall ';' | func_decl | typedef ';')*;
stmt
: if_stmt
| repeat_stmt
| var_decl
| var_change
| function_call ';'
| return_stmt ';'
| boigaCall ';'
| switch_stmt
| '{' stmt* '}';
if_stmt
: if_part else_part?;
if_part
: 'if' paren_expr stmt;
else_part
: 'else' stmt;
repeat_stmt
: 'repeat' '(' expr ')' stmt;
var_decl
: type name=ID ('=' expr)? ';';
var_change
: pointers? name=ID ('=' | VARIABLE_MODIFIER) expr ';';
func_decl
: (inline='inline')? type recursion? noturbo? name=ID '(' functionArgs ')' stmt;
functionArgs
: ((ID name=ID) (',' ID name=ID)*?)?;
recursion
: '!';
noturbo
: '?';
paren_expr
: '(' expr ')';
function_call
: ID '(' expr? (',' expr)* ')';
return_stmt
: 'return' expr?;
typedef
: 'typedef' structdef ID;
structdef
: 'struct' '{' (structelem ';')+ '}';
switch_stmt
: 'switch' paren_expr switch_chain;
switch_chain
: '{' case_block+ default_block? '}';
case_block
: 'case' expr ':' stmt* 'break' ';';
default_block
: 'default' ':' stmt*;
structelem
: typedName;
typedName
: ID name=ID;
expr
: pointers expr
| term
| expr OP expr
| cast expr
| '(' expr ')';
term
: ID | INT | HEX | BIN | FLOAT | STRING | boigaCall | sizeOf | function_call | reference ID;
sizeOf
: 'sizeof' '(' ID ')';
boigaCall
: '__boiga' '(' STRING (',' expr)* ')';
cast
: '(' type ')';
pointers
: '*'+;
reference
: '&';
type
: ID pointers?;
include
: '#include' (LIBRARY | STRING);
fragment DIGIT: [0-9];
fragment LETTER: [a-zA-Z];
fragment HEX_CHAR: [a-fA-F];
STRING : '"' (~'"'|'\\"')* '"';
LIBRARY : '<' [a-zA-Z.]* '>';
ID : (LETTER | '_')+ (LETTER | '_' | DIGIT)*;
INT : '-'? DIGIT+;
HEX : '0x' (DIGIT | HEX_CHAR)+;
BIN : '0b' ('0' | '1')+;
FLOAT : '-'? DIGIT+ '.' DIGIT+;
VARIABLE_MODIFIER : OP '=';
OP : '+' | '-' | '*' | '/' | '%' | '==' | '!=' | '<' | '<=' | '>' | '>=' | '&&' | '||' | '&' | '|' | '^' | '>>' | '<<';
COMMENT : SINGLE_COMMENT | BLOCK_COMMENT;
SINGLE_COMMENT: '//' .*? '\n';
BLOCK_COMMENT : '/*' .*? '*/';
WS: ([ \t\r\n] | COMMENT)+ -> skip;
I tried making * and & into their own tokens and using those tokens in the "pointers" and "reference" rules, but that only caused the & and * tokens to be seen as operators again, and no longer as pointers/references. I tested the "program" rule with var x = a*b; and var x = a&b;, which exercise the rule that is not working properly.
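If you move '*' out of OP and into its own token rule, everything works fine. Your grammar created an implicit token when you used '*' inside the pointers rule. Implicit tokens are placed ahead of the explicit lexer rules, and when two lexer rules match the same length the one defined first wins, so every * is lexed as that implicit token and never as OP.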
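To illustrate the tie-break described above, here is a rough sketch of how the lexer effectively sees the original grammar. The rule name IMPLICIT_STAR is a placeholder chosen for this example (ANTLR generates its own name for implicit tokens), and OP is abbreviated.

```antlr
lexer grammar ImplicitTokenSketch;

// Stand-in for the implicit token created by the parser rule
//   pointers : '*'+ ;
// IMPLICIT_STAR is a placeholder name; ANTLR generates its own.
IMPLICIT_STAR : '*';

// Abbreviated OP. For the input "*" both rules match one character,
// and the rule defined first (IMPLICIT_STAR) wins, so the '*'
// alternative here can never produce an OP token.
OP : '+' | '-' | '*' | '/';
```

The parser then never receives an OP token for *, which is why expr OP expr fails on a*b.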
When you define an explicit token rule for that exact literal, ANTLR reuses it instead of creating a duplicate implicit token. In parser rules you can therefore write either '*' or POW; both refer to the same token.
expr
: term
| pointers expr
| expr (OP | POW) expr
| cast expr
| '(' expr ')';
POW: '*';
OP: '+' | '-' | '/' | '%' | '==' | '!=' | '<' | '<=' | '>' | '>=' | '&&' | '||' | '&' | '|' | '^' | '>>' | '<<';
Input: a*b*c (parses as (a*b)*c)
Input: a***b (parses as a * (**b), i.e. a multiplied by b dereferenced twice)
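The rules above only deal with '*'. The '&' from the question collides in the same way, because the reference rule also creates an implicit token. Below is a minimal sketch of the same fix applied to both, assuming a new token name AMP (hypothetical, not in the original grammar). Only the changed rules are shown; the rest of the grammar is assumed to stay as posted.

```antlr
expr
    : term
    | pointers expr
    | expr (OP | POW | AMP) expr   // binary use of * and & as operators
    | cast expr
    | '(' expr ')';

reference
    : AMP;   // writing '&' here would refer to the same token

POW: '*';
AMP: '&';    // hypothetical token name introduced for this sketch

// '*' and '&' are removed from OP. '&&' stays: for the input "&&" the
// lexer prefers the longest match, so it still becomes a single OP token.
OP: '+' | '-' | '/' | '%' | '==' | '!=' | '<' | '<=' | '>' | '>=' | '&&' | '||' | '|' | '^' | '>>' | '<<';
```

One side effect to watch for: VARIABLE_MODIFIER is defined as OP '=', so once '*' and '&' leave OP, inputs such as x *= 2; no longer lex as VARIABLE_MODIFIER. Changing it to something like VARIABLE_MODIFIER : (OP | POW | AMP) '='; would probably restore that, though this has not been tested against the full grammar.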