Here is a simple grammar:
START = DECL DECL $ ;
DECL = TYPE NAME '=' VAL ;
TYPE = 'int' | 'float' ;
NAME = 'a' | 'b' ;
VAL = '4' ;
I parse this input stream with Grako:
int a = 4
float b = 4
and I retrieve this abstract syntax tree (JSON):
[
"int",
"a",
[
"=",
"4"
],
[
"float",
"b",
[
"=",
"4"
]
]
]
Is there a simple way to obtain ASTs like this:
[
"int" TYPE,
"a" NAME,
[
"=" DECL,
"4" VAL
],
[
"float" TYPE,
"b" NAME,
[
"=" DECL,
"4" VAL
]
]
]
or this:
...
"int TYPE",
...
?
I believe semantic actions in the Grako-generated parser are the solution, but I can't figure out how to use them.
Is there a simple way to do this?
The output format you propose is not valid JSON, nor is it valid Python. By using Grako's features for AST customization, you can instead obtain output that is easy to process in Python and in any other language that has a JSON library.
Modify the grammar by adding an AST name to the elements of interest, like this:
START = DECL DECL $ ;
DECL = TYPE:TYPE NAME:NAME '=' VAL:VAL ;
TYPE = 'int' | 'float' ;
NAME = 'a' | 'b' ;
VAL = '4' ;
And you'll obtain output like this:
AST:
[AST({'NAME': 'a', 'VAL': '4', 'TYPE': 'int'}), AST({'NAME': 'b', 'VAL': '4', 'TYPE': 'float'})]
JSON:
[
{
"TYPE": "int",
"NAME": "a",
"VAL": "4"
},
{
"TYPE": "float",
"NAME": "b",
"VAL": "4"
}
]
The resulting AST is easy to process into whichever final output you need.
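For instance, here is a minimal sketch of turning that JSON into the "int TYPE" style strings asked about in the question. It uses only Python's standard json module and works on the JSON text shown above, not on Grako's own objects. The helper name tag_declaration and the field order are assumptions made for illustration.

```python
import json

# The JSON output shown above, pasted in as text.
ast_json = """
[
  {"TYPE": "int", "NAME": "a", "VAL": "4"},
  {"TYPE": "float", "NAME": "b", "VAL": "4"}
]
"""


def tag_declaration(decl, order=("TYPE", "NAME", "VAL")):
    """Return 'value RULE' strings for one declaration (hypothetical helper).

    The order is passed explicitly because a JSON object does not
    promise any particular key order.
    """
    return ["{} {}".format(decl[key], key) for key in order]


declarations = json.loads(ast_json)
tagged = [tag_declaration(decl) for decl in declarations]

# The result is still plain lists and strings, so it stays valid JSON.
print(json.dumps(tagged, indent=2))
```

Each declaration becomes a list such as ["int TYPE", "a NAME", "4 VAL"], which can be reshaped further if a different layout is needed.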