
ANTLR4: allocate ParseTree on the heap


I have a function like this to get the AST from a file:

antlr4::tree::ParseTree *get_ast(std::string &filename) {
    std::ifstream stream;
    stream.open(filename);
    antlr4::ANTLRInputStream input(stream);
    Lexer lexer(&input);
    antlr4::CommonTokenStream tokens(&lexer);
    Parser parser(&tokens);
    antlr4::tree::ParseTree *tree = parser.program();
    return tree;
}

But when I use the return value, the object that tree points to seems to have already been destroyed, as if it lived on the stack. How can I allocate the tree on the heap so that I can use the returned pointer (and free it manually)?

EDIT: Based on @sepp2k's comment, I tried keeping the parser alive by allocating it on the heap:

Parser *get_parser(std::string filename) {
    std::ifstream stream;
    stream.open(filename);
    antlr4::ANTLRInputStream input(stream);
    Lexer lexer(&input);
    antlr4::CommonTokenStream tokens(&lexer);
    return new Parser(&tokens);

}

However, this now causes a segmentation fault in the generated Parser.cpp file when I call a rule on the parser.


Solution

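  • You have to keep alive not only the parser but also the token stream, because the parse tree holds references to its tokens. In your get_parser, input, lexer and tokens are still local variables that are destroyed when the function returns, so the heap-allocated parser is left with a dangling pointer to the token stream.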

    I recommend creating a wrapper class that holds all the parser-related objects, and keeping that object alive. This way all the references stay valid, and you can reuse the same object for new parse runs.

    For MySQL Workbench I created a parser context which provides all parsing functionality for the application. Use this as a template for your implementation.
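
A minimal sketch of the wrapper idea follows. It is not the MySQL Workbench code. So that it compiles without the ANTLR runtime, it uses hypothetical stand-in types (DemoInput, DemoLexer, DemoTokenStream, DemoParser, DemoTree) that only mimic how each ANTLR object keeps a non-owning pointer to the previous one. ParseContext and make_context are likewise names invented for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Stand-ins for the ANTLR objects (hypothetical, for illustration only).
// Each stage keeps a NON-owning pointer to the stage before it, so every
// stage must outlive the stages that were built on top of it.
struct DemoInput {
    std::string text;
    explicit DemoInput(std::istream &in) {
        std::ostringstream buffer;
        buffer << in.rdbuf();            // read the whole stream up front
        text = buffer.str();
    }
};

struct DemoLexer {
    DemoInput *input;
    explicit DemoLexer(DemoInput *in) : input(in) { assert(in != nullptr); }
};

struct DemoTokenStream {
    DemoLexer *lexer;
    std::vector<std::string> tokens;     // the token stream owns the tokens
    explicit DemoTokenStream(DemoLexer *lx) : lexer(lx) {
        std::istringstream words(lx->input->text);
        for (std::string w; words >> w;) tokens.push_back(w);
    }
};

struct DemoTree {
    std::vector<const std::string *> leaves;  // points at tokens, no copies
};

struct DemoParser {
    DemoTokenStream *tokens;
    DemoTree tree;                       // the parser owns the tree it builds
    explicit DemoParser(DemoTokenStream *ts) : tokens(ts) {}
    DemoTree *program() {
        tree.leaves.clear();
        for (const std::string &t : tokens->tokens) tree.leaves.push_back(&t);
        return &tree;
    }
};

// The wrapper. Members are declared in dependency order: C++ constructs
// members in declaration order and destroys them in reverse, so the parser
// is destroyed before the token stream, the lexer and the input.
class ParseContext {
public:
    explicit ParseContext(std::istream &source)
        : input_(source), lexer_(&input_), tokens_(&lexer_), parser_(&tokens_) {}

    // The members point at each other, so a copy or move would leave
    // dangling pointers. Deleting the copy operations also disables moves.
    ParseContext(const ParseContext &) = delete;
    ParseContext &operator=(const ParseContext &) = delete;

    // The returned tree is valid only while this context is alive.
    DemoTree *parse() { return parser_.program(); }

private:
    DemoInput input_;
    DemoLexer lexer_;
    DemoTokenStream tokens_;
    DemoParser parser_;
};

// Heap-allocate the whole context instead of just the parser (C++14).
std::unique_ptr<ParseContext> make_context(std::istream &source) {
    return std::make_unique<ParseContext>(source);
}
```

With the real runtime, the four members would presumably be the antlr4::ANTLRInputStream, Lexer, antlr4::CommonTokenStream and Parser objects from the question, declared in the same order. The caller keeps the returned unique_ptr for as long as it uses the tree, and never deletes the tree itself.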