I wrote a simple compiler in C++ and want to visualize the Abstract Syntax Tree it produces. Currently, I dump the AST in a super long string similar to something below:
Program(decls=[ConstDecl(type=BasicTypeKind::Int, value=Num(n=1, loc=Location(1, 21)), name=positive, loc=Location(1, 10)), ConstDecl(type=BasicTypeKind::Int, value=Num(n=-1, loc=Location(2, 21)), name=negative, loc=Location(2, 10)), ConstDecl(type=BasicTypeKind::Int, value=Num(n=100, loc=Location(3, 26)), name=max_heap_size, loc=Location(3, 10)), ConstDecl(type=BasicTypeKind::Character, value=Char(c=99, loc=Location(4, 23)), name=...
As you can see, this dump isn't very human friendly in terms of visualization. One cannot naturally relate the notion a tree to such a long string. I tried the approach to pretty-print the AST and found astpretty, which is for Python. It aims a lot in debugging but what if I want an illustration of the AST? A graphical format surely fits better.
Actually I have the picture about what the output I am looking forward. Graphviz
does a great job on this field and various graphs the C++ document tool Doxygen
generates are very close to my purpose conceptually.
Putting these together, I want a way to turn an AST in memory as C++ objects into decent graphic output (static one is ok). Any good starting point?
Edit: as comments said, dumping my AST in format Graphviz
recognizes is a good starting point. I will try do it this way until new and more concrete problems arise. Thanks guys.
I found a solution that is both fast and acceptable. LLVM-8 includes an option to visualize a Control Flow Graph (CFG) by generating a dot format file from it. Basically this works for more general graph structures as long as you specialize the llvm::GraphTraits
and llvm::DOTGraphTraits
templates (with a little work around) and the llvm::WriteGraph()
will work for you. Here is result without too much fine-tuning:
given the C-like snippet:
const int IntConstant = 1;
int Array[2];
void main() {
Printf("hello");
}
This language is a simplification of C and it is case-insensitive. A Printf()
is represented by a Write
statement as the node in the image. ConstDecl
means constant declaration. VarDecl
means variable declaration and FuncDef
means function definition. Other things are pretty much straightforward.
Before explaining the magic, I would like to point you to these documents that really help how to use the llvm APT involved.
WriteGraph
requires of your code.GraphTraits
Now with this knowledge bases, you just need to put them together. In fact, follow these steps:
nodes_iterator
implementation.ChildIteratorType
implementation.getNodeLabel()
and getNodeDescription()
of your DOTGraphTraits
.NodeRef
type of GraphTraits
must be a pointer! knowing this beforehand can save you a day. This is true for the latest version. Maybe in the future they will remove the restriction.My current implementation only include the class name of a node as its label and a self-explanatory string as its description without any fine-tuning. If you want more effects, i.e., color, shape, etc., you can do it in your DOTGraphTraits
.