Search code examples
javatokenizeabstract-syntax-tree

Creating a syntax tree from tokens


I'm trying to create a tiny interpreter for TI-BASIC syntax.

This is a snippet of TI-BASIC I'm trying to interpret

A->(2+(3*3))

I've tokenized the code above into this sequence of tokens:

Token{type=VARIABLE, content='A'}
Token{type=ASSIGN, content='null'}
Token{type=L_PAREN, content='null'}
Token{type=NUM, content='2'}
Token{type=ADD, content='null'}
Token{type=L_PAREN, content='null'}
Token{type=NUM, content='3'}
Token{type=MULT, content='null'}
Token{type=NUM, content='3'}
Token{type=R_PAREN, content='null'}
Token{type=R_PAREN, content='null'}
Token{type=EOS, content='null'} (end of statement)
Token{type=EOF, content='null'} (end of file)

If I'm not mistaken, I think the next step from here is to represent these tokens as a tree of statements (Abstract Syntax Tree?)

 Assignment (->)
    / \
   /   \
  A    Add
       /\
      /  \
     2  Multiply
           /\
          /  \
         3    3

I'm wondering how I should go about creating this tree, or if that's even the correct thing to do. Thanks!


Solution

  • The simple answer is that you need to write a parser for TI-Basic. You've already written your lexer (lexical analyzer) and now you need to write your syntax analyzer.

    There are many ways of doing this, but the Wikipedia page on parsers is a good place to start: examples of parsers.