Search code examples
design-patternsinterpreter-pattern

Terminal and non terminal symbols


I am currently reading about the interpreter pattern and there it says something about Terminal and nonterminal symbols. So I went to wikipedia and read about it. But I still don't understand what a terminal and nonterminal symbol is. Can you give me some examples with programming. I don't want the wikipedia page copy-pasted... I want real world examples.


Solution

  • Every language — be it a human language, computer language, or scientific notation — is made up of symbols. These symbols are the smallest meaningful units of the language, e.g. human languages have words and punctuation; maths has numbers and operators; Java has identifiers, literals, and symbols. These fundamental symbols are the terminals, the smallest parts that you can break statements into.

    Theses terminals are strung together to make statements, and languages have specific ways that you can string non-terminals together to convey meaning e.g. in maths, you can join a number, an operand, and another number (e.g. 5 + 2) to make an operation, and in Java you can join an identifier, an equals-sign symbol, a number literal, and a semicolon symbol (e.g. size = 50 ;) to make an assignment statement. These allowed ways of combining terminals are called non-terminals. Non-terminals can be combinations of terminals, other non-terminals, or both.

    So basically, terminals are the symbols of a language, while non-terminals are the combinations of those terminals to make expressions and statements. As an example, a simple programming language may have terminals like if, else, while, +, -, (, ), ++, --, '\n', 3.14159, and many more. These terminals can be combined into non-terminals such as:

    • Assignment statement
    • If statement
    • While loop
    • Variable declaration
    • and so on

    What the interpreter pattern is that it takes those terminal and non-terminal symbols and defines them as classes or data-types that match the structure of a language. So for terminal symbols like numbers, or identifiers, they could be defined like

    abstract class Symbol {
        abstract int evaluate();
    }
    
    class Number extends Symbol {
        int value;
    
        @Override int evaluate() {
            return value;
        }
    }
    
    class Identifier extends Symbol {
        String name;
    
        @Override int evaluate() {
            return getIdentifierValue(name);
        }
    }
    

    and then non-terminal symbols like if statements or expressions could be defined like

    class IfStatement extends Symbol {
        Symbol condition;
        Symbol ifBranch;
        Symbol elseBranch;
    
        @Override int evaluate() {
            if (condition.evaluate() != 0)
                return ifBranch.evaluate();
            else
                return elseBranch.evaluate();
        }
    }
    
    class AddExpression extends Symbol {
        Symbol left;
        Symbol right;
    
        @Override int evaluate() {
            return left.evaluate() + right.evaluate();
        }
    }
    

    So you can see, terminals represent the values like numbers, identifiers and strings, and non-terminals represent the expressions and statements built up recursively from those terminals. I'm not that good at explaining theory, but I hope the practical examples here will help you understand.