What should the output of a lexer be in C?


#include<stdio.h>

int main()
{
  int a,b;
  a=a+b;
  printf("%d",a);
  return 0;
}

What should the output be if this code is passed through a lexer?


Solution

  • A lexer only tokenizes: it turns a stream of characters into a stream of tokens, which a parser later consumes to build a full syntax tree. For your example you would obtain something like:

    #include <stdio.h> (handled by the preprocessor before the compiler's lexer runs, so the directive itself would not appear as a token)
    
    int KEYWORD
    main IDENT
    ( LPAR
    ) RPAR
    { LBRACE
    int KEYWORD
    a IDENT
    , COMMA
    b IDENT
    ; SEMICOL
    a IDENT
    = ASSIGN
    a IDENT
    + PLUS
    b IDENT
    ; SEMICOL
    printf IDENT
    ( LPAR
    "%d" STRING
    , COMMA
    a IDENT
    ) RPAR
    ; SEMICOL
    return KEYWORD
    0 INTEGER
    ; SEMICOL
    } RBRACE
    

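    Of course, a lexer by itself can't do much: it only splits the source into the smallest meaningful elements and reports lexical errors (such as an illegal character or an unterminated string literal). A misspelled keyword like retrun is not caught at this stage; it is simply tokenized as an identifier. You will need a parser to combine the tokens into a structure that can be given semantic meaning.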

    Just a side note: some lexers group similar kinds of tokens into one (for example, a single KEYWORD token covering all keywords, with an attribute saying which one), while others use a separate token for each, such as RETURN_KEYWORD, IF_KEYWORD and so on.