Advice on Python Parser Generators


I've been given a task where I have to create a parser for a simple C-like language. I can use any programming language and tools I wish to create the parser, but I'm learning Python at the same time so it would be my preferred choice.

There are a few restrictions my parser has to follow. Firstly, it must be able to read in a text file that lists tokens in the following format:

kind1 : spelling1
kind2 : spelling2
kind3 : spelling3
      .
      .
      .
kindn : spellingn

Here each kind and spelling pair gives the type and value of one token in the language. The file is the result of putting a sample of code through the language's lexical analyser.
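
Reading such a file needs nothing beyond the standard library. The following is a minimal sketch, assuming one token per line and that the first colon on a line separates the kind from the spelling (so a spelling that is itself a colon still works). The names `Token` and `read_tokens` are made up for illustration and are not part of any library.

```python
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    spelling: str


def read_tokens(lines: Iterable[str]) -> Iterator[Token]:
    """Yield a Token for each 'kind : spelling' line, skipping blank lines."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so a spelling such as ':' survives.
        kind, sep, spelling = line.partition(":")
        if not sep:
            raise ValueError(
                f"line {number}: expected 'kind : spelling', got {line!r}"
            )
        yield Token(kind.strip(), spelling.strip())
```

Since an open file is an iterable of lines, something like `with open("tokens.txt") as f: tokens = list(read_tokens(f))` would load the whole token stream, where `tokens.txt` is a placeholder file name.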

Secondly, I must be able to customise the output of the parser. Ideally I would like to output a file that has converted the kind:spelling list into another sequence of tokens that would be passed to the language's compiler to be converted into MIPS Assembly code. Here's a little example of the kind of thing I would like the parser to be able to produce:

%function int test
  %variable int x
  %variable int y
%begin
  %if %id y , %id x > %do
  %begin
    %return %num 0
  %end
  %return %num 1
%end
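
Whichever tool is used, the output format stays under your control, because each grammar rule can emit whatever text you choose once it has matched. The following is a minimal hand-written sketch covering only variable declarations. The token kinds `type`, `id` and `semicolon` and the function name `emit_declarations` are assumptions made for illustration; the real lexer's kind names will differ.

```python
def emit_declarations(tokens):
    """Turn a run of 'type id ;' token triples into '%variable' lines.

    tokens is a list of (kind, spelling) pairs. The kind names used here
    ('type', 'id', 'semicolon') are hypothetical.
    """
    out = []
    pos = 0
    while pos < len(tokens):
        triple = tokens[pos:pos + 3]
        kinds = [kind for kind, _ in triple]
        if kinds != ["type", "id", "semicolon"]:
            raise SyntaxError(f"unexpected tokens at position {pos}: {triple}")
        type_name, var_name = triple[0][1], triple[1][1]
        # Emit one directive per declaration, indented as in the sample output.
        out.append(f"  %variable {type_name} {var_name}")
        pos += 3
    return out
```

A full parser would have one such routine per grammar rule (function, statement, expression), each appending its own `%` directives to the output.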

It would be a great help if someone could advise me on existing Python parser generators, and on whether they would let me produce the kind of output shown in the example above.


Solution

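  • pyparsing is a Python library for building parsers. Strictly speaking it is a parser-combinator library rather than a parser generator: the grammar is written as ordinary Python expressions, with no separate code-generation step. Its documentation includes many worked examples.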

    Easy to get started:

    from pyparsing import Word, alphas
    
    # define grammar
    greet = Word( alphas ) + "," + Word( alphas ) + "!"
    
    # input string
    hello = "Hello, World!"
    
    # parse input string
    print(hello, "->", greet.parseString(hello))