
Treetop parser : Function definition syntax - n arguments


I'm trying to describe some basic Ruby grammar with Treetop, but I'm stuck on function definitions: I don't know how to handle an arbitrary number ('n') of arguments. Here is the rule I use to handle functions taking 0 to 2 arguments:

  rule function_definition
    'def' space? identifier space? '(' space? expression? space? ','? expression? space? ')'
      block
    space? 'end' <FunctionDefinition>
  end  

How can I handle 'n' arguments? Is there a recursive way to do that?

EDIT:

I should stress that I need the arguments to appear in the result tree, like this:

 Argument offset=42, "arg1"
 Argument offset=43, "arg2"
 Argument offset=44, "arg3"

So I need to declare a custom SyntaxNode subclass, just as I did for the function_definition rule.


Solution

  • You want something like (untested):

    'def' space? identifier space? '(' space? ( expression ( space? ',' space? expression )* )? space? ')'
    

    (NB: if this is a Ruby-style def, the parentheses are also optional, most commonly omitted when there are no arguments.)
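
    The shape of that rule, an optional first argument followed by zero or more comma-plus-argument groups, can also be tried out in plain Ruby before committing it to a grammar. The following is only a sketch using the standard library's StringScanner; `parse_arg_list` is a made-up helper name, it is not part of Treetop, and it accepts bare identifiers only rather than full expressions:

    ```ruby
    require 'strscan'

    # Hypothetical helper (not a Treetop API): parses "(a, b, c)" by hand with
    # the same shape as the rule:  ( argument ( ',' argument )* )?
    # Returns an array of argument names, or nil if the input does not match.
    def parse_arg_list(source)
      scanner = StringScanner.new(source)
      args = []
      return nil unless scanner.scan(/\(\s*/)

      # Optional first argument...
      if (first = scanner.scan(/[a-zA-Z_][a-zA-Z0-9_]*/))
        args << first
        # ...followed by zero or more "comma, argument" groups.
        while scanner.scan(/\s*,\s*/)
          name = scanner.scan(/[a-zA-Z_][a-zA-Z0-9_]*/)
          return nil unless name # a comma must be followed by an argument
          args << name
        end
      end

      return nil unless scanner.scan(/\s*\)/) && scanner.eos?
      args
    end

    p parse_arg_list('()')                 # prints []
    p parse_arg_list('(arg1, arg2, arg3)') # prints ["arg1", "arg2", "arg3"]
    ```

    Because the comma lives inside the repeated group, a trailing comma such as `(arg1,)` is rejected, which the original 0-to-2-argument rule could not guarantee.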

    Edit, to demonstrate extracting the arguments from the parse tree: here I print the text_value of each argument (FunctionArg) syntax node, but you could of course do anything with the nodes:

    foo.rb:

    # Append current directory to load path
    $:.push('.')
    
    # Load the Treetop grammar (def.tt) directly, without a separate compilation step
    require 'polyglot'
    require 'treetop'
    require 'def'
    
    # Classes for bespoke nodes
    class FunctionDefinition < Treetop::Runtime::SyntaxNode ; end
    class FunctionArg < Treetop::Runtime::SyntaxNode ; end
    
    # Some tests
    [
      'def foo() block end',
      'def foo(arg1) block end',
      'def foo(arg1, arg2) block end',
      'def foo(arg1, arg2, arg3) block end',
    ].each do |test|
      parser = DefParser.new
      tree = parser.parse( test )
      raise RuntimeError, "Parsing failed on line:\n#{test}" unless tree
      puts test
      puts "identifier=#{tree.function_identifier}"
      puts "args=#{tree.function_args.inspect}"
      puts
    end
    

    def.tt:

    grammar Def
    
      # Top level rule: a function
      rule function_definition
        'def' space identifier space? '(' space? arg0 more_args space? ')' space block space 'end' <FunctionDefinition>
        {
          def function_identifier
            identifier.text_value
          end
          def function_args
            arg0.is_a?( FunctionArg ) ? [ arg0.text_value ] + more_args.args : []
          end
        }
      end
    
      # First function argument
      rule arg0
        argument?
      end
    
      # Second and further function arguments
      rule more_args
        ( space? ',' space? argument )* 
        {
          def args
            elements.map { |e| e.elements.last.text_value }
          end
        }
      end
    
      # Function identifier
      rule identifier
        [a-zA-Z_] [a-zA-Z0-9_]*
      end
    
      # TODO Dummy rule for function block
      rule block
        'block'
      end
    
      # Function argument
      rule argument
        [a-zA-Z_] [a-zA-Z0-9_]* <FunctionArg>
      end
    
      # Horizontal whitespace (htab or space character).
      rule space
        [ \t]
      end
    
    end
    

    Output:

    def foo() block end
    identifier=foo
    args=[]
    
    def foo(arg1) block end
    identifier=foo
    args=["arg1"]
    
    def foo(arg1, arg2) block end
    identifier=foo
    args=["arg1", "arg2"]
    
    def foo(arg1, arg2, arg3) block end
    identifier=foo
    args=["arg1", "arg2", "arg3"]
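
    The question also asks whether a recursive formulation is possible and wants each argument as a node carrying its offset. Inside Treetop, every syntax node already exposes `text_value` and `interval` (the range of input it matched), so the grammar above could return the nodes instead of their text. As a sketch of the recursive idea outside Treetop, the plain-Ruby version below uses the right-recursive form `args := argument ( ',' args )?`. `Argument`, `parse_args` and `parse_signature` are names invented for this illustration, and the struct merely stands in for a custom SyntaxNode subclass:

    ```ruby
    require 'strscan'

    # Stand-in for a custom syntax node; hypothetical, not a Treetop class.
    Argument = Struct.new(:offset, :name)

    IDENT = /[a-zA-Z_][a-zA-Z0-9_]*/

    # Right-recursive form of the list:  args := argument ( ',' args )?
    # Returns an array of Argument structs, or nil on a syntax error.
    def parse_args(scanner)
      offset = scanner.pos # position before the name is consumed
      name = scanner.scan(IDENT)
      return nil unless name

      node = Argument.new(offset, name)
      return [node] unless scanner.scan(/\s*,\s*/)

      rest = parse_args(scanner) # recurse for the remaining arguments
      rest && [node] + rest
    end

    # Parses "name(arg, ...)" and returns the Argument nodes ([] for "name()").
    def parse_signature(source)
      scanner = StringScanner.new(source)
      return nil unless scanner.scan(IDENT) && scanner.scan(/\(\s*/)

      args = scanner.check(IDENT) ? parse_args(scanner) : []
      return nil unless args && scanner.scan(/\s*\)/) && scanner.eos?
      args
    end

    parse_signature('foo(arg1, arg2, arg3)').each do |arg|
      puts "Argument offset=#{arg.offset}, #{arg.name.inspect}"
    end
    # prints:
    #   Argument offset=4, "arg1"
    #   Argument offset=10, "arg2"
    #   Argument offset=16, "arg3"
    ```

    The repetition form used in the grammar and this recursive form accept the same argument lists; the repetition form tends to be easier to walk afterwards, since all the arguments end up as siblings rather than nested one level deeper per argument.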