Search code examples
ruby-on-railsparsingruby-ripper

Parse ruby code


I need help in one problem. I have a table with columns that contain some ruby code, like this: self.org_premium = self.volume / 12 * 0.1492 self.billing_premium = self.subscriber_premium + self.org_premium or employment_level == 'P' or vol_life.save. And now I want find methods in these strings, but some Rails methods, like save or nil? must be ignored. I used Ripper, but his method slice return only 1 param. Maybe you have some idea about this?


Solution

  • When you use Ripper to slice, e.g.:

    $ irb
    2.0.0p247 :001 > p Ripper.slice('def m(a) nil end', 'ident')
    

    To see what events are available, just evaluate the constants it refers to in the doc: EVENTS which are further broken down into PARSER_EVENTS, and SCANNER_EVENTS.

    $ irb
    2.0.0p247 :001 > require 'ripper'
    2.0.0p247 :002 > Ripper::EVENTS
     => [:BEGIN, :END, :alias, :alias_error, :aref, :aref_field, :arg_ambiguous, :arg_paren, :args_add, :args_add_block, :args_add_star, :args_new, :array, :assign, :assign_error, :assoc_new, :assoc_splat, :assoclist_from_args, :bare_assoc_hash, :begin, :binary, :block_var, :block_var_add_block, :block_var_add_star, :blockarg, :bodystmt, :brace_block, :break, :call, :case, :class, :class_name_error, :command, :command_call, :const_path_field, :const_path_ref, :const_ref, :def, :defined, :defs, :do_block, :dot2, :dot3, :dyna_symbol, :else, :elsif, :ensure, :excessed_comma, :fcall, :field, :for, :hash, :if, :if_mod, :ifop, :lambda, :magic_comment, :massign, :method_add_arg, :method_add_block, :mlhs_add, :mlhs_add_star, :mlhs_new, :mlhs_paren, :module, :mrhs_add, :mrhs_add_star, :mrhs_new, :mrhs_new_from_args, :next, :opassign, :operator_ambiguous, :param_error, :params, :paren, :parse_error, :program, :qsymbols_add, :qsymbols_new, :qwords_add, :qwords_new, :redo, :regexp_add, :regexp_literal, :regexp_new, :rescue, :rescue_mod, :rest_param, :retry, :return, :return0, :sclass, :stmts_add, :stmts_new, :string_add, :string_concat, :string_content, :string_dvar, :string_embexpr, :string_literal, :super, :symbol, :symbol_literal, :symbols_add, :symbols_new, :top_const_field, :top_const_ref, :unary, :undef, :unless, :unless_mod, :until, :until_mod, :var_alias, :var_field, :var_ref, :vcall, :void_stmt, :when, :while, :while_mod, :word_add, :word_new, :words_add, :words_new, :xstring_add, :xstring_literal, :xstring_new, :yield, :yield0, :zsuper, :CHAR, :__end__, :backref, :backtick, :comma, :comment, :const, :cvar, :embdoc, :embdoc_beg, :embdoc_end, :embexpr_beg, :embexpr_end, :embvar, :float, :gvar, :heredoc_beg, :heredoc_end, :ident, :ignored_nl, :int, :ivar, :kw, :label, :lbrace, :lbracket, :lparen, :nl, :op, :period, :qsymbols_beg, :qwords_beg, :rbrace, :rbracket, :regexp_beg, :regexp_end, :rparen, :semicolon, :sp, :symbeg, :symbols_beg, :tlambda, :tlambeg, :tstring_beg, :tstring_content, :tstring_end, :words_beg, :words_sep]
    2.0.0p247 :009 > Ripper::PARSER_EVENTS
     => [:BEGIN, :END, :alias, :alias_error, :aref, :aref_field, :arg_ambiguous, :arg_paren, :args_add, :args_add_block, :args_add_star, :args_new, :array, :assign, :assign_error, :assoc_new, :assoc_splat, :assoclist_from_args, :bare_assoc_hash, :begin, :binary, :block_var, :block_var_add_block, :block_var_add_star, :blockarg, :bodystmt, :brace_block, :break, :call, :case, :class, :class_name_error, :command, :command_call, :const_path_field, :const_path_ref, :const_ref, :def, :defined, :defs, :do_block, :dot2, :dot3, :dyna_symbol, :else, :elsif, :ensure, :excessed_comma, :fcall, :field, :for, :hash, :if, :if_mod, :ifop, :lambda, :magic_comment, :massign, :method_add_arg, :method_add_block, :mlhs_add, :mlhs_add_star, :mlhs_new, :mlhs_paren, :module, :mrhs_add, :mrhs_add_star, :mrhs_new, :mrhs_new_from_args, :next, :opassign, :operator_ambiguous, :param_error, :params, :paren, :parse_error, :program, :qsymbols_add, :qsymbols_new, :qwords_add, :qwords_new, :redo, :regexp_add, :regexp_literal, :regexp_new, :rescue, :rescue_mod, :rest_param, :retry, :return, :return0, :sclass, :stmts_add, :stmts_new, :string_add, :string_concat, :string_content, :string_dvar, :string_embexpr, :string_literal, :super, :symbol, :symbol_literal, :symbols_add, :symbols_new, :top_const_field, :top_const_ref, :unary, :undef, :unless, :unless_mod, :until, :until_mod, :var_alias, :var_field, :var_ref, :vcall, :void_stmt, :when, :while, :while_mod, :word_add, :word_new, :words_add, :words_new, :xstring_add, :xstring_literal, :xstring_new, :yield, :yield0, :zsuper] 
    2.0.0p247 :010 > Ripper::SCANNER_EVENTS
     => [:CHAR, :__end__, :backref, :backtick, :comma, :comment, :const, :cvar, :embdoc, :embdoc_beg, :embdoc_end, :embexpr_beg, :embexpr_end, :embvar, :float, :gvar, :heredoc_beg, :heredoc_end, :ident, :ignored_nl, :int, :ivar, :kw, :label, :lbrace, :lbracket, :lparen, :nl, :op, :period, :qsymbols_beg, :qwords_beg, :rbrace, :rbracket, :regexp_beg, :regexp_end, :rparen, :semicolon, :sp, :symbeg, :symbols_beg, :tlambda, :tlambeg, :tstring_beg, :tstring_content, :tstring_end, :words_beg, :words_sep] 
    

    The 'ident' is an event for method name definition in this case, and that event isn't really equivalent to a method that is called in the code.

    I'm not sure that Ripper would be the easiest way to parse out method names that are used. In addition, the ability for Ruby to handle calls handled by method_missing really make it difficult to see what could be interpreted.

    Like I said in the comments, there are several other ways to parse methods you might look into.

    You could even just make something similar with string operations/checking available methods, e.g.

    class A
      IGNORE = %w{save nil?}
    
      def find_possible_methods(s)
        s.split(/[\-\ ,\.\(\)\{\}\[\]]/).reject{|c| c =~ /[0-9\*\-\/\+\%\=\~].*/ || c.empty? || IGNORE.include?(c)}
      end
    
      def find_implemented_methods(s)
        (s.split(/[\-\ ,\.\(\)\{\}\[\]]/) & (methods + private_methods).collect(&:to_s)).reject{|c| IGNORE.include?(c)}
      end
    end
    

    Usage:

    a = A.new
     => #<A:0x007facb9a94be8> 
    a.find_possible_methods 'self.org_premium = self.volume / 12 * 0.1492 self.billing_premium = self.subscriber_premium + self.org_premium'
     => ["self", "org_premium", "self", "volume", "self", "billing_premium", "self", "subscriber_premium", "self", "org_premium"]