Are there any common solutions for working with incomplete grammars? In my case I just want to detect methods in Delphi (Pascal) files, that is, procedures and functions. The following first attempt works:
: ( procedure | function | . )+
but is that a solution at all? Are there any better solutions? Is it possible to stop parsing with an action (e.g. after detecting implementation)? Does it make sense to use a preprocessor? And if so, how?
If you're only looking for names, then something as simple as this:
grammar PascalFuncProc;
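// parser start rule (the name `parse` is arbitrary)
parse : (Procedure | Function)* EOF;
Procedure : 'procedure' Spaces Identifier;
Function : 'function' Spaces Identifier;
// catch-all lexer rule (the name `Ignore` is arbitrary); whatever it matches is skipped
Ignore : (StrLiteral | Comment | .) {skip();};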
fragment Spaces : (' ' | '\t' | '\r' | '\n')+;
fragment Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;
fragment StrLiteral : '\'' ~'\''* '\'';
fragment Comment : '{' ~'}'* '}';
will do the trick. Note that I am not very familiar with Delphi/Pascal, so I am surely goofing up StrLiterals and/or Comments, but that'll be easily fixed.
The lexer generated from the grammar above will only produce two types of tokens (Procedures and Functions). The rest of the input (string literals, comments or, if nothing else matches, a single character: the .) is discarded by the lexer immediately (that is what the skip() call does).
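To see the "keep two token types, throw everything else away" idea in isolation, here is a rough Python sketch of the same scan. It is not ANTLR and not what the generated lexer looks like internally; `TOKEN` and `find_routines` are made-up names, and the patterns simply copy the simplified StrLiteral and Comment rules from the grammar, so they share the same blind spots (no `''` escapes inside strings, no `(* ... *)` or `//` comments, case-sensitive keywords). The sample input, including the braces around the comment, is likewise only an assumption for illustration.

```python
import re

# Alternatives are tried left to right at every position, loosely mirroring
# the lexer rules: keyword + whitespace + identifier, else a string literal,
# else a comment, else any single character.
TOKEN = re.compile(
    r"""
      (?P<kind>procedure|function)\s+(?P<name>[A-Za-z_][A-Za-z_0-9]*)
    | '[^']*'        # string literal, consumed and ignored
    | \{[^}]*\}      # brace comment, consumed and ignored
    | .              # any other single character, consumed and ignored
    """,
    re.VERBOSE | re.DOTALL,
)


def find_routines(source):
    """Return (kind, name) pairs for every procedure/function header found."""
    found = []
    for match in TOKEN.finditer(source):
        # Only the first alternative sets the named groups; every other
        # match plays the role of skip() and is dropped.
        if match.group("kind"):
            found.append((match.group("kind"), match.group("name")))
    return found


if __name__ == "__main__":
    sample = (
        "some valid source\n"
        "{ function NotAFunction ... }\n"
        "procedure Proc\n"
        "function Func\n"
        "s = 'function NotAFunction!!!'\n"
    )
    for kind, name in find_routines(sample):
        print(kind, name)
```

Because strings and comments are consumed as whole units before the scan moves on, keywords inside them never reach the first alternative. Like the grammar, the sketch does not check word boundaries, so an identifier that merely ends in `function` followed by a space and a name would still be reported.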
For input like this:
some valid source
function NotAFunction ...
procedure Proc
procedure Func
s = 'function NotAFunction!!!'
the following parse tree is created: