Can ANTLR return Lines of Code when lexing?

I am trying to use ANTLR to analyse a large set of code using the full Java grammar. Since ANTLR needs to open all the source files and scan them anyway, I am wondering if it can also return the lines of code.

I checked the API for Lexer and Parser, and it seems they do not return LoC. Is it easy to instrument the grammar rules a bit to get LoC? The full Java grammar is complicated, and I don't really want to mess with a large part of it.

Solution

If you have an existing ANTLR grammar and want to count certain things during parsing, you could do something like this:

grammar ExistingGrammar;

// ...

@parser::members {
  public int loc = 0;
}

// ...

someParserRule
 : SomeLexerRule someOtherParserRule {loc++;}
 ;

// ...

So, whenever your parser encounters a someParserRule, you increase loc by one by placing {loc++;} after (or before) the elements of the rule.

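So, whatever your definition of a line of code is, simply place {loc++;} in the matching rule to increase the counter. Be careful not to increase it twice, as would happen here, where statement increments the counter for a match that someParserRule has already counted: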

statement
 : someParserRule {loc++;}
 | // ...
 ;

someParserRule
 : SomeLexerRule someOtherParserRule {loc++;}
 ;

EDIT

I just noticed that in the title of your question you asked if this can be done during lexing. That won't be possible. Let's say a LoC would always end with a ';'. During lexing, you wouldn't be able to make a distinction between a ';' after, say, an assignment (which is a single LoC), and the 2 ';'s inside a for(int i = 0; i < n; i++) { ... } statement (which wouldn't be 2 LoC).
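To make the ';' problem concrete, here is a minimal sketch in plain Java. It is not an ANTLR lexer: countSemicolons is a hypothetical helper that just scans characters, which is roughly what a count without parser context amounts to under the assumption that every ';' ends a line of code.

```java
public class SemicolonCount {
    // Hypothetical helper (not part of ANTLR): counts every ';' in the
    // source text, with no knowledge of the surrounding statement.
    static int countSemicolons(String source) {
        int count = 0;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == ';') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String assignment = "x = 1;";
        String forLoop = "for (int i = 0; i < n; i++) { }";

        // One statement, one ';': the naive count happens to be right.
        System.out.println(countSemicolons(assignment)); // prints 1

        // Still a single statement, but the header contains two ';'.
        System.out.println(countSemicolons(forLoop));    // prints 2
    }
}
```

The for header is counted twice even though most definitions would call it one line of code. A parser rule knows it is inside a for statement, which is why the {loc++;} action belongs in the parser rules rather than in the lexer.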