ANTLR4 breaking rules down for logical generic lines

This is a follow-up to an earlier question that Bart answered perfectly.

My goal is to get individual lines for either "generic script lines" or "lines inside a function body", ideally with whitespace discarded, while still getting any text outside of the <% and %> tags in bulk. I came up with a solution, but the resulting parse tree looks messy.

Here is my lexer:

lexer grammar CmScriptLexer;

//Whitespace:  Spaces -> channel(HIDDEN);
ScriptStart : '<%' (Spaces)* -> mode(Script);
SpacesPlain : [\r\n]+ -> skip;
GenericText : . ;

mode Script;

 ScriptEnd  : '%>' -> mode(DEFAULT_MODE);
 Comment    : '\'' ~[\r\n]* -> skip;
 Function   : 'function' -> mode(FunctionDeclaration);
 NL : [\r\n]+;
 ScriptText : . ;

mode FunctionDeclaration;
 FunctionComment    : '\'' ~[\r\n]* -> skip;
 FunctionName      : Id;
 DeclarationSpaces : Spaces+ -> skip;
 OPar              : '(' -> mode(FunctionParameter);

mode FunctionParameter;
 FunctionParameterComment    : '\'' ~[\r\n]* -> skip;
 ParameterName   : Id;
 ParameterSpaces : Spaces+ -> skip;
 Comma           : ',';
 CPar            : ')' -> mode(InFunction);

mode InFunction;
 FunctionBodyComment    : '\'' ~[\r\n]* -> skip;
 EndFunction    : 'end' Spaces 'function' -> mode(Script);
 FunctionLine : ~[ \r\n]+;
 FunctionSpaces : Spaces+;
 //FunctionText   : . ;

fragment Spaces : [ \r\n\t]+;
fragment Id     : [a-zA-Z0-9_\u0080-\ufffe]+;

and my parser:

parser grammar CmScriptParser;

options { tokenVocab=CmScriptLexer; }

file
 : block* EOF
 ;

block
 : plainText
 | ScriptStart script* ScriptEnd
 ;

plainText
 : GenericText+ NL*
 ;

script
 : simpleScript NL*
 | function NL*
 ;

simpleScript
 : ScriptText+ 
 ;

function
 : Function FunctionName OPar parameters? CPar functionBody EndFunction
 ;

functionBody
 : functionLines+
 ;

functionLines
 : FunctionSpaces* functionLine FunctionSpaces*
 ;

functionLine
 : FunctionLine+
 ;

parameters
 : ParameterName ( Comma ParameterName )*
 ;

and finally what I'm using as a test case:

foo

bar
<%
line 1


line 2 
 
function x(y)
  spanning
  multiple
  lines
end function

function a(b)    no newlines         end function


  %>     
baz

My issue is that it seems really verbose. I fear that my "solution", while it works with the test case, is poorly laid out and that I may be overthinking the rules.

Any suggestions on how to improve it? All I want is trimmed "line" elements, so ideally an input like \n \n\n\tscript line \n\n\t\n would result in a single line containing just script line.

EDIT: adding an example of what I think I am after; again, I may not be expressing it in the best way possible:

simpleScript:
  scriptLine: line1
  scriptLine: line2
function: 
  name: x
  parameters:
     parameter: y
  body:
    functionLine: spanning
    functionLine: multiple
    functionLine: lines
function: 
  name: a
  parameters:
     parameter: b
  body:
    functionLine: no newlines

The end goal is that, when walking the tree, I can create a new "function call object" and make calls like

script = new Script() // on script "enter"
script.addLine("line 1")
script.addLine("line 2")
program.addNode(script) // on script "exit"
...
function = new Function() // on function "enter"
function.setName("x") // on "function"?
...
function.addParameter("y") // on "parameter"
...
function.addBodyLine("spanning") // on "line" ??
function.addBodyLine("multiple")
function.addBodyLine("lines")
...
program.addFunctionDeclaration(function) // on function "exit" once complete

Solution

The problem is that inside a script, you cannot simply tell the lexer to match a non-space character followed by everything up to the line break. Sure, that would match line 1, but it would also match function x(y) as a single token, because the lexer matches greedily (it tries to consume as many characters as possible). You must therefore chop up the tokens on whitespace.

You could merge some single-character tokens using ~[ \t\r\n]+, but you cannot create tokens that match multiple words with spaces in between as a single token.
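To make the greedy matching concrete, here is a small Python sketch that mimics the longest-match rule with two hypothetical lexer rules. The rule names (Function, Line) and the next_token helper are made up for illustration and are not part of ANTLR; the sketch only assumes that the longest match wins and that a tie goes to the rule listed first.

```python
import re

# Hypothetical rules, listed in priority order like rules in a lexer grammar.
RULES = [
    ("Function", re.compile(r"function")),
    # "A non-space character followed by everything up to the line break"
    ("Line", re.compile(r"[^ \t\r\n][^\r\n]*")),
]

def next_token(text, pos=0):
    """Pick the rule with the longest match at pos; ties go to the earlier rule."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Strictly longer matches replace the current best, so earlier rules win ties.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

print(next_token("function x(y)"))  # the Line rule swallows the whole declaration
print(next_token("function"))       # only here does the keyword rule win
```

With a catch-all line rule in place, the keyword rule only wins when function is the entire remainder of the line, which is why the tokens need to be split on whitespace instead.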

Something like this:

lexer grammar CmScriptLexer;

ScriptStart : '<%' Spaces* -> mode(Script);
GenericText : ~[ \t\r\n]+;
TextSpaces  : Spaces -> skip;

mode Script;
 ScriptEnd   : '%>' -> mode(DEFAULT_MODE);
 Comment     : '\'' ~[\r\n]* -> skip;
 Function    : 'function' -> mode(FunctionDeclaration);
 NL          : [\r\n]+;
 ScriptText  : ~[ \t\r\n]+;
 ScriptSpaces : Spaces -> skip;

mode FunctionDeclaration;
 FunctionComment   : '\'' ~[\r\n]* -> skip;
 FunctionName      : Id;
 DeclarationSpaces : Spaces+ -> skip;
 OPar              : '(' -> mode(FunctionParameter);

mode FunctionParameter;
 FunctionParameterComment : '\'' ~[\r\n]* -> skip;
 ParameterName            : Id;
 ParameterSpaces          : Spaces+ -> skip;
 Comma                    : ',';
 CPar                     : ')' -> mode(InFunction);

mode InFunction;
 FunctionBodyComment : '\'' ~[\r\n]* -> skip;
 EndFunction         : 'end' Spaces 'function' -> mode(Script);
 FunctionLine        : ~[ \t\r\n]+;
 FunctionSpaces      : Spaces+ -> skip;

fragment Spaces : [ \r\n\t]+;
fragment Id     : [a-zA-Z0-9_\u0080-\ufffe]+;
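
Because this lexer emits one token per word and skips the whitespace between them, the trimmed lines from the question have to be put back together after lexing. One possible way, sketched in Python under stated assumptions: each ANTLR token records the line it started on (Token.getLine() in the Java runtime), so consecutive word tokens that share a line number can be joined with a single space. The (line_number, text) pairs and the words_to_lines helper below are hypothetical stand-ins for real tokens, and the line numbers are invented for the example.

```python
from itertools import groupby

def words_to_lines(tokens):
    """Join consecutive (line_number, text) pairs that share a line number."""
    return [
        " ".join(text for _, text in group)
        for _, group in groupby(tokens, key=lambda tok: tok[0])
    ]

# Hypothetical word tokens for the two function bodies in the test case.
body_x = [(8, "spanning"), (9, "multiple"), (10, "lines")]
body_a = [(13, "no"), (13, "newlines")]

print(words_to_lines(body_x))
print(words_to_lines(body_a))
```

Note that this collapses any run of spaces between two words into a single space, which matches the "trimmed line" goal but loses the original spacing.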