I am trying to lex/parse an existing language by using XText. I have already decided I will use a custom ANTLRv3 lexer, as the language cannot be lexed in a context-free way. Fortunately, I do not need parser information; just the previously encountered tokens is enough to decide the lexing mode.
The target language has an InputSection that can be described as follows: InputSection: INPUT_SECTION A=ID B=ID;
. However, it can be specified in two different ways.
; The canonical way
$InputSection Foo Bar
$SomeOtherSection Fonzie
; The haphazard way
$InputSection Foo
$SomeOtherSection Fonzie
$InputSection Bar
Could I use TokenStreamRewriter to reorder all tokens in the canonical way, before passing this on to the parser? Or will this generate issues in XText later?
After a lot of investigation, I have come to the conclusion that editor tools themselves are truly not fit for this type of problem.
If you would start typing on one rule, you would have to take into account the AST context from subsequent sections to know how to auto-complete. At the same time, this will be very confusing for the user.
In the end, I will therefore simply not support this obscure feature of the language. Instead, the AST will be constructed so that a section (reasonably) divided between two parts will still parse correctly.