javascript, parsing, syntax-highlighting, atom-editor, codemirror-modes

How does the Atom text editor parse / tokenise code? (syntax highlighting)


CodeMirror uses modes to tokenise code.
It breaks the document into lines and turns each line into a stream, which is then passed to the predefined mode. A mode can handle constructs that span multiple lines through its state parameter.
Ace seems to use a similar method.
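
To make the line-by-line stream idea concrete, here is a minimal sketch of a mode in the `{ startState, token }` shape that CodeMirror 5 modes use. It marks `/* ... */` comments and carries an `inComment` flag in its state so a comment can continue onto the next line. `TinyStream` and `tokenizeLines` are hypothetical stand-ins written for this example so it runs without the library; they imitate only the few stream members used here (`eol`, `next`, `match` with a plain string, `string`, `pos`) and are not CodeMirror's real implementation.

```javascript
// Hypothetical stand-in for CodeMirror's StringStream: only the few
// members this sketch needs (string, pos, eol, next, match with a string).
class TinyStream {
  constructor(line) { this.string = line; this.pos = 0; }
  eol() { return this.pos >= this.string.length; }
  next() { return this.eol() ? undefined : this.string.charAt(this.pos++); }
  match(text) {
    if (!this.string.startsWith(text, this.pos)) return false;
    this.pos += text.length;
    return true;
  }
}

// A mode object: startState() builds the state, token() consumes part of
// the current line and returns a style name (or null for plain text).
const blockCommentMode = {
  startState() { return { inComment: false }; },
  token(stream, state) {
    if (!state.inComment && stream.match("/*")) state.inComment = true;
    if (state.inComment) {
      // Consume up to the closing "*/" or to the end of this line.
      while (!stream.eol()) {
        if (stream.match("*/")) { state.inComment = false; break; }
        stream.next();
      }
      return "comment";
    }
    // Plain text: consume at least one character, then stop at the next "/*".
    stream.next();
    while (!stream.eol() && !stream.string.startsWith("/*", stream.pos)) stream.next();
    return null;
  },
};

// Hypothetical driver: one stream per line, one shared state for all lines.
function tokenizeLines(mode, lines) {
  const state = mode.startState();
  return lines.map((line) => {
    const stream = new TinyStream(line);
    const tokens = [];
    while (!stream.eol()) {
      const start = stream.pos;
      const style = mode.token(stream, state);
      tokens.push([line.slice(start, stream.pos), style]);
    }
    return tokens;
  });
}

const result = tokenizeLines(blockCommentMode, ["a /* b", "c */ d"]);
```

Because the state object outlives each line's stream, the second line starts inside the comment even though its own text contains no opening `/*`.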

Neither of these methods inherently uses RegExp (though whoever writes a mode can of course use RegExp inside it).

From what I've read of Atom's code and style, it calls its syntax highlighters grammars, and they closely resemble TextMate grammars. These grammars are JSON-like objects that contain class names and RegExps (see how to write a TextMate grammar).
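
As a rough illustration of that shape, here is a small grammar written as a plain JavaScript object. The language and its scope names are made up for this example; only the key names (`scopeName`, `patterns`, `match`, `begin`, `end`, `name`) follow the TextMate grammar format. Real grammars use Oniguruma regex syntax, and the simple patterns below happen to be valid in JavaScript as well.

```javascript
// Hypothetical grammar for a made-up language; only the key names
// follow TextMate's grammar format.
const demoGrammar = {
  scopeName: "source.demo",
  patterns: [
    // Single-line rules: one regex string, one scope name.
    { match: "\\b(if|else|while)\\b", name: "keyword.control.demo" },
    { match: "\\b\\d+\\b", name: "constant.numeric.demo" },
    // Multi-line rule: everything from begin to end gets the scope name.
    { begin: "/\\*", end: "\\*/", name: "comment.block.demo" },
  ],
};

// Regexes are stored as strings, so the grammar survives a JSON round trip.
const roundTripped = JSON.parse(JSON.stringify(demoGrammar));
const scopeNames = roundTripped.patterns.map((rule) => rule.name);
```

The `name` values play the role of the class names mentioned above: the editor attaches them to matched text, and themes style them.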

I can't for the life of me figure out how Atom actually parses code, keeps its state between lines, and extends through various scopes.

If someone could point me in the right direction that would be great.


Solution

  • The question was answered here.

    Atom uses its first-mate module, which relies on Oniguruma to evaluate the regular expressions in its grammars.
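
The state-keeping part of the question can be sketched with a rule stack: a `begin` match pushes a rule, its `end` match pops it, and the stack left over at the end of a line is handed to the next line. The code below is a simplified illustration of that idea, not first-mate's actual code. It uses JavaScript's built-in `RegExp` as a stand-in for Oniguruma, the grammar is hypothetical, and features such as captures and includes are left out.

```javascript
// Hypothetical grammar: a double-quoted string with escapes nested inside it.
const grammar = {
  scopeName: "source.demo",
  patterns: [
    {
      name: "string.quoted.double.demo",
      begin: '"',
      end: '"',
      patterns: [{ name: "constant.character.escape.demo", match: "\\\\." }],
    },
    { name: "constant.numeric.demo", match: "\\d+" },
  ],
};

// First match of `source` at or after `pos`; an empty match is ignored
// so the tokenizer loop always advances.
function findFrom(source, line, pos) {
  const regex = new RegExp(source, "g");
  regex.lastIndex = pos;
  const m = regex.exec(line);
  return m && m[0].length > 0 ? m : null;
}

// Tokenize one line; `ruleStack` holds the rules still open from earlier lines.
function tokenizeLine(line, ruleStack = [grammar]) {
  const stack = ruleStack.slice();
  const tokens = [];
  const scopes = () => stack.map((rule) => rule.scopeName || rule.name);
  const emit = (from, to, extra = []) => {
    if (to > from) tokens.push({ value: line.slice(from, to), scopes: [...scopes(), ...extra] });
  };
  let pos = 0;
  while (pos < line.length) {
    const top = stack[stack.length - 1];
    // Candidates: the open rule's end pattern first, then its child patterns.
    const candidates = [];
    if (top.end) candidates.push({ kind: "end", source: top.end });
    for (const rule of top.patterns || []) {
      candidates.push({ kind: rule.begin ? "begin" : "match", source: rule.begin || rule.match, rule });
    }
    // Pick the earliest match; on a tie the earlier candidate wins.
    let best = null;
    for (const c of candidates) {
      const m = findFrom(c.source, line, pos);
      if (m && (!best || m.index < best.m.index)) best = { ...c, m };
    }
    if (!best) { emit(pos, line.length); break; }
    const start = best.m.index;
    const end = start + best.m[0].length;
    emit(pos, start); // text before the match keeps the current scopes
    if (best.kind === "begin") stack.push(best.rule);
    emit(start, end, best.kind === "match" ? [best.rule.name] : []);
    if (best.kind === "end") stack.pop();
    pos = end;
  }
  return { tokens, ruleStack: stack };
}

// The string opened on the first line is still open when the second line starts.
const first = tokenizeLine('n = 12 "ab');
const second = tokenizeLine('c\\"d" 7', first.ruleStack);
```

Each token carries the full list of scopes from the stack, which is how nested scopes such as an escape inside a string inside the source come about.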