visual-studio-code multiline multilinestring tmlanguage

How to detect multiline string when the start and end is a keyword?


I'm trying to develop syntax highlighting for text like this:

PAGE_TEXT_PARAGRAPH_START (HN 10 JUSTIFIED 0.0 4.0)
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed sagittis varius nulla,
nec rhoncus erat vehicula eget. Sed suscipit efficitur hendrerit. Morbi sed interdum
nunc. Mauris id purus nisi. Nunc elementum sem id dolor consequat maximus. Phasellus
vitae lacus et orci luctus condimentum.
PAGE_TEXT_PARAGRAPH_END

Basically, the two words "PAGE_TEXT_PARAGRAPH_START" and "PAGE_TEXT_PARAGRAPH_END" are keywords, which I've defined in my tmLanguage.json like this:

"keywords": {
    "patterns": [{
        "name": "keyword.control.myprecious",
        "match": "(?i)\\bPAGE_TEXT_PARAGRAPH_(START|END)\\b"
    }]
}

That works: both keywords get the correct highlighting color.

Now the text between these two keywords should be treated as a string, so I've created a new entry with this definition:

"strings": {
    "patterns": [{
            "name": "string.quoted.double.myprecious",
            "begin": "\"",
            "end": "\"",
            "patterns": [{
                "name": "constant.character.escape.myprecious",
                "match": "\\\\."
            }]
        },
        {
            "name": "string.quoted.double.myprecious",
            "begin": "PAGE_TEXT_PARAGRAPH_START\\s*\\([^)]\\)\\s*",
            "beginCaptures": {
                "0": {
                    "name": "punctuation.definition.string.begin.myprecious"
                }
            },
            "end": "PAGE_TEXT_PARAGRAPH_END",
            "endCaptures": {
                "0": {
                    "name": "punctuation.definition.string.end.myprecious"
                }
            },
            "patterns": [{
                "name": "constant.character.escape.myprecious",
                "match": "\\\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|u\\{[0-9A-Fa-f]+\\}|[0-2][0-7]{0,2}|3[0-6][0-7]?|37[0-7]?|[4-7][0-7]?|.|$)"
            }]
        }
    ]
}

And it doesn't work. The two keywords are still displayed in the keyword color (which is fine), but the text between them is not colored correctly.

I've used "Inspect Editor Tokens and Scopes" in VS Code to see how each token is scoped. The two keywords are recognized correctly, but they do not carry the scope "punctuation.definition.string.begin/end.myprecious".

Unfortunately, the multiline string is not detected, so the text does not get the correct highlighting.

To be precise: if the text is between double quotes, it works and I get the correct syntax highlighting.

I've also tried this:

"begin": "PAGE_TEXT_PARAGRAPH_START",

With that, the keywords are displayed in the string color instead of the keyword color, but the text inside is highlighted correctly.

So basically, I would like to keep the keywords in the "keyword" color and the text between them in the "string" color.

Do you have any idea how I can do that? I would like to avoid writing a language server and keep it simple.

I hope my question was clear.

Thank you in advance for your help!


Solution

  • Thank you @rioV8; thanks to your comment, I found the solution I was looking for.

    I've changed my definition as follows:

    {
        "contentName":"string.quoted.double.myprecious",
        "begin": "(PAGE_TEXT_PARAGRAPH_START)\\s*\\([^)]+\\)\\s*",
        "beginCaptures": {
            "1": {
                "name":"keyword.control.myprecious"
            }
        },
        "end": "PAGE_TEXT_PARAGRAPH_END",
        "endCaptures": {
            "0": {
                "name": "keyword.control.myprecious"
            }
        },
        "patterns": [{
            "name": "constant.character.escape.myprecious",
            "match": "\\\\(x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|u\\{[0-9A-Fa-f]+\\}|[0-2][0-7]{0,2}|3[0-6][0-7]?|37[0-7]?|[4-7][0-7]?|.|$)"
        }]
    }
    

    And it works as I expected :-D
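
For context, here is a minimal sketch of how such a rule might sit in a complete grammar file. The `name`, the `scopeName` and the order of the `include` entries are assumptions made for illustration; they are not taken from the original grammar. Two points seem to matter. First, `contentName` scopes only the text between `begin` and `end`, whereas `name` would also cover the delimiters, which is why the keywords took the string color in the earlier attempt. Second, `[^)]+` lets the parenthesized parameter list match; without a quantifier, `[^)]` accepts exactly one character, so `begin` cannot match a line such as `PAGE_TEXT_PARAGRAPH_START (HN 10 JUSTIFIED 0.0 4.0)`. The escape-sequence sub-pattern is left out here for brevity.

```json
{
    "comment": "Hypothetical skeleton: name, scopeName and include order are assumptions.",
    "name": "MyPrecious",
    "scopeName": "source.myprecious",
    "patterns": [
        { "include": "#strings" },
        { "include": "#keywords" }
    ],
    "repository": {
        "keywords": {
            "patterns": [{
                "name": "keyword.control.myprecious",
                "match": "(?i)\\bPAGE_TEXT_PARAGRAPH_(START|END)\\b"
            }]
        },
        "strings": {
            "patterns": [{
                "comment": "contentName scopes the body only; the captures scope the delimiters as keywords.",
                "contentName": "string.quoted.double.myprecious",
                "begin": "(PAGE_TEXT_PARAGRAPH_START)\\s*\\([^)]+\\)\\s*",
                "beginCaptures": {
                    "1": { "name": "keyword.control.myprecious" }
                },
                "end": "PAGE_TEXT_PARAGRAPH_END",
                "endCaptures": {
                    "0": { "name": "keyword.control.myprecious" }
                }
            }]
        }
    }
}
```

In this sketch `#strings` is listed before `#keywords` on purpose: when two rules match at the same position, the one listed first should win, so the `begin` rule needs to come ahead of the plain keyword match in order to open the string region.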